Re: [PATCH v4 0/4] LoongArch: Support new relocation types

From: Youling Tang
Date: Mon Aug 01 2022 - 07:28:51 EST


Hi, Jinyang

On 08/01/2022 06:08 PM, Jinyang He wrote:
On 08/01/2022 05:55 PM, Xi Ruoyao wrote:

On Mon, 2022-08-01 at 10:34 +0800, Huacai Chen wrote:
Hi, all,

On Mon, Aug 1, 2022 at 10:16 AM Youling Tang <tangyouling@xxxxxxxxxxx>
wrote:
Hi, Ruoyao

On 07/30/2022 10:52 AM, Xi Ruoyao wrote:
On Sat, 2022-07-30 at 10:24 +0800, Xi Ruoyao wrote:
On Sat, 2022-07-30 at 01:55 +0800, Xi Ruoyao wrote:
On Fri, 2022-07-29 at 20:19 +0800, Youling Tang wrote:

On 07/29/2022 07:45 PM, Xi Ruoyao wrote:
Hmm... The problem is the "addresses" of per-cpu symbols are faked: they
are actually offsets from $r21. So we can't just load such an offset
with PCALA addressing.

It looks like we'll need to introduce an attribute for GCC to make a
variable "must be addressed via GOT", and add the attribute into
PER_CPU_ATTRIBUTES.
Yes, we need a GCC attribute to specify the per-cpu variable.
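
To illustrate the point about "faked" addresses, here is a minimal user-space
sketch in C. It is not kernel code: percpu_area, counter_offset, NR_CPUS and
percpu_ptr() are hypothetical names, and the array base stands in for the
per-CPU base that the kernel keeps in $r21. It only shows that the symbol's
"address" is an offset that becomes a real pointer once a per-CPU base is added.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS      2   /* hypothetical CPU count for this sketch */
#define PERCPU_WORDS 16  /* hypothetical size of each per-CPU area, in ints */

/* One private data area per CPU. In the kernel the base of the running
 * CPU's area is what $r21 provides; here it is just an array row. */
static int percpu_area[NR_CPUS][PERCPU_WORDS];

/* The "address" of a per-CPU symbol is really a byte offset into the area. */
static const uintptr_t counter_offset = 4 * sizeof(int);

/* Real pointer = per-CPU base (stand-in for $r21) + symbol offset. */
static int *percpu_ptr(int cpu, uintptr_t sym_offset)
{
    return (int *)((char *)percpu_area[cpu] + sym_offset);
}
```

Because the offset only makes sense relative to a per-CPU base, treating it as
a PC-relative address yields a meaningless location, which is why the thread
looks for a way to force a different addressing mode.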
GCC patch adding "addr_global" attribute for LoongArch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-July/599064.html

An experiment to use it:
https://github.com/xry111/linux/commit/c1d5d70
Correction: https://github.com/xry111/linux/commit/c1d5d708

It seems 7-bit SHA is not enough for kernel repo.
If addr_global is rejected or not implemented (for example, building the
kernel with GCC 12), *I expect* the following hack to work (I've not
tested it because I'm AFK now). Using visibility in kernel seems
strange, but I think it may make some sense because the modules are some
sort of similar to an ELF shared object being dlopen()'ed, and our way
to inject per-CPU symbols is analogous to ELF interposition.

arch/loongarch/include/asm/percpu.h:

#if !__has_attribute(__addr_global__) && defined(MODULE)
/* Magically remove "static" for per-CPU variables. */
# define ARCH_NEEDS_WEAK_PER_CPU
/* Force GOT-relocation for per-CPU variables. */
# define PER_CPU_ATTRIBUTES __attribute__((__visibility__("default")))
#endif

arch/loongarch/Makefile:

# Hack for per-CPU variables, see PER_CPU_ATTRIBUTES in
# include/asm/percpu.h
if (call gcc-does-not-support-addr-global)
KBUILD_CFLAGS_MODULE += -fPIC -fvisibility=hidden
endif

With the old toolchain (GCC 12), the nf_tables.ko module loads
successfully after applying the above patch.
I don't like such a hack... Can we consider using the old relocation
types when building with old toolchains?

I don't like the hack either. I only developed it as an intellectual game.

We need to consider multiple combinations:

(1) Old GCC + old Binutils. We need -mla-local-with-abs for
KBUILD_CFLAGS_MODULE.

(2) Old GCC + new Binutils. We need -mla-local-with-abs for
KBUILD_CFLAGS_MODULE, *and* adding the support for
R_LARCH_ABS{_HI20,_LO12,64_LO20,64_HI12} in the kernel module loader.

(3) New GCC + old Binutils. As new GCC should support our new attribute
(I now intend to send V2 patch to gcc-patches using "movable" as the
attribute name), no special action is needed.

Basically, we need:

(1) Handle R_LARCH_ABS{_HI20,_LO12,64_LO20,64_HI12} in the kernel module
loader.
(2) Add -Wa,-mla-local-with-abs into KBUILD_CFLAGS_MODULE if GCC version
is <= 12.
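
As a rough illustration of what handling R_LARCH_ABS{_HI20,_LO12,64_LO20,64_HI12}
involves, the sketch below splits a 64-bit absolute value into the four bit
fields those relocations carry, and joins them back. This is not the module
loader code; struct abs_fields, split_abs() and join_abs() are hypothetical
names, and writing each field into the instruction's immediate slot is left
out.

```c
#include <assert.h>
#include <stdint.h>

/* The four pieces of a 64-bit absolute value, one per relocation type. */
struct abs_fields {
    uint32_t hi20;     /* R_LARCH_ABS_HI20:   bits 31..12 */
    uint32_t lo12;     /* R_LARCH_ABS_LO12:   bits 11..0  */
    uint32_t lo20_64;  /* R_LARCH_ABS64_LO20: bits 51..32 */
    uint32_t hi12_64;  /* R_LARCH_ABS64_HI12: bits 63..52 */
};

/* Extract the field that each relocation would patch into an instruction. */
static struct abs_fields split_abs(uint64_t v)
{
    struct abs_fields f;

    f.hi20    = (uint32_t)((v >> 12) & 0xfffff);
    f.lo12    = (uint32_t)(v & 0xfff);
    f.lo20_64 = (uint32_t)((v >> 32) & 0xfffff);
    f.hi12_64 = (uint32_t)((v >> 52) & 0xfff);
    return f;
}

/* Reassemble the value, i.e. the result the four-instruction load
 * sequence would leave in the register. */
static uint64_t join_abs(struct abs_fields f)
{
    return ((uint64_t)f.hi12_64 << 52) | ((uint64_t)f.lo20_64 << 32) |
           ((uint64_t)f.hi20 << 12) | (uint64_t)f.lo12;
}
```

Splitting and rejoining round-trips any 64-bit value, so an absolute load
built this way has no range limit, unlike PC-relative addressing.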

Actually, I really hope the kernel image is in XKVRANGE rather than
XKPRANGE, so that we can limit the kernel and modules to a 4GB range.
I think that would make everything work normally. :-(


Assuming the kernel and modules are limited to a 4GB range, external
symbols would be accessed through pcrel32, which means there is no need
to go through GOT entries and no need for GOT support, so there would be
no percpu problem, and everything would work normally?
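
A simplified way to state the reach condition in C, assuming "pcrel32" means a
signed 32-bit offset from the referencing instruction (fits_pcrel32() is a
hypothetical helper, and the page rounding done by the real instruction pair
is ignored):

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if sym can be reached from pc with a signed 32-bit
 * PC-relative offset, 0 otherwise. Simplified: no page rounding. */
static int fits_pcrel32(uint64_t pc, uint64_t sym)
{
    int64_t delta = (int64_t)(sym - pc);

    return delta >= INT32_MIN && delta <= INT32_MAX;
}
```

Under this assumption, a module can reference a kernel symbol directly only
when the two addresses are close enough for the check to pass.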

Youling.