Re: [PATCH 1/3] LoongArch: tools: Add relocs tool support

From: Youling Tang
Date: Mon Sep 05 2022 - 22:16:51 EST


Hi, Ruoyao & Jinyang

On 09/05/2022 10:52 AM, Youling Tang wrote:
Hi, Ruoyao

On 09/04/2022 12:53 AM, Xi Ruoyao wrote:
On Sun, 2022-09-04 at 00:23 +0800, Jinyang He wrote:
On 2022/9/3 18:49, Xi Ruoyao wrote:

On Sat, 2022-09-03 at 09:57 +0800, Youling Tang wrote:
Unlike (pre-r6) MIPS, LoongArch has complete support for PIC, and
currently the LoongArch toolchain always produces PIC (except if
-Wa,-mla-{local,global}-with-abs or la.abs macros are used explicitly).

So would it be easier to review and correct the uses of "la.abs" in the
code, and make the main kernel image a real PIE? Then we can load it
everywhere w/o any need to do relocation at load time.
At the beginning I also wanted to make the main kernel image a real PIE
and tried it. Some of the "la.abs" uses can be modified, but I ran into
difficulties when modifying the exception handling code: the kernel
would not boot after the modification :(. I will keep trying.

I just tried the same thing and got the same result :(. Will spend
several hours reading the LoongArch manual about exceptions...

The reason is that the handler code is not executed at its link address,
but copied elsewhere, so the PC-relative offsets are broken. I managed to
work around it by creating a trampoline that jumps into the handler,
instead of copying the handler code. Then I could remove most "la.abs"
occurrences (except two in the kernel entry point, which seem to be used
deliberately):

- https://github.com/xry111/linux/commit/56a433f
- https://github.com/xry111/linux/commit/48203e6


Thank you very much.

After applying the above two patches and the following modification,
relocation still succeeds with the apply_r_loongarch_la_rel
implementation (for la.abs relocation) removed. I tested it in the
qemu environment.

--- a/arch/loongarch/kernel/head.S
+++ b/arch/loongarch/kernel/head.S
@@ -113,9 +113,11 @@ SYM_CODE_START(smpboot_entry)
li.d t0, CSR_DMW1_INIT # CA, PLV0
csrwr t0, LOONGARCH_CSR_DMWIN1

- la.abs t0, 0f
- jr t0
-0:
+ li.d t0, CACHE_BASE
+ pcaddi t1, 0
+ or t0, t0, t1
+ jirl zero, t0, 0xc

Youling.

Using the trampoline in the handler table will definitely lead to
sub-optimal performance. I just use it as a proof-of-concept. Later we may
use some assembler trick to generate a hard-coded handler table with
correct PC-relative offsets.

The following ideas are based on experience, without validation. The
patches show that three types of relocation need to be handled.
1, The GOT is generated by the toolchain, so I think eliminating it in
the toolchain is better.

https://gcc.gnu.org/pipermail/gcc-patches/2022-September/600797.html

I'll stop reading the mail here because it's 00:52 AM now :).

2, Ex_table is generated, but its relocation info is stripped. We can use
a PC-relative approach to resolve this problem. One way is as follows
(pseudo-code):

Switch to relative exception tables:

https://github.com/tangyouling/linux/commit/6525b8da
https://github.com/tangyouling/linux/commit/b6ac0827

After applying the above two patches, the kernel uses relative exception
tables, so there is no need to relocate the exception table
(relocate_exception_table can be removed).

Now we can remove the relocation of la.abs, GOT and ex_table, but we
still need to relocate LARCH_64. Is there anything else that needs to
be modified to eliminate this relocation?

Thanks,
Youling.


/* snip */