Re: [BUG mips llvm] MIPS: malformed R_MIPS_{HI16,LO16} with LLVM

From: Nick Desaulniers
Date: Sat Jan 09 2021 - 12:51:54 EST


On Sat, Jan 9, 2021 at 9:11 AM Alexander Lobakin <alobakin@xxxxx> wrote:
>
> Machine: MIPS32 R2 Big Endian (interAptiv (multi))
>
> While testing MIPS with LLVM, I found a weird and very rare bug with
> MIPS relocs that LLVM emits into kernel modules. It happens on both
> 11.0.0 and the latest git snapshot and applies, as far as I can see,
> only to references to static symbols.
>
> When the kernel loads the module, it allocates space for every
> section and then manually applies the relocations relative to the
> new address.
>
> Let's say we have a function phy_probe() in drivers/net/phy/libphy.ko.
> It's static and referenced only in phy_register_driver(), where it's
> used to fill a callback pointer in a structure.
>
> The real function address after module loading is 0xc06c1444, as
> observed in its ELF st_value field.
> There are two relocs related to this usage in phy_register_driver():
>
> R_MIPS_HI16 refers to 0x3c010000
> R_MIPS_LO16 refers to 0x24339444
>
> The address of .text is 0xc06b8000. So the destination is calculated
> as follows:
>
> 0x00000000 from hi16;
> 0xffff9444 from lo16 (sign-extended, as it's always treated as signed);
> 0xc06b8000 from base.
>
> = 0xc06b1444. The value is lower than the real phy_probe() address
> (0xc06c1444) by 0x10000 and is lower than the base address of
> module's .text, so it's 100% incorrect.
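The arithmetic above can be reproduced with a minimal sketch. It assumes the usual HI16/LO16 pairing, where the addend is (hi16 << 16) plus the sign-extended lo16 and the section base is added on top. The helper names are hypothetical and are not taken from arch/mips/kernel/module.c.

```c
#include <stdint.h>

/* Sign-extend the low 16 bits of an instruction word (hypothetical helper). */
static int32_t imm16_signed(uint32_t insn)
{
	return ((int32_t)(insn & 0xffffu) ^ 0x8000) - 0x8000;
}

/* Combined addend: (hi16 immediate << 16) + sign-extended lo16 immediate. */
static uint32_t hi_lo_addend(uint32_t hi_insn, uint32_t lo_insn)
{
	return ((hi_insn & 0xffffu) << 16) + (uint32_t)imm16_signed(lo_insn);
}

/* Final address: section base plus the combined addend, modulo 2^32. */
static uint32_t reloc_target(uint32_t base, uint32_t hi_insn, uint32_t lo_insn)
{
	return base + hi_lo_addend(hi_insn, lo_insn);
}
```

With the values from the report, reloc_target(0xc06b8000, 0x3c010000, 0x24339444) gives 0xc06b1444, while a hi16 word of 0x3c010001 gives the real address 0xc06c1444.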
>
> This results in:
>
> [ 2.204022] CPU 3 Unable to handle kernel paging request at virtual
> address c06b1444, epc == c06b1444, ra == 803f1090
>
> The correct instructions should be:
>
> R_MIPS_HI16 0x3c010001
> R_MIPS_LO16 0x24339444
>
> so there'll be 0x00010000 from hi16.
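Why hi16 has to be 1 here can be illustrated with a short sketch. It assumes the conventional %hi/%lo split, in which the high half is rounded up by 0x8000 to compensate for the sign extension of the low half. The offset 0x9444 is simply 0xc06c1444 - 0xc06b8000 from the numbers in the report; the function names are hypothetical.

```c
#include <stdint.h>

/* High half, pre-adjusted for the sign extension of the low half. */
static uint16_t split_hi16(uint32_t addend)
{
	return (uint16_t)((addend + 0x8000u) >> 16);
}

/* Low half, taken as-is; the consumer sign-extends it. */
static uint16_t split_lo16(uint32_t addend)
{
	return (uint16_t)(addend & 0xffffu);
}
```

For an offset of 0x9444 this yields hi16 = 0x0001 and lo16 = 0x9444, which matches the expected 0x3c010001 / 0x24339444 pair. A hi16 of 0 would only be right if the low half were below 0x8000.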
>
> I tried to catch those bugs in arch/mips/kernel/module.c (by checking
> if the destination is lower than the base address, which should never
> happen), and it seems I have only 3 such places in libphy.ko (and
> one in nf_tables.ko).
> I don't think it should be worked around in the mentioned source code,
> as that would look rather ugly and may break kernels built with the GNU
> stack, which doesn't seem to produce such bad code.
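For reference, the kind of sanity check described above could look roughly like the sketch below. This is not the code used in the report: the function name is hypothetical, and the section size passed in by the caller is an assumption for illustration only.

```c
#include <stdint.h>

/*
 * Return 1 if target lies within [base, base + size), 0 otherwise.
 * Unsigned subtraction wraps around, so a target below base produces a
 * huge difference and is rejected by the same comparison.
 */
static int target_in_section(uint32_t base, uint32_t size, uint32_t target)
{
	return (uint32_t)(target - base) < size;
}
```

With base 0xc06b8000 and an assumed size of 0x10000, the bad destination 0xc06b1444 is rejected and the real address 0xc06c1444 is accepted.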
>
> If I should report this to any other resources, please let me know.
> I chose clang-built-linux and LKML as it may not happen with userland
> (I didn't try to catch it there).

Thanks for the report. Sounds like we may indeed be producing an
incorrect relocation. This is only seen for big endian triples?

Getting a way for us to deterministically reproduce would be a good
first step. Which config or configs beyond defconfig, and which
relocations specifically are you observing this with?
--
Thanks,
~Nick Desaulniers