Re: Unpatched return thunk in use. This should not happen!

From: Marco Elver
Date: Tue Mar 26 2024 - 15:34:16 EST


On Tue, 26 Mar 2024 at 20:12, Borislav Petkov <bp@xxxxxxxxx> wrote:
>
> On Tue, Mar 26, 2024 at 06:04:26PM +0200, Nikolay Borisov wrote:
> > So this _sub_I_00099_0 is the compiler generated ctors that is likely
> > not patched. What's strange is that when adding debugging code I see that 2
> > ctors are being executed and only the 2nd one fires:
> >
> > [ 7.635418] in do_mod_ctors
> > [ 7.635425] calling 0 ctor 00000000aa7a443a
> > [ 7.635430] called 0 ctor
> > [ 7.635433] calling 1 ctor 00000000fe9d0d54
> > [ 7.635437] ------------[ cut here ]------------
> > [ 7.635441] Unpatched return thunk in use. This should not happen!
>
> ... and this is just the beginning of the rabbit hole. David and I went
> all the way down.
>
> Turns out that objtool runs on the .o files and creates the
> .return_sites just fine but then the module building dance creates an
> intermediary *.mod.c file and when that thing is built, KCSAN would
> cause the addition of *another* constructor to .text.startup in the
> module.
>
> The .o file has one:
>
> -------------------
> Disassembly of section .text.startup:
>
> ...
>
> 0000000000000010 <_sub_I_00099_0>:
> 10: f3 0f 1e fa endbr64
> 14: e8 00 00 00 00 call 19 <_sub_I_00099_0+0x9>
> 15: R_X86_64_PLT32 __tsan_init-0x4
> 19: e9 00 00 00 00 jmp 1e <__UNIQUE_ID___addressable_cryptd_alloc_aead349+0x6>
> 1a: R_X86_64_PLT32 __x86_return_thunk-0x4
> -------------------
>
>
> while the .ko file has two:
>
> -------------------
> Disassembly of section .text.startup:
>
> 0000000000000010 <_sub_I_00099_0>:
> 10: f3 0f 1e fa endbr64
> 14: e8 00 00 00 00 call 19 <_sub_I_00099_0+0x9>
> 15: R_X86_64_PLT32 __tsan_init-0x4
> 19: e9 00 00 00 00 jmp 1e <_sub_I_00099_0+0xe>
> 1a: R_X86_64_PLT32 __x86_return_thunk-0x4
>
> ...
>
> 0000000000000030 <_sub_I_00099_0>:
> 30: f3 0f 1e fa endbr64
> 34: e8 00 00 00 00 call 39 <_sub_I_00099_0+0x9>
> 35: R_X86_64_PLT32 __tsan_init-0x4
> 39: e9 00 00 00 00 jmp 3e <__ksymtab_cryptd_alloc_ahash+0x2>
> 3a: R_X86_64_PLT32 __x86_return_thunk-0x4
> -------------------
>
> Once we've figured that out, finding a fix is easy:

Thanks for figuring this one out!

> diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
> index 8568d256d6fb..79fcf2731686 100644
> --- a/scripts/Makefile.modfinal
> +++ b/scripts/Makefile.modfinal
> @@ -23,7 +23,7 @@ modname = $(notdir $(@:.mod.o=))
> part-of-module = y
>
> quiet_cmd_cc_o_c = CC [M] $@
> - cmd_cc_o_c = $(CC) $(filter-out $(CC_FLAGS_CFI) $(CFLAGS_GCOV), $(c_flags)) -c -o $@ $<
> + cmd_cc_o_c = $(CC) $(filter-out $(CC_FLAGS_CFI) $(CFLAGS_GCOV) $(CFLAGS_KCSAN), $(c_flags)) -c -o $@ $<

This looks reasonable.

> %.mod.o: %.mod.c FORCE
> $(call if_changed_dep,cc_o_c)
>
> However, I'm not sure.
>
> I wanna say that since those are constructors then we don't care about
> dynamic races there so we could exclude them from KCSAN.

Yeah, we can just exclude all the code from the auto-generated mod.c
files from KCSAN instrumentation. It looks like they just contain
global metadata for the module, and no other code. The auto-generated
constructors don't contain much interesting code (unlikely they'd
contain data races we'd ever care about).

> If not, I could disable the warning on KCSAN. I'm thinking no one would
> run KCSAN in production...
>
> A third option would be to make objtool run on .ko files. Yeah, that
> would be fun for Josh. :-P
>
> I'd look into the direction of melver for suggestions here.

I think just removing instrumentation from the mod.c files is very reasonable.

Thanks,
-- Marco