Re: Unpatched return thunk in use. This should not happen!

From: Borislav Petkov
Date: Tue Mar 26 2024 - 11:55:22 EST


On Tue, Mar 26, 2024 at 04:08:32PM +0200, Nikolay Borisov wrote:
> So the problem happens when KCSAN=y CONFIG_CONSTRUCTORS is also enabled and
> this results in an indirect call in do_mod_ctors():
>
> mod->ctors[i]();
>
>
> When KCSAN is disabled, do_mod_ctors is empty, hence the warning is not
> printed.

Yeah, KCSAN is doing something weird. I was able to stop the guest when
the warning fires. Here's what I see:

The callstack when it fires:

#0 warn_thunk_thunk () at arch/x86/entry/entry.S:48
#1 0xffffffff811a98f9 in do_mod_ctors (mod=0xffffffffa00052c0) at kernel/module/main.c:2462
#2 do_init_module (mod=mod@entry=0xffffffffa00052c0) at kernel/module/main.c:2535
#3 0xffffffff811ad2e1 in load_module (info=info@entry=0xffffc900004c7dd0, uargs=uargs@entry=0x564c103dd4a0 "", flags=flags@entry=0) at kernel/module/main.c:3001
#4 0xffffffff811ad8ef in init_module_from_file (f=f@entry=0xffff8880151c5d00, uargs=uargs@entry=0x564c103dd4a0 "", flags=flags@entry=0) at kernel/module/main.c:3168
#5 0xffffffff811adade in idempotent_init_module (f=f@entry=0xffff8880151c5d00, uargs=uargs@entry=0x564c103dd4a0 "", flags=flags@entry=0) at kernel/module/main.c:3185
#6 0xffffffff811adec9 in __do_sys_finit_module (flags=0, uargs=0x564c103dd4a0 "", fd=3) at kernel/module/main.c:3206
#7 __se_sys_finit_module (flags=<optimized out>, uargs=94884689990816, fd=3) at kernel/module/main.c:3189
#8 __x64_sys_finit_module (regs=<optimized out>) at kernel/module/main.c:3189
#9 0xffffffff81fccdff in do_syscall_x64 (nr=<optimized out>, regs=0xffffc900004c7f58) at arch/x86/entry/common.c:52
#10 do_syscall_64 (regs=0xffffc900004c7f58, nr=<optimized out>) at arch/x86/entry/common.c:83
#11 0xffffffff82000126 in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:120
#12 0x0000000000000000 in ?? ()
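
For reference, the loop behind frame #1 can be modelled in userspace like this. It is only a sketch: struct fake_module, ctor_fn and the counting constructors are made up for illustration, and only the mod->ctors[i]() call itself is taken from the quoted text above. The point is that every iteration is an indirect call, which a retpoline build routes through an indirect thunk.

```c
#include <stddef.h>

/* Hypothetical constructor signature for this sketch. */
typedef void (*ctor_fn)(void);

/* Hypothetical stand-in for the two module fields the loop needs. */
struct fake_module {
	ctor_fn *ctors;
	size_t num_ctors;
};

static int ctor_calls;	/* counts constructor invocations */

static void ctor_a(void) { ctor_calls++; }
static void ctor_b(void) { ctor_calls++; }

/* Each iteration calls through a function pointer, i.e. indirectly. */
static void run_ctors(const struct fake_module *mod)
{
	for (size_t i = 0; i < mod->num_ctors; i++)
		mod->ctors[i]();
}
```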

Now, when we look at frame #1 (do_mod_ctors() is inlined into
do_init_module(), and 0xffffffff811a98f9 is the return address):

ffffffff811a9800 <do_init_module>:
ffffffff811a9800: e8 bb 36 ee ff call ffffffff8108cec0 <__fentry__>
ffffffff811a9805: 41 57 push %r15
ffffffff811a9807: 41 56 push %r14
ffffffff811a9809: 41 55 push %r13
ffffffff811a980b: 41 54 push %r12
ffffffff811a980d: 55 push %rbp
ffffffff811a980e: 53 push %rbx
ffffffff811a980f: 48 89 fb mov %rdi,%rbx
ffffffff811a9812: 48 c7 c7 c8 9f 6a 82 mov $0xffffffff826a9fc8,%rdi
ffffffff811a9819: 48 83 ec 08 sub $0x8,%rsp
ffffffff811a981d: e8 5e 51 0d 00 call ffffffff8127e980 <__tsan_read8>
ffffffff811a9822: 48 8b 3d 9f 07 50 01 mov 0x150079f(%rip),%rdi # ffffffff826a9fc8 <kmalloc_caches+0x28>

..

ffffffff811a98ec: e8 8f 50 0d 00 call ffffffff8127e980 <__tsan_read8>
ffffffff811a98f1: 49 8b 07 mov (%r15),%rax
ffffffff811a98f4: e8 27 d1 e3 00 call ffffffff81fe6a20 <__x86_indirect_thunk_array>
ffffffff811a98f9: 4c 89 ef mov %r13,%rdi

there's the call to __x86_indirect_thunk_array. In the static kernel image,
that thunk looks like this:

ffffffff81fe6a20 <__x86_indirect_thunk_array>:
ffffffff81fe6a20: e8 01 00 00 00 call ffffffff81fe6a26 <__x86_indirect_thunk_array+0x6>
ffffffff81fe6a25: cc int3
ffffffff81fe6a26: 48 89 04 24 mov %rax,(%rsp)
ffffffff81fe6a2a: e9 b1 07 00 00 jmp ffffffff81fe71e0 <__x86_return_thunk>

where you'd think: ah, yes, it jumps to the unpatched __x86_return_thunk,
that's why it fires.
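
To spell out why an indirect call ends up in a return thunk at all, here is a toy model of the four-instruction sequence above. It is a sketch under simplifying assumptions: struct toy_cpu and toy_indirect_thunk() are hypothetical names, the stack is a plain array, and the return thunk is reduced to the single 'ret' it eventually executes.

```c
#include <stdint.h>

/* Toy machine state: a small downward-growing stack plus %rax. */
struct toy_cpu {
	uint64_t stack[8];
	int sp;			/* index of the top-of-stack slot */
	uint64_t rax;		/* indirect call target */
};

/* Model of the thunk; returns the address the final 'ret' jumps to. */
static uint64_t toy_indirect_thunk(struct toy_cpu *cpu, uint64_t int3_addr)
{
	/* call +1: push the address of the int3 as the return address */
	cpu->stack[--cpu->sp] = int3_addr;
	/* mov %rax,(%rsp): overwrite that return address with the target */
	cpu->stack[cpu->sp] = cpu->rax;
	/* jmp <return thunk>: its 'ret' pops the slot and jumps there */
	return cpu->stack[cpu->sp++];
}
```

So the "return" lands on whatever was in %rax, and the stack pointer ends up where it started.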

BUT! The live kernel image in gdb looks like this:

Dump of assembler code for function __x86_indirect_thunk_array:
0xffffffff81fe6a20 <+0>: call 0xffffffff81fe6a26 <__x86_indirect_thunk_array+6>
0xffffffff81fe6a25 <+5>: int3
0xffffffff81fe6a26 <+6>: mov %rax,(%rsp)
0xffffffff81fe6a2a <+10>: jmp 0xffffffff81fe70a0 <srso_return_thunk>

so the right thunk is already there!
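
The two dumps can be cross-checked by decoding the rel32 operands by hand. A minimal sketch, assuming plain 5-byte near call/jmp encodings (opcode 0xe8/0xe9 followed by a little-endian rel32); rel32_target() and rel32_for() are helper names made up here, not kernel functions:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Target of a 5-byte near call (0xe8) or jmp (0xe9) with a rel32 operand:
 * address of the next instruction plus the sign-extended displacement.
 */
static uint64_t rel32_target(uint64_t insn_addr, const uint8_t insn[5])
{
	assert(insn[0] == 0xe8 || insn[0] == 0xe9);

	/* rel32 is stored little-endian in bytes 1..4 */
	uint32_t raw = (uint32_t)insn[1] | (uint32_t)insn[2] << 8 |
		       (uint32_t)insn[3] << 16 | (uint32_t)insn[4] << 24;

	return insn_addr + 5 + (uint64_t)(int64_t)(int32_t)raw;
}

/* Displacement a 5-byte jmp/call at insn_addr needs to reach target. */
static int32_t rel32_for(uint64_t insn_addr, uint64_t target)
{
	return (int32_t)(target - (insn_addr + 5));
}
```

With the bytes from the static image, "e9 b1 07 00 00" at ffffffff81fe6a2a decodes to ffffffff81fe71e0, i.e. __x86_return_thunk. For the same jmp to reach srso_return_thunk at ffffffff81fe70a0 as gdb shows, the displacement has to be 0x671, so the patched bytes should differ from the static ones only in the operand.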

And yet, the warning still fired.

I need to single-step through this whole module loading path more carefully.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette