RE: [tip: x86/bugs] x86/retpoline: Ensure default return thunk isn't used at runtime

From: Kaplan, David
Date: Tue Oct 17 2023 - 09:54:58 EST


> -----Original Message-----
> From: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
> Sent: Tuesday, October 17, 2023 12:29 AM
> To: Kaplan, David <David.Kaplan@xxxxxxx>
> Cc: Nathan Chancellor <nathan@xxxxxxxxxx>; Borislav Petkov
> <bp@xxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; linux-tip-
> commits@xxxxxxxxxxxxxxx; Ingo Molnar <mingo@xxxxxxxxxx>; Peter Zijlstra
> (Intel) <peterz@xxxxxxxxxxxxx>; x86@xxxxxxxxxx; llvm@xxxxxxxxxxxxxxx
> Subject: Re: [tip: x86/bugs] x86/retpoline: Ensure default return thunk isn't
> used at runtime
>
>
> On Tue, Oct 17, 2023 at 04:31:09AM +0000, Kaplan, David wrote:
> > I think I found the problem, although I'm not sure the best way to fix it.
> >
> > When KCSAN is enabled, GCC generates lots of constructor functions
> > named _sub_I_00099_0 which call __tsan_init and then return. The returns
> > in these are generally annotated normally by objtool and fixed up at runtime.
> > But objtool runs on vmlinux.o and vmlinux.o does not include a couple of
> > object files that are in vmlinux, like init/version-timestamp.o and
> > .vmlinux.export.o, both of which contain _sub_I_00099_0 functions. As a
> > result, the returns in these functions are not annotated, and the panic occurs
> > when we call one of them in do_ctors and it uses the default return thunk.
> >
> > This difference can be seen by counting the number of these functions in
> > the object files:
> > $ objdump -d vmlinux.o|grep -c "<_sub_I_00099_0>:"
> > 2601
> > $ objdump -d vmlinux|grep -c "<_sub_I_00099_0>:"
> > 2603
> >
> > If these functions are only run during kernel boot, there is no speculation
> > concern. My first thought is that these two object files perhaps should be
> > built without -mfunction-return=thunk-extern. The use of that flag requires
> > objtool to have the intended behavior and objtool isn't seeing these files.
> >
> > Perhaps another option would be to not compile these two files with
> > KCSAN, as they are already excluded from KASAN and GCOV it looks like.
>
> I think the latter would be the easy fix, does this make it go away?
>
> diff --git a/init/Makefile b/init/Makefile
> index ec557ada3c12..cbac576c57d6 100644
> --- a/init/Makefile
> +++ b/init/Makefile
> @@ -60,4 +60,5 @@ include/generated/utsversion.h: FORCE
> $(obj)/version-timestamp.o: include/generated/utsversion.h
> CFLAGS_version-timestamp.o := -include include/generated/utsversion.h
> KASAN_SANITIZE_version-timestamp.o := n
> +KCSAN_SANITIZE_version-timestamp.o := n
> GCOV_PROFILE_version-timestamp.o := n
> diff --git a/scripts/Makefile.vmlinux b/scripts/Makefile.vmlinux
> index 3cd6ca15f390..c9f3e03124d7 100644
> --- a/scripts/Makefile.vmlinux
> +++ b/scripts/Makefile.vmlinux
> @@ -19,6 +19,7 @@ quiet_cmd_cc_o_c = CC $@
>
> ifdef CONFIG_MODULES
> KASAN_SANITIZE_.vmlinux.export.o := n
> +KCSAN_SANITIZE_.vmlinux.export.o := n
> GCOV_PROFILE_.vmlinux.export.o := n
> targets += .vmlinux.export.o
> vmlinux: .vmlinux.export.o

Yes, that worked for me. With this patch applied, the VM booted and the number of _sub_I_00099_0 functions was consistent between vmlinux.o and vmlinux.
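
For anyone who wants to repeat the count comparison, here is a rough sketch of the check as a small shell helper. The name count_sym is made up for illustration; it assumes "objdump -d" output on stdin, where function definitions appear as "<name>:" labels.

```shell
# Hypothetical helper (not part of the kernel tree): count how many times a
# function with the given name is defined in disassembly read from stdin.
# "objdump -d" prints each function start as "<name>:", so matching the
# trailing colon skips call/jump references to the same symbol.
count_sym() {
    # -c prints the number of matching lines, -F treats the pattern literally
    grep -cF -- "<$1>:"
}

# Assumed usage, run from the kernel build directory:
#   objdump -d vmlinux.o | count_sym _sub_I_00099_0
#   objdump -d vmlinux   | count_sym _sub_I_00099_0
```

If the two numbers differ, some object linked into vmlinux still carries constructors that objtool never saw.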

--David Kaplan