Re: [PATCH v3] x86/entry: emit a symbol for register restoring thunk

From: Josh Poimboeuf
Date: Mon Jan 11 2021 - 17:11:02 EST


On Mon, Jan 11, 2021 at 12:58:14PM -0800, Fangrui Song wrote:
> On 2021-01-11, Nick Desaulniers wrote:
> > Arnd found a randconfig that produces the warning:
> >
> > arch/x86/entry/thunk_64.o: warning: objtool: missing symbol for insn at
> > offset 0x3e
> >
> > when building with LLVM_IAS=1 (use Clang's integrated assembler). Josh
> > notes:
> >
> > With the LLVM assembler stripping the .text section symbol, objtool
> > has no way to reference this code when it generates ORC unwinder
> > entries, because this code is outside of any ELF function.
> >
> > Fangrui notes that this optimization is helpful for reducing images size
> > when compiling with -ffunction-sections and -fdata-sections. I have
> > observerd on the order of tens of thousands of symbols for the kernel
> > images built with those flags. A patch has been authored against GNU
> > binutils to match this behavior, with a new flag
> > --generate-unused-section-symbols=[yes|no].
>
> https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=d1bcae833b32f1408485ce69f844dcd7ded093a8
> has been committed. The patch should be included in binutils 2.37.
> The maintainers are welcome to the idea, but fixing all the arch-specific tests is tricky.
>
> H.J. fixed the x86 tests and enabled this for x86. When binutils 2.37
> come out, some other architectures may follow as well.
>
> > We can omit the .L prefix on a label to emit an entry into the symbol
> > table for the label, with STB_LOCAL binding. This enables objtool to
> > generate proper unwind info here with LLVM_IAS=1.
>
> Josh, I think objtool orc generate needs to synthesize STT_SECTION
> symbols even if they do not exist in object files.

I'm guessing you don't mean re-adding *all* missing STT_SECTIONs, as
that would just be undoing these new assembler features.

We could re-add STT_SECTION only when there's no other corresponding
symbol associated with the code, but then objtool would have to start
updating the symbol table (which right now it manages to completely
avoid). But that would only be for the niche cases, like
'SYM_CODE.*\.L' as you mentioned.

I'd rather avoid making doing something so pervasive for such a small
number of edge cases. It's hopefully easier and more robust to just say
"all code must be associated with a symbol". I suspect we're already
~99.99% there anyway.

$ git grep -e 'SYM_CODE.*\.L'
arch/x86/entry/entry_64.S:SYM_CODE_START_LOCAL_NOALIGN(.Lbad_gs)
arch/x86/entry/entry_64.S:SYM_CODE_END(.Lbad_gs)
arch/x86/entry/thunk_64.S:SYM_CODE_START_LOCAL_NOALIGN(.L_restore)
arch/x86/entry/thunk_64.S:SYM_CODE_END(.L_restore)
arch/x86/lib/copy_user_64.S:SYM_CODE_START_LOCAL(.Lcopy_user_handle_tail)
arch/x86/lib/copy_user_64.S:SYM_CODE_END(.Lcopy_user_handle_tail)
arch/x86/lib/getuser.S:SYM_CODE_START_LOCAL(.Lbad_get_user_clac)
arch/x86/lib/getuser.S:SYM_CODE_END(.Lbad_get_user_clac)
arch/x86/lib/getuser.S:SYM_CODE_START_LOCAL(.Lbad_get_user_8_clac)
arch/x86/lib/getuser.S:SYM_CODE_END(.Lbad_get_user_8_clac)
arch/x86/lib/putuser.S:SYM_CODE_START_LOCAL(.Lbad_put_user_clac)
arch/x86/lib/putuser.S:SYM_CODE_END(.Lbad_put_user_clac)

Alternatively, the assemblers could add an option to only strip
-ffunction-sections and -fdata-sections STT_SECTION symbols, e.g. leave
".text" and friends alone.

--
Josh