Re: [PATCH v3] x86/entry: emit a symbol for register restoring thunk

From: Fāng-ruì Sòng
Date: Mon Jan 11 2021 - 17:18:07 EST


On Mon, Jan 11, 2021 at 2:09 PM Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
>
> On Mon, Jan 11, 2021 at 12:58:14PM -0800, Fangrui Song wrote:
> > On 2021-01-11, Nick Desaulniers wrote:
> > > Arnd found a randconfig that produces the warning:
> > >
> > > arch/x86/entry/thunk_64.o: warning: objtool: missing symbol for insn at
> > > offset 0x3e
> > >
> > > when building with LLVM_IAS=1 (use Clang's integrated assembler). Josh
> > > notes:
> > >
> > > With the LLVM assembler stripping the .text section symbol, objtool
> > > has no way to reference this code when it generates ORC unwinder
> > > entries, because this code is outside of any ELF function.
> > >
> > > Fangrui notes that this optimization is helpful for reducing image size
> > > when compiling with -ffunction-sections and -fdata-sections. I have
> > > observed on the order of tens of thousands of symbols for the kernel
> > > images built with those flags. A patch has been authored against GNU
> > > binutils to match this behavior, with a new flag
> > > --generate-unused-section-symbols=[yes|no].
> >
> > https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=d1bcae833b32f1408485ce69f844dcd7ded093a8
> > has been committed. The patch should be included in binutils 2.37.
> > The maintainers welcome the idea, but fixing all the arch-specific tests is tricky.
> >
> > H.J. fixed the x86 tests and enabled this for x86. When binutils 2.37
> > comes out, some other architectures may follow as well.
> >
> > > We can omit the .L prefix on a label to emit an entry into the symbol
> > > table for the label, with STB_LOCAL binding. This enables objtool to
> > > generate proper unwind info here with LLVM_IAS=1.
> >
> > Josh, I think objtool orc generate needs to synthesize STT_SECTION
> > symbols even if they do not exist in object files.
>
> I'm guessing you don't mean re-adding *all* missing STT_SECTIONs, as
> that would just be undoing these new assembler features.
>
> We could re-add STT_SECTION only when there's no other corresponding
> symbol associated with the code, but then objtool would have to start
> updating the symbol table (which right now it manages to completely
> avoid). But that would only be for the niche cases, like
> 'SYM_CODE.*\.L' as you mentioned.
>
> I'd rather avoid doing something so pervasive for such a small
> number of edge cases. It's hopefully easier and more robust to just say
> "all code must be associated with a symbol". I suspect we're already
> ~99.99% there anyway.
>
> $ git grep -e 'SYM_CODE.*\.L'
> arch/x86/entry/entry_64.S:SYM_CODE_START_LOCAL_NOALIGN(.Lbad_gs)
> arch/x86/entry/entry_64.S:SYM_CODE_END(.Lbad_gs)
> arch/x86/entry/thunk_64.S:SYM_CODE_START_LOCAL_NOALIGN(.L_restore)
> arch/x86/entry/thunk_64.S:SYM_CODE_END(.L_restore)
> arch/x86/lib/copy_user_64.S:SYM_CODE_START_LOCAL(.Lcopy_user_handle_tail)
> arch/x86/lib/copy_user_64.S:SYM_CODE_END(.Lcopy_user_handle_tail)
> arch/x86/lib/getuser.S:SYM_CODE_START_LOCAL(.Lbad_get_user_clac)
> arch/x86/lib/getuser.S:SYM_CODE_END(.Lbad_get_user_clac)
> arch/x86/lib/getuser.S:SYM_CODE_START_LOCAL(.Lbad_get_user_8_clac)
> arch/x86/lib/getuser.S:SYM_CODE_END(.Lbad_get_user_8_clac)
> arch/x86/lib/putuser.S:SYM_CODE_START_LOCAL(.Lbad_put_user_clac)
> arch/x86/lib/putuser.S:SYM_CODE_END(.Lbad_put_user_clac)

I'd prefer that the assembly can continue using .L labels without
having to know about the objtool limitation.
Assemblers normally drop .L symbols. These symbols are otherwise not useful.
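
To illustrate the convention, here is a minimal Python sketch of which labels reach the ELF symbol table. The helper name survives_in_symtab and the non-.L label names are made up for illustration; the only thing assumed is the usual ELF assembler rule that labels starting with ".L" are assembler-local and are not emitted.

```python
# Hypothetical helper: models the ELF assembler convention that labels
# starting with ".L" are assembler-local and never reach the symbol table.
def survives_in_symtab(label: str) -> bool:
    return not label.startswith(".L")


# ".L_restore" and ".Lbad_gs" come from the grep output quoted above;
# "_restore" and "bad_gs" are illustrative names with the prefix dropped.
labels = [".L_restore", "_restore", ".Lbad_gs", "bad_gs"]
kept = [name for name in labels if survives_in_symtab(name)]
print(kept)  # ['_restore', 'bad_gs']
```

Only the labels without the .L prefix get a symbol table entry (with STB_LOCAL binding), which is what gives objtool something to reference.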

However, if, as you said, teaching objtool to synthesize
STT_SECTION symbols from the section header table is difficult,
this patch looks fine to me.
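
For reference, a rough Python sketch of what "synthesizing STT_SECTION from the section header table" could look like. This is not objtool code: the function names and the (index, name, flags) / (shndx, type) tuples are assumptions for illustration. Only the Elf64_Sym layout and the STB_LOCAL, STT_SECTION and SHF_EXECINSTR constants come from the ELF specification.

```python
import struct

STB_LOCAL = 0        # ELF symbol binding: local
STT_SECTION = 3      # ELF symbol type: section
SHF_EXECINSTR = 0x4  # ELF section flag: executable instructions

# Little-endian Elf64_Sym: st_name, st_info, st_other, st_shndx,
# st_value, st_size (24 bytes in total).
ELF64_SYM = struct.Struct("<IBBHQQ")


def make_section_symbol(shndx: int) -> bytes:
    """Build an unnamed local STT_SECTION symbol for section index shndx."""
    st_info = (STB_LOCAL << 4) | STT_SECTION
    return ELF64_SYM.pack(0, st_info, 0, shndx, 0, 0)


def synthesize_missing(sections, symbols):
    """Return {section name: packed symbol} for executable sections that
    have no STT_SECTION symbol yet.

    sections: iterable of (index, name, flags)   -- hypothetical input shape
    symbols:  iterable of (shndx, symbol_type)   -- hypothetical input shape
    """
    have = {shndx for shndx, sym_type in symbols if sym_type == STT_SECTION}
    return {
        name: make_section_symbol(index)
        for index, name, flags in sections
        if flags & SHF_EXECINSTR and index not in have
    }


# 0x6 = SHF_ALLOC | SHF_EXECINSTR, 0x3 = SHF_WRITE | SHF_ALLOC
sections = [(1, ".text", 0x6), (2, ".data", 0x3), (3, ".text.unlikely", 0x6)]
symbols = [(3, STT_SECTION)]  # only .text.unlikely kept its section symbol
new_symbols = synthesize_missing(sections, symbols)
print(sorted(new_symbols))  # ['.text']
```

The sketch stops at building the entries. Appending them to .symtab and fixing up sh_info and any affected relocations is the part that would make objtool start rewriting the symbol table.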

Reviewed-by: Fangrui Song <maskray@xxxxxxxxxx>

> Alternatively, the assemblers could add an option to only strip
> -ffunction-sections and -fdata-sections STT_SECTION symbols, e.g. leave
> ".text" and friends alone.

I forgot to mention that --generate-unused-section-symbols=[yes|no] was
not added to GNU as.
Making the assembler behavior dependent on -ffunction-sections is not
an option in either the LLVM integrated assembler or GNU as.