Re: [RFC] x86/kvm/emulate: Avoid RET for fastops

From: Peter Zijlstra
Date: Wed Nov 29 2023 - 02:21:17 EST


On Tue, Nov 28, 2023 at 05:37:52PM -0800, Sean Christopherson wrote:
> On Sun, Nov 12, 2023, Peter Zijlstra wrote:
> > Hi,
> >
> > Inspired by the likes of ba5ca5e5e6a1 ("x86/retpoline: Don't clobber
> > RFLAGS during srso_safe_ret()") I had it on my TODO to look at this,
> > because the call-depth-tracking rethunk definitely also clobbers flags
> > and that's a ton harder to fix.
> >
> > Looking at this recently I noticed that there's really only one callsite
> > (twice, the testcc thing is basically separate from the rest of the
> > fastop stuff) and thus CALL+RET is totally silly, we can JMP+JMP.
> >
> > The below implements this, and aside from objtool going apeshit (it
> > fails to recognise the fastop JMP_NOSPEC as a jump-table and instead
> > classifies it as a tail-call), it actually builds and the asm looks
> > sensible enough.
> >
> > I've not yet figured out how to test this stuff, but does something like
> > this look sane to you guys?
>
> Yes? The idea seems sound, but I haven't thought _that_ hard about whether or not
> there are any possible gotchas. I did a quick test and nothing exploded (and
> usually when this code breaks, it breaks spectacularly).

That's encouraging.

> > Given that rethunks are quite fat and slow, this could be sold as a
> > performance optimization I suppose.
> >
> > ---
> >
> > diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> > index f93e9b96927a..2cd3b5a46e7a 100644
> > --- a/arch/x86/include/asm/nospec-branch.h
> > +++ b/arch/x86/include/asm/nospec-branch.h
> > @@ -412,6 +412,17 @@ static inline void call_depth_return_thunk(void) {}
> > "call *%[thunk_target]\n", \
> > X86_FEATURE_RETPOLINE_LFENCE)
> >
> > +# define JMP_NOSPEC \
> > + ALTERNATIVE_2( \
> > + ANNOTATE_RETPOLINE_SAFE \
> > + "jmp *%[thunk_target]\n", \
> > + "jmp __x86_indirect_thunk_%V[thunk_target]\n", \
> > + X86_FEATURE_RETPOLINE, \
> > + "lfence;\n" \
> > + ANNOTATE_RETPOLINE_SAFE \
> > + "jmp *%[thunk_target]\n", \
> > + X86_FEATURE_RETPOLINE_LFENCE)
>
> This needs a 32-bit version (eww) and a CONFIG_RETPOLINE=n version. :-/

I'll go make that happen. Thanks!
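
For anyone following along, the CALL+RET versus JMP+JMP point quoted above can be
sketched in plain C. This is only an illustration, not the patch: it assumes the
GCC/Clang "labels as values" extension, and demo_op / demo_dispatch are made-up
names, not anything in arch/x86/kvm/emulate.c. Because there is a single dispatch
site, each op body can jump straight back to one fixed continuation instead of
returning, so no RET (and no return thunk) is involved.

```c
/* Hypothetical op list for illustration; not the kernel's fastop table. */
enum demo_op { DEMO_ADD, DEMO_SUB, DEMO_AND, DEMO_NR };

static long demo_dispatch(enum demo_op op, long dst, long src)
{
	/* Table of label addresses (GCC/Clang "labels as values" extension). */
	static const void *const table[DEMO_NR] = {
		[DEMO_ADD] = &&do_add,
		[DEMO_SUB] = &&do_sub,
		[DEMO_AND] = &&do_and,
	};

	/* First JMP: indirect jump into the op body. Caller must pass a valid op. */
	goto *table[op];

do_add:
	dst += src;
	goto out;	/* Second JMP: direct jump to the single continuation. */
do_sub:
	dst -= src;
	goto out;
do_and:
	dst &= src;
	goto out;
out:
	return dst;
}
```

The real fastops additionally need RFLAGS to survive from the op body to the
continuation, which is what makes a flag-clobbering return thunk a problem; the
C sketch shows only the control-flow shape.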