Re: [RFC][PATCH 02/17] x86/cpu: Clean up SRSO return thunk mess

From: Peter Zijlstra
Date: Thu Aug 10 2023 - 08:38:29 EST


On Thu, Aug 10, 2023 at 01:51:48PM +0200, Borislav Petkov wrote:
> On Wed, Aug 09, 2023 at 09:12:20AM +0200, Peter Zijlstra wrote:
> > Where Zen1/2 flush the BTB using the instruction decoder trick
> > (test,movabs) Zen3/4 use instruction aliasing. SRSO adds RSB (RAP in
>
> BTB aliasing.
>
> > AMD speak) stuffing to force a return mis-predict.
>
> No it doesn't. It causes BTB aliasing which evicts any potentially
> poisoned entries.

It does; so zen1/2 use the decoder thing to flush BTB entry of the RET,
both retbleed and srso do.

Then zen3/4 use the aliassing trick to flush the BTB entry of the RET.

Then both srso options use RSB/RAP stuffing to force a mispredict there.
Retbleed doesn't do this.

retbleed is about BTB, srso does both BTB and RSB/RAP.

> > That is; the AMD retbleed is a form of Speculative-Type-Confusion
> > where the branch predictor is trained to use the BTB to predict the
> > RET address, while AMD inception/SRSO is a form of
> > Speculative-Type-Confusion where another instruction is trained to be
> > treated like a CALL instruction and poison the RSB (RAP).
>
> Nope, Andy explained it already in the 0th message.

I'm still of the opinion that branch-type-confusion is an integral part
of setting up the srso RSB/RAP trickery. It just targets a different
predictor, RSB/RAP vs BTB.

> > Pick one of three options at boot.
>
> Yes, provided microarchitecturally that works, I'm all for removing the
> __ret alternative.

So this patch doesn't actually change anything except one layer of
indirection.

Your thing does:

SYNC_FUNC_START(foo)
...
ALTERNATIVE "ret; int3",
"jmp __x86_return_thunk", X86_FEATURE_RETHUNK
SYM_FUNC_END(foo)

SYM_FUNC_START(__x86_return_thunk)
ALTERNATIVE("jmp __ret",
"call srso_safe_ret", X86_FEATURE_SRSO,
"call srso_alias_safe_ret", X86_FEATURE_SRSO_ALIAS);
int3
SYM_FUNC_END(__x86_return_thunk)


So what was RET, jumps to __x86_return_thunk, which then jumps to the
actual return thunk.

After this patch things look equivalent to:

SYM_FUNC_START(foo)
...
ALTERNATIVE "ret; int3"
"jmp __x86_return_thunk", X86_FEATURE_RETHUNK
"jmp srso_return_thunk, X86_FEATURE_SRSO
"jmp srsi_alias_return_thunk", X86_FEATURE_SRSO_ALIAS
SYM_FUNC_END(foo)

SYM_CODE_START(srso_return_thunk)
UNWIND_HINT_FUNC
ANNOTATE_NOENDBR
call srso_safe_ret;
ud2
SYM_CODE_END(srso_return_thunk)

SYM_CODE_START(srso_alias_return_thunk)
UNWIND_HINT_FUNC
ANNOTATE_NOENDBR
call srso_alias_safe_ret;
ud2
SYM_CODE_END(srso_alias_return_thunk)


Except of course we don't have an actual ALTERNATIVE at the ret site,
but .return_sites and rewriting things to either "ret; int3" or whatever
function is in x86_return_thunk.


Before this patch, only one ret thunk is used at any one time, after
this patch still only one ret thunk is used.

fundamentally, you can only ever use one ret.

IOW this patch changes nothing for SRSO, it still does a jump to a call.
But it does clean up retbleed, which you had as a jump to a jump, back
to just a jump, and it does get rid of that extra alternative layer yo
had by using the one we already have at .return_sites rewrite.