Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches

From: Andrew . Cooper3
Date: Wed Aug 09 2023 - 06:04:27 EST


On 09/08/2023 8:12 am, Peter Zijlstra wrote:
> Since I wasn't invited to the party (even though I did retbleed), I get to
> clean things up afterwards :/
>
> Anyway, this here overhauls the SRSO patches in a big way.
>
> I claim that AMD retbleed (also called Speculative-Type-Confusion

Branch Type Confusion.

Speculative Type Confusion is something else; generally Spectre v1 or v2
around a logical type check, usually ending up confusing pointers and
integers.

It appears that you might be suffering from Type-of-Speculative-Bug
Confusion, an affliction brought on by the chronic lack of documentation
and consistency, the fact that almost everything has at least 2 names,
and that, 6 years into this horror show, it's not showing any sign of
slowing down.

> -- not to be
> confused with Intel retbleed, which is an entirely different bug) is
> fundamentally the same as this SRSO -- which is also caused by STC. And the
> mitigations are so similar they should all be controlled from a single spot and
> not conflated like they are now.

BTC and SRSO are certainly related, but they're not the same.

With BTC, an attacker poisons a branch type prediction to say "that
thing (which isn't actually a ret) is a ret".
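
As a rough illustration of what "poisoning a branch type prediction"
means, here is a toy model in C.  It is a sketch under stated
assumptions, not a description of any real predictor: the table size,
the low-bits index function and all toy_* names are made up for the
example.  All it shows is that if two addresses alias in the predictor,
training one of them as a ret makes the other be predicted as a ret too.

```c
#include <assert.h>
#include <stdint.h>

enum toy_branch_type { TOY_NOT_BRANCH = 0, TOY_JMP, TOY_CALL, TOY_RET };

/* Assumed table size, for illustration only. */
#define TOY_BTB_ENTRIES 256

struct toy_btb {
	enum toy_branch_type type[TOY_BTB_ENTRIES];
};

/* Assumed index function: low bits only, so distinct addresses alias. */
static unsigned int toy_btb_index(uint64_t addr)
{
	return (unsigned int)(addr % TOY_BTB_ENTRIES);
}

/* Executing a branch trains the entry its address maps to. */
static void toy_btb_train(struct toy_btb *btb, uint64_t addr,
			  enum toy_branch_type t)
{
	btb->type[toy_btb_index(addr)] = t;
}

/* The frontend consults the table before the instruction is decoded. */
static enum toy_branch_type toy_btb_predict(const struct toy_btb *btb,
					    uint64_t addr)
{
	return btb->type[toy_btb_index(addr)];
}

/*
 * The attacker executes a real ret at attacker_addr; the victim then
 * fetches some non-ret instruction at victim_addr.  Returns the type
 * the toy predictor reports for the victim instruction.
 */
static enum toy_branch_type toy_btc_demo(uint64_t attacker_addr,
					 uint64_t victim_addr)
{
	struct toy_btb btb = { .type = { TOY_NOT_BRANCH } };

	assert(attacker_addr != victim_addr);
	toy_btb_train(&btb, attacker_addr, TOY_RET);
	return toy_btb_predict(&btb, victim_addr);
}
```

With these assumptions, toy_btc_demo(0x1234, 0x5634) reports TOY_RET
because both addresses share the low byte 0x34, while
toy_btc_demo(0x1234, 0x5635) reports TOY_NOT_BRANCH.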

With SRSO, an attacker leaves a poisoned infinite-call-loop prediction.
Later, a real function (that is architecturally correct execution and
will retire) trips over the predicted infinite loop, which overflows the
RSB/RAS/RAP, replacing the correct prediction on the top with the
attacker's choice of value.

So while branch type confusion is used to poison the top-of-RSB value,
the ret that actually goes wrong needs a correct type=ret prediction for
the SRSO attack to succeed.
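
The overflow step can be sketched with a toy circular return stack in C.
This is a minimal model under loudly stated assumptions, not the real
microarchitecture: the depth of 16, the wrap-around overwrite, and the
idea that squashing speculation restores the top pointer but not the
slot contents are all assumptions made for the example, and every toy_*
name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed depth; real RSB/RAS/RAP sizes are CPU-specific. */
#define TOY_RSB_DEPTH 16

struct toy_rsb {
	uint64_t slot[TOY_RSB_DEPTH];
	size_t top;		/* number of pushes not yet popped */
};

/* A call pushes its return address; a full buffer wraps around and
 * overwrites the oldest entry. */
static void toy_rsb_push(struct toy_rsb *rsb, uint64_t ret_addr)
{
	rsb->slot[rsb->top % TOY_RSB_DEPTH] = ret_addr;
	rsb->top++;
}

/* A ret predicted as type=ret takes its target from the top entry. */
static uint64_t toy_rsb_pop(struct toy_rsb *rsb)
{
	assert(rsb->top > 0);
	rsb->top--;
	return rsb->slot[rsb->top % TOY_RSB_DEPTH];
}

/*
 * One legitimate call pushes 0x1000, then a predicted call loop pushes
 * the attacker's address spec_calls times before being squashed.
 * Assumption: the squash restores the top pointer only.  Returns the
 * target predicted for the legitimate function's ret.
 */
static uint64_t toy_srso_demo(size_t spec_calls)
{
	struct toy_rsb rsb = { .top = 0 };
	size_t saved_top;

	toy_rsb_push(&rsb, 0x1000);	/* correct return address */
	saved_top = rsb.top;

	for (size_t i = 0; i < spec_calls; i++)
		toy_rsb_push(&rsb, 0xbad);	/* attacker's choice */

	rsb.top = saved_top;		/* squash: pointer only */
	return toy_rsb_pop(&rsb);
}
```

In this model a short speculative loop is harmless
(toy_srso_demo(15) still predicts 0x1000), but once the loop wraps the
buffer (toy_srso_demo(16) and beyond) the correctly typed ret is
predicted to go to 0xbad.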


Both issues can be mitigated with IBPB-on-entry (given up-to-date
microcode in some cases).

Both issues have a software sequence that tries to make the contents of
a __x86_return_thunk sequence safe to use.  For BTC, it's simply a case
of ensuring the type prediction of the one ret is good.  For SRSO, it's
something more complicated and I don't know the uarch details fully.

~Andrew