Re: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=

From: Josh Poimboeuf
Date: Wed Aug 09 2023 - 11:43:46 EST


On Wed, Aug 09, 2023 at 05:08:52PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 09, 2023 at 10:28:47AM -0400, Josh Poimboeuf wrote:
> > On Wed, Aug 09, 2023 at 04:06:03PM +0200, Peter Zijlstra wrote:
> > > On Wed, Aug 09, 2023 at 09:42:33AM -0400, Josh Poimboeuf wrote:
> > > > On Wed, Aug 09, 2023 at 09:12:24AM +0200, Peter Zijlstra wrote:
> > > > > static enum retbleed_mitigation retbleed_mitigation __ro_after_init =
> > > > > @@ -796,6 +802,10 @@ static int __init retbleed_parse_cmdline
> > > > > retbleed_cmd = RETBLEED_CMD_AUTO;
> > > > > } else if (!strcmp(str, "unret")) {
> > > > > retbleed_cmd = RETBLEED_CMD_UNRET;
> > > > > + } else if (!strcmp(str, "srso")) {
> > > > > + retbleed_cmd = RETBLEED_CMD_UNRET_SRSO;
> > > > > + } else if (!strcmp(str, "srso_alias")) {
> > > > > + retbleed_cmd = RETBLEED_CMD_UNRET_SRSO_ALIAS;
> > > >
> > > > It doesn't make sense for "srso_alias" to be a separate cmdline option,
> > > > as that option is a model-dependent variant of the SRSO mitigation.
> > >
> > > so what I did with retbleed, and what should be fixed here too (I
> > > forgot) is run with:
> > >
> > > retbleed=force,unret
> > >
> > > on any random machine (typically Intel, because I have a distinct lack
> > > of AMD machines :-() and look at the life kernel image to see if all the
> > > patching worked.
> > >
> > > I suppose I should add:
> > >
> > > setup_force_cpu_bug(X86_BUG_SRSO);
> > >
> > > to the 'force' option, then:
> > >
> > > retbleed=force,srso_alias
> > >
> > > should function the same, irrespective of the hardware.
> > >
> > > I'm also of the opinion that the kernel should do as told, even if it
> > > doesn't make sense. If you tell it nonsense, you get to keep the pieces.
> > >
> > > So in that light, yes I think we should have separate options.
> >
> > What if I want the SRSO mitigation regardless of CPU model?
>
> You mean, you want to say 'any of the SRSO things, you pick the right
> version?'
>
> Which means the user has an AMD machine, but can't be arsed to figure
> out which and somehow doesn't want to use AUTO?

Well, nobody's going to use any of these options anyway so it doesn't
matter regardless.

> > > They should, the discussions we had back then explained the Zen1/2
> > > retbleed case in quite some detail and the srso case matches that
> > > exactly with the movabs. A larger instruction is used because we need a
> > > larger embedded sequence of instructions, but otherwise it is identical.
> > >
> > > The comments provided for srso_alias state the BTB is untrained using
> > > the explicit aliasing.
> > >
> > > That is to say, AFAIU any of this, yes both srso options untrain the BTB
> > > and mitigate the earlier retbleed thing.
> > >
> > > SRSO then goes one step further with the RAP/RSB clobber.
> >
> > Ah, nice. Please add that information somewhere (e.g., one of the
> > commit logs).
>
> The comment above zen_untrain_ret (or retbleed_untrain_ret as you've
> christened it) not clear enough?
>
> /*
> * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
> * 1) The RET at retbleed_return_thunk must be on a 64 byte boundary, for
> * alignment within the BTB.
> * 2) The instruction at retbleed_untrain_ret must contain, and not
> * end with, the 0xc3 byte of the RET.
> * 3) STIBP must be enabled, or SMT disabled, to prevent the sibling thread
> * from re-poisioning the BTB prediction.
> */

To me, it's only clear now that you connected the dots.

> Hmm, when compared to:
>
> .align 64
> .skip 64 - (srso_safe_ret - srso_untrain_ret), 0xcc
> SYM_START(srso_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
> ANNOTATE_NOENDBR
> .byte 0x48, 0xb8
>
> SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLOBAL)
> add $8, %_ASM_SP
> ret
> int3
> int3
> int3
> /* end of movabs */
> lfence
> jmp srso_return_thunk
> int3
> SYM_CODE_END(srso_safe_ret)
> SYM_FUNC_END(srso_untrain_ret)
>
> Then we match 2, srso_safe_ret is strictly *inside* the movabs, that is,
> it is not the first nor the last byte of the outer instruction.
>
> However, we fail at 1, 'add $8, %rsp' sits at the boundary, not ret.
>
> Anybody, help ?

Um... yeah, doesn't look right.

--
Josh