Re: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=

From: Peter Zijlstra
Date: Wed Aug 09 2023 - 10:06:20 EST


On Wed, Aug 09, 2023 at 09:42:33AM -0400, Josh Poimboeuf wrote:
> On Wed, Aug 09, 2023 at 09:12:24AM +0200, Peter Zijlstra wrote:
> > static enum retbleed_mitigation retbleed_mitigation __ro_after_init =
> > @@ -796,6 +802,10 @@ static int __init retbleed_parse_cmdline
> > retbleed_cmd = RETBLEED_CMD_AUTO;
> > } else if (!strcmp(str, "unret")) {
> > retbleed_cmd = RETBLEED_CMD_UNRET;
> > + } else if (!strcmp(str, "srso")) {
> > + retbleed_cmd = RETBLEED_CMD_UNRET_SRSO;
> > + } else if (!strcmp(str, "srso_alias")) {
> > + retbleed_cmd = RETBLEED_CMD_UNRET_SRSO_ALIAS;
>
> It doesn't make sense for "srso_alias" to be a separate cmdline option,
> as that option is a model-dependent variant of the SRSO mitigation.

so what I did with retbleed, and what should be fixed here too (I
forgot) is run with:

retbleed=force,unret

on any random machine (typically Intel, because I have a distinct lack
of AMD machines :-() and look at the life kernel image to see if all the
patching worked.

I suppose I should add:

setup_force_cpu_bug(X86_BUG_SRSO);

to the 'force' option, then:

retbleed=force,srso_alias

should function the same, irrespective of the hardware.

I'm also of the opinion that the kernel should do as told, even if it
doesn't make sense. If you tell it nonsense, you get to keep the pieces.

So in that light, yes I think we should have separate options.

> > + if (boot_cpu_has_bug(X86_BUG_SRSO)) {
> > + has_microcode = boot_cpu_has(X86_FEATURE_IBPB_BRTYPE) || cpu_has_ibpb_brtype_microcode();
> > + if (!has_microcode) {
> > + pr_warn("IBPB-extending microcode not applied!\n");
> > + pr_warn(RETBLEED_SRSO_NOTICE);
> > + } else {
> > + /*
> > + * Enable the synthetic (even if in a real CPUID leaf)
> > + * flags for guests.
> > + */
> > + setup_force_cpu_cap(X86_FEATURE_IBPB_BRTYPE);
> > + setup_force_cpu_cap(X86_FEATURE_SBPB);
> > +
> > + /*
> > + * Zen1/2 with SMT off aren't vulnerable after the right
> > + * IBPB microcode has been applied.
> > + */
> > + if ((boot_cpu_data.x86 < 0x19) &&
> > + (cpu_smt_control == CPU_SMT_DISABLED))
> > + setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
>
> The rumor I heard was that SMT had to be disabled specifically by BIOS
> for this condition to be true. Can somebody from AMD confirm?

David Kaplan is on Cc, I was hoping he would enlighten us -- once he's
back from PTO.

> The above check doesn't even remotely do that, in fact it does the
> opposite. Regardless, at the very least it should be checking
> cpu_smt_possible(). I guess that would be a separate patch as it's a
> bug fix.
>
> > @@ -870,8 +915,17 @@ static void __init retbleed_select_mitig
> > default:
> > if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
> > boot_cpu_data.x86_vendor == X86_VENDOR_HYGON) {
> > - if (IS_ENABLED(CONFIG_CPU_UNRET_ENTRY))
> > - retbleed_mitigation = RETBLEED_MITIGATION_UNRET;
> > + if (IS_ENABLED(CONFIG_CPU_UNRET_ENTRY)) {
> > + if (boot_cpu_has_bug(X86_BUG_RETBLEED))
> > + retbleed_mitigation = RETBLEED_MITIGATION_UNRET;
> > +
> > + if (boot_cpu_has_bug(X86_BUG_SRSO) && !boot_cpu_has(X86_FEATURE_SRSO_NO)) {
> > + if (boot_cpu_data.x86 == 0x19)
> > + retbleed_mitigation = RETBLEED_MITIGATION_UNRET_SRSO_ALIAS;
> > + else
> > + retbleed_mitigation = RETBLEED_MITIGATION_UNRET_SRSO;
>
> It would be great to get confirmation from somebody at AMD that the SRSO
> mitigations supersede the Retbleed one, i.e., that the SRSO mitigations
> also fix Retbleed.

They should, the discussions we had back then explained the Zen1/2
retbleed case in quite some detail and the srso case matches that
exactly with the movabs. A larger instruction is used because we need a
larger embedded sequence of instructions, but otherwise it is identical.

The comments provided for srso_alias state the BTB is untrained using
the explicit aliasing.

That is to say, AFAIU any of this, yes both srso options untrain the BTB
and mitigate the earlier retbleed thing.

SRSO then goes one step further with the RAP/RSB clobber.

> Yes, the original patches might imply that, but they also seem confused
> since they do the double untraining for both Retbleed and SRSO. And I
> was given conflicting information.

Which makes no bloody sense, since the double untrain is for another
address, which is not at all used.

> Even better, we could really use some official AMD documentation for
> this mitigation, since right now it all feels very unofficial and
> ubsubstantiated.

Seconded. David is there anything you can do to clarify this stuff?

>
> > + }
> > + }
> > else if (IS_ENABLED(CONFIG_CPU_IBPB_ENTRY) && boot_cpu_has(X86_FEATURE_IBPB))
> > retbleed_mitigation = RETBLEED_MITIGATION_IBPB;
>
> Here should have the microcode check too:
>
> if (boot_cpu_has_bug(X86_BUG_SRSO) && !has_microcode)
> pr_err("IBPB-extending microcode not applied; SRSO NOT mitigated\n");

That earlier printk is unconditional of the selected mitigation, you
really want it printed again?