Re: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=

From: Josh Poimboeuf
Date: Wed Aug 09 2023 - 10:39:23 EST


On Wed, Aug 09, 2023 at 03:31:20PM +0100, Andrew.Cooper3@xxxxxxxxxx wrote:
> On 09/08/2023 2:42 pm, Josh Poimboeuf wrote:
> > On Wed, Aug 09, 2023 at 09:12:24AM +0200, Peter Zijlstra wrote:
> >> +	if (boot_cpu_has_bug(X86_BUG_SRSO)) {
> >> +		has_microcode = boot_cpu_has(X86_FEATURE_IBPB_BRTYPE) || cpu_has_ibpb_brtype_microcode();
> >> +		if (!has_microcode) {
> >> +			pr_warn("IBPB-extending microcode not applied!\n");
> >> +			pr_warn(RETBLEED_SRSO_NOTICE);
> >> +		} else {
> >> +			/*
> >> +			 * Enable the synthetic (even if in a real CPUID leaf)
> >> +			 * flags for guests.
> >> +			 */
> >> +			setup_force_cpu_cap(X86_FEATURE_IBPB_BRTYPE);
> >> +			setup_force_cpu_cap(X86_FEATURE_SBPB);
> >> +
> >> +			/*
> >> +			 * Zen1/2 with SMT off aren't vulnerable after the right
> >> +			 * IBPB microcode has been applied.
> >> +			 */
> >> +			if ((boot_cpu_data.x86 < 0x19) &&
> >> +			    (cpu_smt_control == CPU_SMT_DISABLED))
> >> +				setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
> > The rumor I heard was that SMT had to be disabled specifically by BIOS
> > for this condition to be true. Can somebody from AMD confirm?
>
> It's Complicated.
>
> On Zen1/2, uarch constraints mitigate SRSO when the core is in 1T mode,
> where such an attack would succeed in 2T mode.  Specifically, it is
> believed that the SRSO infinite-call-loop can poison more than 16
> RSB/RAS/RAP entries, but can't poison 32 entries.
>
> The RSB dynamically repartitions depending on the idleness of the
> sibling.  Therefore, offlining/parking the siblings should make you
> safe.  (Assuming you can handwave away the NMI hitting the parked thread
> case as outside of an attacker's control.)
>
>
> In Xen, I decided that synthesizing SRSO_NO was only safe when SMT was
> disabled by firmware, because that's the only case where it can't cease
> being true later by admin action.
>
> If it were just Xen's safety that mattered here, it might be ok to allow
> the OS SMT=0 cases, but this bit needs to get into guests, and you can't
> credibly tell the guest SRSO_NO and then make it unsafe at a later point.

Thanks for that explanation. It sounds like we can use
!cpu_smt_possible() here.
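
As a rough, untested illustration of the difference, here is a userspace
model of the check. The enum mirrors the kernel's enum cpuhp_smt_control
and smt_possible() follows the logic of cpu_smt_possible(); the names
smt_control, smt_possible() and can_force_srso_no() are made up for this
sketch and are not kernel API.

```c
#include <stdbool.h>

/* Userspace stand-in mirroring the kernel's enum cpuhp_smt_control. */
enum smt_control {
	SMT_ENABLED,
	SMT_DISABLED,		/* turned off at runtime, can be turned back on */
	SMT_FORCE_DISABLED,	/* forced off, cannot be re-enabled */
	SMT_NOT_SUPPORTED,	/* no siblings present, e.g. SMT off in BIOS */
	SMT_NOT_IMPLEMENTED,
};

/* Same logic as the kernel's cpu_smt_possible(). */
static bool smt_possible(enum smt_control ctrl)
{
	return ctrl != SMT_FORCE_DISABLED && ctrl != SMT_NOT_SUPPORTED;
}

/*
 * Hypothetical helper: only synthesize SRSO_NO on pre-Zen3 parts
 * (family < 0x19) when SMT can never come back later.
 */
bool can_force_srso_no(unsigned int family, enum smt_control ctrl)
{
	return family < 0x19 && !smt_possible(ctrl);
}
```

With this, a Zen2 box that only has SMT switched off at runtime
(SMT_DISABLED) no longer gets SRSO_NO, since an admin could re-enable SMT
afterwards. The forced-off and not-supported cases still do.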

--
Josh