Re: [PATCH 1/2] x86/random: Retry on RDSEED failure

From: Tom Lendacky
Date: Wed Feb 14 2024 - 14:47:00 EST


On 2/14/24 11:21, Jason A. Donenfeld wrote:
Hi Elena,

On Wed, Feb 14, 2024 at 4:18 PM Reshetova, Elena <elena.reshetova@xxxxxxxxx> wrote:
"The RdRand in a non-defective device is designed to be faster than the bus,
so when a core accesses the output from the DRNG, it will always get a
random number.
As a result, it is hard to envision a scenario where the RdRand, on a fully
functional device, will underflow.
The carry flag after RdRand signals an underflow so in the case of a defective chip,
this will prevent the code thinking it has a random number when it does not.

That's really great news, especially combined with a very similar
statement from Borislav about AMD chips:

On Fri, Feb 9, 2024 at 10:45 PM Borislav Petkov <bp@xxxxxxxxx> wrote:
Yeah, I know exactly what you mean and I won't go into details for
obvious reasons. Two things:

* Starting with Zen3, provided the hw is properly configured, RDRAND will
never fail. It is also fair when feeding the different contexts.

I assume that this faster-than-the-bus-ness also takes into account the
various accesses required to even switch contexts when scheduling VMs,
so your proposed host-guest scheduling attack can't really happen
either. Correct?

One clarifying question in all of this: what is the point of the "try 10
times" advice? Is the "faster than the bus" statement actually "faster
than the bus if you try 10 times"? Or is the "10 times" advice just old
and no longer relevant?

In other words, is the following a reasonable patch?

diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
index 02bae8e0758b..2d5bf5aa9774 100644
--- a/arch/x86/include/asm/archrandom.h
+++ b/arch/x86/include/asm/archrandom.h
@@ -13,22 +13,16 @@
 #include <asm/processor.h>
 #include <asm/cpufeature.h>
-#define RDRAND_RETRY_LOOPS	10
-
 /* Unconditional execution of RDRAND and RDSEED */
 static inline bool __must_check rdrand_long(unsigned long *v)
 {
 	bool ok;
-	unsigned int retry = RDRAND_RETRY_LOOPS;
-	do {
-		asm volatile("rdrand %[out]"
-			     CC_SET(c)
-			     : CC_OUT(c) (ok), [out] "=r" (*v));
-		if (ok)
-			return true;
-	} while (--retry);
-	return false;
+	asm volatile("rdrand %[out]"
+		     CC_SET(c)
+		     : CC_OUT(c) (ok), [out] "=r" (*v));
+	WARN_ON(!ok);
+	return ok;

Don't forget that Linux will run on older hardware as well, so the 10
retries might be valid for that. Or do you intend this change purely for
CVMs?

Thanks,
Tom

}
static inline bool __must_check rdseed_long(unsigned long *v)

(As for the RDSEED clarification, that also matches Borislav's reply, is
what we expected and knew experimentally, and doesn't really have any
bearing on Linux's RNG or this discussion, since RDRAND is all we need
anyway.)

Regards,
Jason
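
For readers following the thread, the two behaviours under discussion (a bounded retry loop versus a single attempt) can be compared with a small portable sketch. This is not kernel code and not the patch above: rng_try_fn, rng_retry, struct flaky_source and flaky_try are hypothetical names invented for illustration, and the "hardware" is a stand-in that fails a configurable number of times before succeeding. On real hardware the single attempt would be one RDRAND with the carry flag as the success indicator.

```c
#include <stdbool.h>

/*
 * Hypothetical callback type: one attempt at reading a random value.
 * Returns true on success (the analogue of RDRAND setting the carry flag).
 */
typedef bool (*rng_try_fn)(unsigned long *v, void *ctx);

/*
 * Bounded retry wrapper: call try_once up to max_tries times and stop at
 * the first success. With max_tries == 1 this degenerates to the
 * single-attempt behaviour proposed in the patch.
 */
bool rng_retry(rng_try_fn try_once, void *ctx, unsigned int max_tries,
	       unsigned long *v)
{
	for (unsigned int i = 0; i < max_tries; i++) {
		if (try_once(v, ctx))
			return true;
	}
	return false;
}

/* Stand-in source for illustration: fails failures_left times, then succeeds. */
struct flaky_source {
	unsigned int failures_left;
	unsigned long value;
};

bool flaky_try(unsigned long *v, void *ctx)
{
	struct flaky_source *s = ctx;

	if (s->failures_left) {
		s->failures_left--;
		return false;	/* underflow: *v is left untouched */
	}
	*v = s->value;
	return true;
}
```

With a source that underflows three times, rng_retry(flaky_try, &src, 10, &v) succeeds while rng_retry(flaky_try, &src, 1, &v) fails, which is the case Tom raises for older hardware. If the hardware never underflows, the two calls behave identically.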