Re: [PATCH 1/2] x86/random: Retry on RDSEED failure

From: Dave Hansen
Date: Wed Feb 14 2024 - 15:14:57 EST


On 2/14/24 09:21, Jason A. Donenfeld wrote:
> One clarifying question in all of this: what is the point of the "try 10
> times" advice? Is the "faster than the bus" statement actually "faster
> than the bus if you try 10 times"? Or is the "10 times" advice just old
> and not relevant.
>
> In other words, is the following a reasonable patch?
>
> diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
> index 02bae8e0758b..2d5bf5aa9774 100644
> --- a/arch/x86/include/asm/archrandom.h
> +++ b/arch/x86/include/asm/archrandom.h
> @@ -13,22 +13,16 @@
> #include <asm/processor.h>
> #include <asm/cpufeature.h>
>
> -#define RDRAND_RETRY_LOOPS 10
> -
> /* Unconditional execution of RDRAND and RDSEED */
>
> static inline bool __must_check rdrand_long(unsigned long *v)
> {
> bool ok;
> - unsigned int retry = RDRAND_RETRY_LOOPS;
> - do {
> - asm volatile("rdrand %[out]"
> - CC_SET(c)
> - : CC_OUT(c) (ok), [out] "=r" (*v));
> - if (ok)
> - return true;
> - } while (--retry);
> - return false;
> + asm volatile("rdrand %[out]"
> + CC_SET(c)
> + : CC_OUT(c) (ok), [out] "=r" (*v));
> + WARN_ON(!ok);
> + return ok;
> }

The key question here is whether RDRAND can ever fail on perfectly good hardware.

I think it's theoretically possible for the entropy source health checks
to fail on perfectly good hardware for an arbitrarily long time. But
the odds of this happening to the point of it affecting RDRAND are
rather small.

There's a reason that the guidance says: "the odds of ten failures in a
row are astronomically small" _instead_ of claiming the same about a
single RDRAND.
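
To put rough shape on that distinction, here is a back-of-the-envelope
sketch. It assumes each attempt fails independently with the same
probability p, which is a simplification of mine and not something the
guidance states; a real health-check failure could well be correlated
across back-to-back attempts. The function name is made up for
illustration.

```c
/*
 * Probability that n attempts in a row all fail, ASSUMING every attempt
 * fails independently with probability p. Independence is an assumption
 * made for illustration only; it is not a documented hardware property.
 */
static double all_fail_probability(double p, unsigned int n)
{
	double result = 1.0;

	/* Multiply p together n times: p^n. */
	while (n--)
		result *= p;
	return result;
}
```

With a purely hypothetical p of 1e-3, a single attempt fails one time
in a thousand, while ten in a row come out on the order of 1e-30. The
per-attempt number is invented; the point is only how fast the product
shrinks compared to a single RDRAND.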

Given the scale that the kernel operates at, I think we should leave the
retry loop in place.
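
For reference, the loop has the following shape when pulled out into
plain userspace C. This is only a sketch: the single-shot source is
passed in as a callback so the control flow can be exercised without
the instruction itself, and retry_random(), try_once_fn, struct flaky
and flaky_try() are hypothetical names, not kernel interfaces. Only the
count of 10 and the do/while structure come from the quoted diff.

```c
#include <stdbool.h>

#define RETRY_LOOPS 10	/* same count as RDRAND_RETRY_LOOPS in the diff */

/* Hypothetical single-shot source: returns true and fills *v on success. */
typedef bool (*try_once_fn)(unsigned long *v, void *ctx);

/* Try up to RETRY_LOOPS times; report failure only if every try fails. */
static bool retry_random(try_once_fn try_once, void *ctx, unsigned long *v)
{
	unsigned int retry = RETRY_LOOPS;

	do {
		if (try_once(v, ctx))
			return true;
	} while (--retry);
	return false;
}

/* Stand-in source for experiments: fails fail_first times, then succeeds. */
struct flaky {
	unsigned int fail_first;
	unsigned int calls;
};

static bool flaky_try(unsigned long *v, void *ctx)
{
	struct flaky *f = ctx;

	f->calls++;
	if (f->calls <= f->fail_first)
		return false;
	*v = 0x1234UL;	/* fixed placeholder value, not random */
	return true;
}
```

A source that fails three times and then recovers is absorbed by the
loop, and the caller sees success on the fourth call. A source that
never recovers is reported as a failure after exactly ten calls, which
is the case the caller still has to handle.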