RE: [PATCH 1/2] x86/random: Retry on RDSEED failure

From: H. Peter Anvin
Date: Tue Jan 30 2024 - 10:24:01 EST


On January 30, 2024 5:10:20 AM PST, "Reshetova, Elena" <elena.reshetova@xxxxxxxxx> wrote:
>
>> Hi Kirill,
>>
>> I've been following the other discussion closely thinking about the
>> matter, but I suppose I'll jump in here directly on this patch, if
>> this is the approach the discussion is congealing around.
>>
>> A comment below:
>>
>> On Tue, Jan 30, 2024 at 9:30 AM Kirill A. Shutemov
>> <kirill.shutemov@xxxxxxxxxxxxxxx> wrote:
>> > static inline bool __must_check rdseed_long(unsigned long *v)
>> > {
>> > + unsigned int retry = RDRAND_RETRY_LOOPS;
>> > bool ok;
>> > - asm volatile("rdseed %[out]"
>> > - CC_SET(c)
>> > - : CC_OUT(c) (ok), [out] "=r" (*v));
>> > - return ok;
>> > +
>> > + do {
>> > + asm volatile("rdseed %[out]"
>> > + CC_SET(c)
>> > + : CC_OUT(c) (ok), [out] "=r" (*v));
>> > +
>> > + if (ok)
>> > + return true;
>> > + } while (--retry);
>> > +
>> > + return false;
>> > }
>>
>> So, my understanding of RDRAND vs RDSEED -- deliberately leaving out
>> any cryptographic discussion here -- is roughly that RDRAND will
>> expand the seed material for longer, while RDSEED will mostly always
>> try to sample more bits from the environment. AES is fast, while
>> sampling is slow, so RDRAND gives better performance and is less
>> likely to fail, whereas RDSEED always has to wait on the hardware to
>> collect some bits, so is more likely to fail.
>
>The internals of Intel DRBG behind RDRAND/RDSEED has been publicly
>documented, so the structure is no secret. Please see [1] for overall
>structure and other aspects. So, yes, your overall understanding is correct
>(there are many more details though).
>
>[1] https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-random-number-generator-drng-software-implementation-guide.html
>
>
>>
>> For that reason, most of the usage of RDRAND and RDSEED inside of
>> random.c is something to the tune of `if (!rdseed(out)) rdrand(out);`,
>> first trying RDSEED but falling back to RDRAND if it's busy. That
>> still seems to me like a reasonable approach, which this patch would
>> partly undermine (in concert with the next patch, which I'll comment
>> on in a follow up email there).
>
>I agree that for the purpose of extracting entropy for Linux RNG falling
>back to RDRAND (current behavior) is perfectly ok, so I think you are doing
>the right thing. However, in principle it is not always the case, there are
>situations when a fallback to RDRAND should not be used, but it is also
>true that the user of this interface should know/understand this situation.
>
>>
>> So maybe this patch #1 (of 2) can be dropped?
>
>Before we start debating this patchset, what is your opinion on the original
>problem we raised for CoCo VMs when both RDRAND/RDSEED are made to
>fail deliberately?
>
>Best Regards,
>Elena.
>
>

I have a real concern with this. We already have the option to let the entropy pool fill before the boot can proceed. This would have the risk of massively increasing the interrupt latency for what will be retried anyway.