Re: [RFC] Randomness on confidential computing platforms

From: H. Peter Anvin
Date: Mon Jan 29 2024 - 16:39:15 EST


On January 29, 2024 1:17:07 PM PST, "H. Peter Anvin" <hpa@xxxxxxxxx> wrote:
>On January 29, 2024 1:04:23 PM PST, Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
>>On 1/29/24 12:26, Kirill A. Shutemov wrote:
>>>>> Do we care?
>>>> I want to make sure I understand the scenario:
>>>>
>>>> 1. We're running in a guest under TDX (or SEV-SNP)
>>>> 2. The VMM (or somebody) is attacking the guest by eating all the
>>>> hardware entropy and RDRAND is effectively busted
>>>> 3. Assuming kernel-based panic_on_warn and WARN_ON() rdrand_long()
>>>> failure, that rdrand_long() never gets called.
>>> Never gets called during attack. It can be used before and after.
>>>
>>>> 4. Userspace is using RDRAND output in some critical place like key
>>>> generation and is not checking it for failure, nor mixing it with
>>>> entropy from any other source
>>>> 5. Userspace uses the failed RDRAND output to generate a key
>>>> 6. Someone exploits the horrible key
>>>>
>>>> Is that it?
>>> Yes.
>>
>>Is there something that fundamentally makes this a VMM vs. TDX guest
>>problem? If a malicious VMM can exhaust RDRAND, why can't malicious
>>userspace do the same?
>>
>>Let's assume buggy userspace exists. Is that userspace *uniquely*
>>exposed to a naughty VMM or is that VMM just added to the list of things
>>that can attack buggy userspace?
>
>The concern, I believe, is that a TDX guest is vulnerable as a *victim*, especially if the OS is being malicious.
>
>However, as you say, a malicious user space, including a conventional VM, could try to use it to attack another. The only thing we can do in the kernel about that is to be resilient.
>
>Note that there is a kernel option to suspend boot until enough entropy has been gathered that predicting the output of the kernel's entropy pool ought to be equivalent to breaking AES (in which case we have far worse problems). To harden the VM case in general, perhaps we should consider RDRAND to have zero entropy credit when used as a fallback for RDSEED.
>

It is probably worth pointing out, too, that in reality the specs for RDRAND/RDSEED are *extremely* sandbagged. The architect told me that it is extremely unlikely that we will *ever* see a failure due to exhaustion, even if the instructions are executed continuously on all cores: the randomness production rate exceeds the bandwidth of the bus in the uncore.
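
For reference, the userspace bug in steps 4 and 5 of the scenario quoted above is consuming RDRAND output without checking for failure and without mixing in any other source. A minimal sketch of the defensive pattern follows; it is an illustration, not code from any existing project. The names hw_rand_fn, hw_rand_retry, make_seed, HW_RAND_RETRIES and the two stub sources are hypothetical, and the retry count of 10 is an assumption. The hardware source is passed in as a function pointer so the failure path can be exercised without real hardware; the RDRAND wrapper is only built when the compiler targets x86-64 with RDRAND enabled (e.g. -mrdrnd).

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/random.h>   /* getrandom(), Linux/glibc */

#if defined(__x86_64__) && defined(__RDRND__)
#include <immintrin.h>
#endif

/* Hypothetical interface: returns true and fills *out on success,
 * returns false (leaving *out untouched) on failure. */
typedef bool (*hw_rand_fn)(uint64_t *out);

/* Assumed retry bound; pick whatever your policy requires. */
#define HW_RAND_RETRIES 10

#if defined(__x86_64__) && defined(__RDRND__)
/* Real RDRAND source: _rdrand64_step() returns 1 only if CF was set. */
bool rdrand_source(uint64_t *out)
{
    unsigned long long v;

    if (_rdrand64_step(&v) != 1)
        return false;
    *out = v;
    return true;
}
#endif

/* Stub sources, for exercising both paths without hardware. */
bool stub_fail(uint64_t *out) { (void)out; return false; }
bool stub_ok(uint64_t *out)   { *out = 0x0123456789abcdefULL; return true; }

/* Retry a bounded number of times; never report success on failure. */
bool hw_rand_retry(hw_rand_fn src, uint64_t *out)
{
    for (int i = 0; i < HW_RAND_RETRIES; i++) {
        if (src(out))
            return true;
    }
    return false;
}

/* Fail closed: produce a seed only if BOTH the hardware source and the
 * kernel's getrandom() succeed, and XOR the two together so that the
 * result does not depend on the hardware source alone. */
bool make_seed(hw_rand_fn src, uint64_t *seed)
{
    uint64_t hw, os;

    if (!hw_rand_retry(src, &hw))
        return false;
    if (getrandom(&os, sizeof(os), 0) != (ssize_t)sizeof(os))
        return false;
    *seed = hw ^ os;
    return true;
}
```

A caller would do something like make_seed(rdrand_source, &seed) and refuse to generate a key if it returns false. Simply calling getrandom() and leaving RDRAND to the kernel is the simpler choice for most programs.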