RE: [PATCH 1/2] x86/random: Retry on RDSEED failure

From: Reshetova, Elena
Date: Mon Feb 12 2024 - 03:25:46 EST


> Hi Kirill,
>
> On Sat, Feb 3, 2024 at 11:12 AM Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
> > Yea, actually, I had a pretty similar idea for something like that
> > that's very non-invasive, where none of this even touches the RDRAND
> > core code, much less random.c. Specifically, we consider "adding some
> > extra RDRAND to the pool" like any other driver that wants to add some
> > of its own seeds to the pool, with add_device_randomness(), a call that
> > lives in various driver code, doesn't influence any entropy readiness
> > aspects of random.c, and can safely be sprinkled in any device or
> > platform driver.
> >
> > Specifically what I'm thinking about is something like:
> >
> > void coco_main_boottime_init_function_somewhere_deep_in_arch_code(void)
> > {
> > 	// [...]
> > 	// bring up primary CoCo nuts
> > 	// [...]
> >
> > 	/* CoCo requires an explicit RDRAND seed, because the host can make the
> > 	 * rest of the system deterministic.
> > 	 */
> > 	unsigned long seed[32 / sizeof(long)];
> > 	size_t i, longs;
> > 	for (i = 0; i < ARRAY_SIZE(seed); i += longs) {
> > 		longs = arch_get_random_longs(&seed[i], ARRAY_SIZE(seed) - i);
> > 		/* If RDRAND is being DoS'd, panic, because we can't ensure
> > 		 * confidentiality.
> > 		 */
> > 		BUG_ON(!longs);
> > 	}
> > 	add_device_randomness(seed, sizeof(seed));
> > 	memzero_explicit(seed, sizeof(seed));
> >
> > 	// [...]
> > 	// do other CoCo things
> > 	// [...]
> > }
> >
> > I would have no objection to the CoCo people adding something like this
> > and would give it my Ack, but more importantly, my Ack for that doesn't
> > even matter, because add_device_randomness() is pretty innocuous.
> >
> > So Kirill, if nobody else here objects to that approach, and you want to
> > implement it in some super minimal way like that, that would be fine
> > with me. Or maybe we want to wait for that internal inquiry at Intel to
> > return some answers first. But either way, this might be an easy
> > approach that doesn't add too much complexity.
>
> I went ahead and implemented this just to have something concrete out there:
> https://lore.kernel.org/all/20240209164946.4164052-1-Jason@xxxxxxxxx/
>
> I probably screwed up some x86 platform conventions/details, but
> that's the general idea I had in mind.
>

Thank you Jason!
I would like to bring up another idea for discussion, which Peter Anvin
proposed in our internal discussions. Conceptually I like it better than
any of the options we have discussed so far, since it is much more generic.

What if, instead of giving rdrand/rdseed special treatment, we try to fix
the underlying problem: the Linux RNG does not support the CoCo threat
model. The Linux RNG has an almost set-in-stone definition of which sources
contribute entropy and which don't (with some additional flexibility through
flags like trust_cpu). This works well for the current fixed threat model,
but doesn't work for CoCo, because some sources are suddenly no longer
trusted to contribute entropy. However, some are still trusted, and that is
not just rdrand/rdseed: we would also trust add_hwgenerator_randomness()
(given that we use a TEE IO device here or have some other way to get this
input securely). So, even in the theoretical scenario where both rdrand and
rdseed are broken (let's say a HW failure), the guest can actually boot with
a securely seeded Linux RNG if we have enough entropy from
add_hwgenerator_randomness().

So the change would be to add the notion of conditional entropy counting
(we would still always mix in the input as we do now, because it won't
hurt), which would automatically give us the correct behavior in
_credit_init_bits() for the initial seeding of the crng. We would also need
a generic way to stop the boot if the entropy is not increasing (for any
reason), to prevent booting with an insecurely seeded crng.
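
To make the idea a bit more concrete, here is a rough userspace sketch in C.
It is only a toy model and not a patch against random.c: the source enum,
trusted_mask, mix_and_credit(), rng_must_halt_boot() and the 256-bit
readiness threshold are all hypothetical names and values picked for
illustration, and the pool mixing is a placeholder, not the real input pool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical source identifiers; random.c has no such enum today. */
enum entropy_source {
	SRC_ARCH_RNG,		/* rdrand/rdseed */
	SRC_HWGENERATOR,	/* add_hwgenerator_randomness() */
	SRC_INTERRUPT,		/* interrupt timing, host-observable in CoCo */
	SRC_BOOTLOADER,		/* seed handed over by the host/bootloader */
	SRC_COUNT
};

#define POOL_READY_BITS 256u	/* assumed threshold for "crng seeded" */

struct rng_model {
	uint32_t trusted_mask;	/* bit n set: source n may be credited */
	unsigned int init_bits;	/* credited bits, capped at POOL_READY_BITS */
	uint64_t pool;		/* toy stand-in for the input pool */
};

/* Always mix the input; credit entropy only if the source is trusted. */
static void mix_and_credit(struct rng_model *m, enum entropy_source src,
			   const void *buf, size_t len, unsigned int bits)
{
	const unsigned char *p = buf;
	unsigned int sum;
	size_t i;

	assert(src < SRC_COUNT);

	/* Placeholder mixing (FNV-1a style), NOT cryptographically sound. */
	for (i = 0; i < len; i++)
		m->pool = (m->pool ^ p[i]) * 0x100000001b3ULL;

	if (!(m->trusted_mask & (1u << src)))
		return;		/* untrusted: mixed in, but zero credit */

	sum = m->init_bits + bits;
	m->init_bits = sum > POOL_READY_BITS ? POOL_READY_BITS : sum;
}

static bool rng_is_ready(const struct rng_model *m)
{
	return m->init_bits >= POOL_READY_BITS;
}

/* True if the boot has to stop: deadline passed and still not seeded. */
static bool rng_must_halt_boot(const struct rng_model *m,
			       unsigned int elapsed_ms,
			       unsigned int deadline_ms)
{
	return !rng_is_ready(m) && elapsed_ms >= deadline_ms;
}

/* Usage example: a CoCo guest trusting only arch RNG and hwgenerator. */
static int demo_coco_boot(void)
{
	struct rng_model m = {
		.trusted_mask = (1u << SRC_ARCH_RNG) | (1u << SRC_HWGENERATOR),
		.pool = 0xcbf29ce484222325ULL,
	};
	const unsigned char sample[32] = { 0 };

	/* Host-controlled input is mixed in but never credited. */
	mix_and_credit(&m, SRC_INTERRUPT, sample, sizeof(sample), 256);
	if (rng_is_ready(&m))
		return -1;

	/* With only untrusted input, a passed deadline means "stop". */
	if (!rng_must_halt_boot(&m, 5000, 1000))
		return -1;

	/* Trusted hwgenerator input is credited and seeds the model. */
	mix_and_credit(&m, SRC_HWGENERATOR, sample, sizeof(sample), 256);
	return rng_is_ready(&m) ? 0 : -1;
}
```

In this model a non-CoCo system would simply set every bit in trusted_mask
and get behavior equivalent to unconditional crediting, while a CoCo guest
would narrow the mask. How the deadline is chosen, and whether "stop" means
panic or something softer, is left open here.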

I do understand that this would be a much bigger change than anything we
have discussed so far, but conceptually it sounds right to be able to say
at runtime which sources of entropy one trusts (probably applicable beyond
CoCo in the future as well) and what action to take when we cannot collect
entropy from these sources.

What does everyone think?

Best Regards,
Elena.