Re: [PATCH] random: allow writes to /dev/urandom to influence fast init

From: Jason A. Donenfeld
Date: Tue Mar 22 2022 - 19:55:09 EST


Hey Linus,

On Tue, Mar 22, 2022 at 2:42 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Tue, Mar 22, 2022 at 12:15 PM Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
> >
> > @@ -1507,6 +1507,8 @@ static int write_pool(const char __user *ubuf, size_t count)
> > }
> > count -= len;
> > ubuf += len;
> > + if (unlikely(crng_init == 0 && !will_credit))
> > + crng_pre_init_inject(block, len, false);
> > mix_pool_bytes(block, len);
> > cond_resched();
> > }
>
> Ugh. I hate that whole crng_pre_init_inject() dance.

Yea it's not great, but it's a helluva lot better than what was there
before 5.18, when there were two different ways of mixing into the
crng key, both of which were insecure. At least now it's a hash
function. As I mentioned in that pull request, I tried to shore up the
existing design as much as I could without departing from it in any
large fundamental ways. It may well be time to revisit the pre init
situation though.

Maybe it'd help if I described the goals of the current approach and
how it attempts to accomplish them.

The chacha key that's used for random bytes is separate from the input
pool, which is where entropy gets dumped. The reason for that
separation is that it's important that we decide when and under
what conditions to extract from the input pool into a new chacha key.
If we do it too often or prematurely, then we risk updating the chacha
key with very few new bits of entropy. If there aren't enough new
bits, and an attacker knows the old state (say, because the system
just booted), then it's trivial to brute-force those new bits, since
read access to /dev/urandom is unprivileged. That's very bad.

For that reason, we only reseed once we've collected 256 bits of
entropy, and then on a schedule: an interval of uptime/2 for the
first 10 minutes, and every 5 minutes after that.
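That schedule can be sketched roughly as follows; the names and the
exact cap are illustrative, not the kernel's actual symbols:

```c
/* Sketch of the reseed schedule described above: a seed counts as
 * old after uptime/2 seconds during early boot, and after 5 minutes
 * once the system has been up long enough. Illustrative only. */
#include <stdbool.h>

#define RESEED_INTERVAL_SEC (5 * 60)

/* Is a seed taken at seed_time (seconds of uptime) old at `uptime`? */
static bool seed_is_old(unsigned int seed_time, unsigned int uptime)
{
	unsigned int interval = RESEED_INTERVAL_SEC;

	/* Early in boot, reseed more aggressively: every uptime/2
	 * seconds rather than every 5 minutes. */
	if (uptime < 2 * RESEED_INTERVAL_SEC)
		interval = uptime / 2;

	return uptime - seed_time >= interval;
}
```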

The question, here, is what to do before we've collected 256 bits of
entropy. 80 bits is still better than 0, for example. The existing
scheme (from Ted) is that we maintain a state variable, crng_init.
When crng_init=0, we mix all entropy into the chacha key directly.
If that entropy is creditable, then we account for up to 64 bytes of
input total, regardless of how many "bits" of entropy we would
otherwise be crediting were it not the pre-init phase. That's kinda weird,
but the motivating mantra has always been, "fast init is garbage, but
it's better than nothing." Then, once we hit 64 bytes of fast init
accounting, we transition to crng_init=1. At this stage, we're putting
bytes into the normal input pool and crediting it as usual, not
updating the crng key directly anymore with injection, and the crng is
still blocked. When we hit 256 bits of accounted entropy from the
crng_init=1 stage, then we're ready to go, and we transition to
crng_init=2.
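The progression above, stripped of the actual mixing, amounts to a
small state machine. A minimal sketch, with illustrative names and
counters that don't match the kernel's actual symbols:

```c
/* Sketch of the three-stage crng_init progression described above.
 * Stage 0: count up to 64 raw bytes of creditable input, ignoring
 * the bit estimate. Stage 1: normal bit accounting until 256 bits.
 * Illustrative only. */
#include <stddef.h>

enum { CRNG_EMPTY = 0, CRNG_EARLY = 1, CRNG_READY = 2 };

static int crng_init = CRNG_EMPTY;
static size_t fast_init_bytes;   /* raw bytes seen while crng_init == 0 */
static size_t pool_entropy_bits; /* credited bits while crng_init == 1 */

/* Account creditable input and advance the state machine. */
static void credit_entropy(size_t nbytes, size_t nbits)
{
	if (crng_init == CRNG_EMPTY) {
		/* Fast init: raw byte count only, then move on. */
		fast_init_bytes += nbytes;
		if (fast_init_bytes >= 64)
			crng_init = CRNG_EARLY;
	} else if (crng_init == CRNG_EARLY) {
		/* Normal accounting: 256 credited bits = ready. */
		pool_entropy_bits += nbits;
		if (pool_entropy_bits >= 256)
			crng_init = CRNG_READY;
	}
}
```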

So this patch helps with the crng_init=0 case. It doesn't address the
crng_init=1 case, when the bytes written to /dev/urandom won't be
directly injected as they are for crng_init=0. In any case, the one
thing we're trying to preserve in all of this is that when we do
transition to crng_init=2, it's with a pool that contains 256 bits of
entropy that haven't previously been exposed anywhere, so that they
can't be brute-forced.

With that as background, to answer your question:

> Maybe I'm missing something. But it seems kind of silly to use
> base_crng AT ALL before crng_ready(). Why not use the pool we have
> that *is* actually updated (that 'input_pool')?

In this case, base_crng.key is really just doubling as a separate
pool. The "pre init pool", the "fast init pool", the "super terrible
but at least not zero pool", the "all bets are off pool", ... whatever
you want to call it. Why a separate pool for pre init? Because the
real input pool needs to accumulate 256 bits of entropy before it's
safe to use.

Your suggestion is to instead not have a separate pool, but perhaps
just do separate accounting. That might work to some degree, but the
devil is in the details, and that sounds a lot harder and messier to
code. For example, you wrote:

> + crng_fast_key_erasure(input_pool.key, chacha_state,
> + random_data, random_data_len);

Except there is no input_pool.key, because the input_pool's state is
actually an un-finalized hash function. So what you wind up with
instead is calling extract_entropy() on every read. But then you have
to decide at what point you stop doing that, so the pool can
accumulate 256 bits prior to exposure, and things quickly get
extremely messy. So I don't think your suggestion improves much.

The long term solution to this stuff might be doing away with the
entropy accounting entirely, as I suggested in the other thread. The
shorter term solution, though, might be reimagining how the
crng_init=1/2 states work to begin with. (And the shortest-term "fix"
is this patch, which while not perfect is at least something.)

Just sort of spitballing what the shorter term solution might be, we
could do something exponential, where we get rid of the
pre_init_inject stuff entirely, but call `crng_reseed(force=true)` at
1 bit, 2 bits, 4 bits, 8 bits, 16 bits, 32 bits, ... 256 bits, the
same way we do now with the time. (This is somewhat similar to what NT
does, fwiw.) A downside is that the gap between, say, 64 bits and 128
bits is much "longer" than what we have now with pre_init_inject.
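The threshold-crossing part of that spitball could look something
like this; the function name and bookkeeping are hypothetical, not a
proposal for actual kernel code:

```c
/* Sketch of the exponential idea above: force a reseed each time
 * the credited entropy estimate crosses the next power-of-two
 * threshold (1, 2, 4, ..., 256 bits). Illustrative only. */
#include <stdbool.h>

static unsigned int next_reseed_bits = 1;

/* Called whenever credited entropy grows; returns true when a
 * forced reseed should happen, doubling the threshold each time. */
static bool should_force_reseed(unsigned int total_bits)
{
	if (next_reseed_bits > 256 || total_bits < next_reseed_bits)
		return false;
	next_reseed_bits *= 2;
	return true;
}
```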

While the above might be an improvement, it doesn't help with our
/dev/urandom problem. Perhaps for that we could have some
messy heuristic, like `if (capable(CAP_SYS_ADMIN) && !crng_ready())
crng_reseed(force=true);`. That's sort of ugly too, but it would help
with the /dev/urandom shellscript seeders.

I'll think on it some more, but these two spitball ideas together
might result in a nice simplification by eliminating the fast pool
entirely. Happy to hear more ideas from you too if the above inspires
anything.

Jason