Re: [RFC PATCH 0/2] Propagating reseed notifications to user space

From: Alexander Graf
Date: Mon Sep 18 2023 - 04:33:39 EST


Hey Yann!

On 17.09.23 15:34, Yann Droneaud wrote:

Hi,

On 23/08/2023 at 11:01, Babis Chalios wrote:
User space often implements PRNGs that use /dev/random as their entropy
source. We cannot expect these randomness sources to stay completely
unknown forever. For various reasons, the originating PRNG seed may
become known, at which point the PRNG becomes insecure for further random
number generation. Events that can lead to this include, for example, fast
computers reversing the PRNG function using a number of inputs, or
Virtual Machine clones, which carry seed values into their clones.

During LPC 2022 Jason, Alex, Michael and I brainstormed on how to
atomically expose a notification to user space that it should reseed.
Atomicity is key for the VM clone case. This patchset implements a
potential path to do so.

This patchset introduces an epoch value as the means of communicating to
the guest the need to reseed. The epoch is a 32-bit value with the
following form:

               RNG epoch
*-------------*---------------------*
| notifier id | epoch counter value |
*-------------*---------------------*
      8 bits           24 bits
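
For illustration, the layout above could be packed and unpacked as in the
sketch below. The macro and function names are made up for this example and
are not taken from the patch; only the 8/24 bit split, with the notifier id
in the upper bits, comes from the description.

```c
#include <stdint.h>

/* Illustrative names only. Layout: notifier id in the upper 8 bits,
 * epoch counter in the lower 24 bits. */
#define EPOCH_COUNTER_BITS 24u
#define EPOCH_COUNTER_MASK ((UINT32_C(1) << EPOCH_COUNTER_BITS) - 1)

/* Combine a notifier id and a counter into one 32-bit epoch value.
 * Counter bits above bit 23 are discarded. */
static inline uint32_t epoch_pack(uint8_t notifier_id, uint32_t counter)
{
	return ((uint32_t)notifier_id << EPOCH_COUNTER_BITS) |
	       (counter & EPOCH_COUNTER_MASK);
}

/* Extract the notifier id from an epoch value. */
static inline uint8_t epoch_notifier_id(uint32_t epoch)
{
	return (uint8_t)(epoch >> EPOCH_COUNTER_BITS);
}

/* Extract the 24-bit counter from an epoch value. */
static inline uint32_t epoch_counter(uint32_t epoch)
{
	return epoch & EPOCH_COUNTER_MASK;
}
```

A consumer does not need to decode the value at all; comparing the whole
32-bit word against a cached copy is enough to detect a change.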

Changes in this value signal moments in time that PRNGs need to be
re-seeded. As a result, the intended use of the epoch from user space
PRNGs is to cache the epoch value every time they reseed using kernel
entropy, then check that its value hasn't changed before giving out
random numbers. If the value has changed, the PRNG needs to reseed before
producing any more random bits.
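
The cache-and-compare pattern described above might look roughly like the
following in a user space PRNG. This is a sketch under assumptions: the
struct, the function names and the reseed placeholder are hypothetical, and
a real implementation would fetch kernel entropy (for example with
getrandom(2)) where the comment indicates.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-PRNG state for this example. */
struct user_prng {
	const volatile uint32_t *epoch; /* points at the mapped epoch word */
	uint32_t cached_epoch;          /* epoch observed at the last reseed */
	unsigned long reseeds;          /* bookkeeping for the example only */
};

/* Placeholder reseed: records the epoch, then would pull fresh entropy. */
static void prng_reseed(struct user_prng *p)
{
	/* Read the epoch before fetching entropy, so that a change that
	 * happens during the reseed is still noticed by the next check. */
	p->cached_epoch = *p->epoch;
	/* ... fetch kernel entropy and rekey the PRNG here ... */
	p->reseeds++;
}

/* Call before giving out random numbers. Returns true if a reseed
 * was triggered because the epoch changed. */
static bool prng_check_epoch(struct user_prng *p)
{
	if (*p->epoch == p->cached_epoch)
		return false;
	prng_reseed(p);
	return true;
}
```

In the common case the check is a single load and compare, which is what
keeps the per-call overhead small.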

The API for getting hold of this value is offered through
/dev/(u)random. We introduce a new ioctl for these devices, which
creates an anonymous file descriptor. User processes can call the
ioctl() to get the anon fd and then mmap it to a single page. That page
contains the value of the epoch at offset 0.
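
As a rough sketch of the mapping step, assuming the anonymous fd has already
been obtained from the new ioctl: the ioctl request name and its calling
convention are defined by the patch and deliberately not reproduced here,
and the helper name epoch_map() is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map one page of epoch_fd read-only and return a pointer to the 32-bit
 * epoch at offset 0, or NULL on failure.
 *
 * Assumption: epoch_fd is the anonymous fd returned by the new
 * /dev/(u)random ioctl. Any mmap-able fd works for trying this out.
 */
static const volatile uint32_t *epoch_map(int epoch_fd)
{
	long page_size = sysconf(_SC_PAGESIZE);
	void *addr;

	if (page_size <= 0)
		return NULL;

	/* MAP_SHARED so that later writes to the page become visible. */
	addr = mmap(NULL, (size_t)page_size, PROT_READ, MAP_SHARED,
		    epoch_fd, 0);
	if (addr == MAP_FAILED)
		return NULL;

	return (const volatile uint32_t *)addr;
}
```

The mapping has to stay in place for as long as the PRNG wants to observe
epoch changes.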

Naturally, random.c is the component that maintains the RNG epoch.
During initialization it allocates a single global page which holds the
epoch value. Moreover, it exposes an API to kernel subsystems
(notifiers) which can report events that require PRNG reseeding.
Notifiers register with random.c and receive an 8-bit notifier id (up to
256 subscribers should be enough) and a pointer to the epoch. Notifying
is then equivalent to writing a new epoch value to the epoch address.

Notifiers write epoch values that carry the notifier ID in the upper
8 bits and an increasing counter value in the remaining 24 bits. This
guarantees that two notifiers can never write the same epoch value,
since notifier IDs are unique.

The first patch of this series implements the epoch mechanism. It adds
the logic in random.c to maintain the epoch page and expose the
user space facing API. It also adds the internal API that allows kernel
subsystems to register as notifiers.

From a user space point of view, having to open /dev/random, call ioctl(),
and mmap() is a no-go for a (CS)PRNG embedded in libc for arc4random().


Could you please elaborate on why it's a no-go? With any approach we take, someone somewhere needs to map and expose data that tells user space we are in a new "epoch". With this patch set, you do that explicitly from user space through an fd that you keep open plus an mmap that you keep active. With vgetrandom, the kernel does it implicitly for you.

So with this patch set's approach, the first call to arc4random() would need to establish the epoch mmap and leave it open. After that, epoch handling is (almost) free: it's just a 32-bit value comparison.
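
To make that concrete, here is a rough sketch of the lazy setup. Everything
in it is hypothetical: the names, the setup hook, and the choice to always
request a reseed when no epoch mapping is available are assumptions for
illustration, not what glibc or the patch does. It is also not thread-safe.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook: returns the mapped epoch word, or NULL if the
 * interface is unavailable. A real one would do the ioctl and mmap. */
typedef const volatile uint32_t *(*epoch_setup_fn)(void);

struct epoch_state {
	bool initialized;
	const volatile uint32_t *epoch;
	uint32_t cached;
};

/* Stand-in for the mapped page so the sketch can be exercised. */
static uint32_t demo_epoch_word;
static const volatile uint32_t *demo_setup(void)
{
	return &demo_epoch_word;
}

/* Returns true when the caller must (re)seed before producing output. */
static bool epoch_needs_reseed(struct epoch_state *s, epoch_setup_fn setup)
{
	uint32_t now;

	if (!s->initialized) {
		/* First call: establish the mapping once and keep it. */
		s->epoch = setup();
		s->initialized = true;
		if (s->epoch)
			s->cached = *s->epoch;
		return true; /* nothing seeded yet */
	}
	if (!s->epoch)
		return true; /* assumption: no epoch, be conservative */

	now = *s->epoch; /* fast path: one load, one compare */
	if (now == s->cached)
		return false;
	s->cached = now;
	return true;
}
```

The only extra state is one pointer, one cached word and a flag.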

Are you saying that there is a problem with keeping track of that additional state? As mentioned above, we need to keep track of some state somewhere: Either in the vdso plus kernel page map logic or in the library that consumes epochs.

If this is the problem, maybe the fundamental issue is that arc4random() assumes you always have everything in place to receive randomness without a handle that could go through an open/close (init/destroy) cycle? I suppose you could change that?

I'm biased, as I proposed to expose such a seed epoch value to user space
through getrandom() directly, relying on the vDSO for reasonable performance,
because current glibc's arc4random() is somewhat too slow to be a general
replacement for rand().

See
https://lore.kernel.org/all/cover.1673539719.git.ydroneaud@xxxxxxxxxx/
https://lore.kernel.org/all/ae35afa5b824dc76c5ded98efcabc117e6dd3d70@xxxxxxxxxx/


There are more problems with coupling epochs to the vgetrandom approach: Not everyone wants to, or is able to, use Linux's rng as the sole source of entropy, for various reasons (NIST, FIPS, TLS recommendations to not rely on a single source, real-time requirements, etc.), yet they still require knowledge of epoch changes.

That means we need an alternative path for these applications regardless. May as well start with that :). If we then still conclude that vgetrandom is the best path forward to accelerate access to /dev/urandom in user space, we can just always map this patch set's epoch page into the vDSO range and then make vgetrandom consume it, similar to how a user space library would.

I genuinely don't understand how vgetrandom and this patch set contradict each other.


Alex




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879