Re: [PATCH RFC v1] random: implement getrandom() in vDSO

From: Jason A. Donenfeld
Date: Sat Jul 30 2022 - 19:56:12 EST


Hey Linus,

Thanks a bunch for chiming in. Indeed this whole thing is kind of crazy,
so your input is particularly useful here.

On Sat, Jul 30, 2022 at 08:48:42AM -0700, Linus Torvalds wrote:
> It's just too specialized, and the people who care about performance
> can - and do - do special things anyway.

I followed most of your email, but I just wanted to point out that the
"can" part of this isn't quite right, though the "do" part is.
Specifically, I don't think there's currently a good way for userspace
to do this kind of thing and get the same kind of security guarantees
that the syscall has. They "do" it anyway, though (openssl, libgcrypt,
glibc's arc4random() implementation before I tamed it last week, etc),
and this is somewhat concerning.

So my larger intent is, assuming that people will continue to attempt
such things, to just nip the issue in the bud by providing an actually
safe way for it to be done.

To be clear, I really would rather not do this. I'm not really looking
for more stuff to do, and I don't tend to write (public) code "just
'cuz". My worry is that by /not/ doing it, footguns will proliferate.
The glibc thing was what finally motivated me to want to at least sketch
out a potential action to make this kind of (apparently common) urge of
writing a userspace RNG safer.

(Actually coding it up didn't really take much time, which perhaps
shows: that `if (!len)` check needs to be hoisted out of the inner
block!)

> So I'm really not convinced that this kind of thing is something the
> kernel should work that hard to help.
>
> Your patch fundamentally seems to be about "make it easy to not have
> to care, and still get high performance", but that's _such_ a small
> use-case (the intersection between "don't care" and "really really
> need high performance" would seem to be basically NIL).

So this is "statement (1)" stuff. Namely, userspace apparently wants
faster random numbers. Is this desire justified? Has anybody aside from
Phoronix even benchmarked getrandom() since I did the neat lockless
stuff to it? Is this just for some artificial card shuffling unit tests,
or is generating TLS CBC nonces at scale using getrandom() a real
bottleneck for a real use case?

I'm honestly not quite sure. But I do know that people are building
these userspace RNGs anyway, and will keep building them, and that kind
of worries me.

So either this is a useful thing to have, and people are building it
anyway, so maybe the kernel should get involved. Or it's not a useful
thing to have, BUT people are building it anyway, so maybe the kernel
should [not?] get involved? The latter case is a bit decisionally
hairier.

Anyway, onto the technical feedback:

> And that state allocation in particular looks very random in all the
> wrong ways, with various "if I run out of resources I'll just do a
> system call" things.
>
> Not to mention that I don't think your patch can work anyway, with
> things like "cmpxchg()" not being something you can do in the vdso
> because it might have the kernel instrumentation in it.

Yea this sharding thing is somewhat faulty. In its current inception, it
also falls over during fork, since the cmpxchg pseudo trylock is
dropped, among other problems Andy and I discussed on IRC. Andy also
suggested not doing the allocation inside of the same function. Florian
brought up the difficulty of even determining the CPU number on arm64.
And also that's a good point about instrumentation on cmpxchg.

So, anyway, if I do muster a v2 of this (perhaps just to see the idea
through), the API might split in two to something like:

void *getrandom_allocate_states([inout] size_t *number_of_states, [out] size_t *length_per_state);
ssize_t getrandom(void *state, void *buffer, size_t len, unsigned long flags);

User code will call getrandom_allocate_state(), which will allocate
enough pages to hold *number_of_states, and return the size of each one
in length_per_state and the number actually allocated back in
number_of_states. The result can then be sliced up by that size, and
passed to getrandom(). So glibc or whatever would presumably allocate
one per thread, and handle any reentrancy/locking around it.

Or some other variation on that. I'm sure you hate those function
signatures. Everybody loves to bikeshed APIs, right? There's plenty to
be tweaked. But that's anyhow about where my thinking is for a potential
v2.

Jason