Re: x86/random: Speculation to the rescue

From: Ahmed S. Darwish
Date: Tue Oct 01 2019 - 12:15:19 EST


Hi,

Sorry for the late reply as I'm also on vacation this week :-)

On Sat, Sep 28, 2019 at 04:53:52PM -0700, Linus Torvalds wrote:
> On Sat, Sep 28, 2019 at 3:24 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> >
> > Nicholas presented the idea to (ab)use speculative execution for random
> > number generation years ago at the Real-Time Linux Workshop:
>
> What you describe is just a particularly simple version of the jitter
> entropy. Not very reliable.
>
> But hey, here's a made-up patch. It basically does jitter entropy, but
> it uses a more complex load than the fibonacci LFSR folding: it calls
> "schedule()" in a loop, and it sets up a timer to fire.
>
> And then it mixes in the TSC in that loop.
>
> And to be fairly conservative, it then credits one bit of entropy for
> every timer tick. Not because the timer itself would be all that
> unpredictable, but because the interaction between the timer and the
> loop is going to be pretty damn unpredictable.
>
> Ok, I'm handwaving. But I do claim it really is fairly conservative to
> think that a cycle counter would give one bit of entropy when you time
> over a timer actually happening. The way that loop is written, we do
> guarantee that we'll mix in the TSC value both before and after the
> timer actually happened. We never look at the difference of TSC
> values, because the mixing makes that uninteresting, but the code does
> start out with verifying that "yes, the TSC really is changing rapidly
> enough to be meaningful".
>
> So if we want to do jitter entropy, I'd much rather do something like
> this that actually has a known fairly complex load with timers and
> scheduling.
>
> And even if absolutely no actual other process is running, the timer
> itself is still going to cause perturbations. And the "schedule()"
> call is more complicated than the LFSR is anyway.
>
> It does wait for one second the old way before it starts doing this.
>
> Whatever. I'm entirely convinced this won't make everybody happy
> anyway, but it's _one_ approach to handle the issue.
>
> Ahmed - would you be willing to test this on your problem case (with
> the ext4 optimization re-enabled, of course)?
>
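
For anyone who wants to play with the idea outside the kernel, a rough
userspace analogue of the loop described above could look like the
sketch below (illustrative only, not the in-kernel code): a periodic
timer interrupts a busy loop, the cycle counter is mixed into the
state on every iteration, and one bit is credited per observed tick.

/*
 * Userspace sketch of the "timer vs. loop" jitter idea above.
 * NOT the actual kernel patch; just an illustration.
 */
#include <sched.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/time.h>
#include <x86intrin.h>

static volatile sig_atomic_t ticks;

static void alarm_handler(int sig)
{
        (void)sig;
        ticks++;
}

int main(void)
{
        struct itimerval it = {
                .it_interval = { .tv_usec = 10000 },    /* 10ms period */
                .it_value    = { .tv_usec = 10000 },
        };
        uint64_t state = 0;
        int credit = 0;

        signal(SIGALRM, alarm_handler);
        setitimer(ITIMER_REAL, &it, NULL);

        while (credit < 64) {
                sig_atomic_t seen = ticks;

                /* stand-in for schedule(): any non-trivial work */
                sched_yield();

                /* mix the TSC on every iteration, before and after ticks */
                state = (state << 7) ^ (state >> 57) ^ __rdtsc();

                /* conservatively credit one bit per observed timer tick */
                if (ticks != seen)
                        credit++;
        }

        printf("state %#llx after %d credited bits\n",
               (unsigned long long)state, credit);
        return 0;
}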

So I pulled the patch and the revert of the ext4 revert, as they're
both now merged in master. It of course made the problem go away...

To test the quality of the new jitter code, I added a small patch on
top that disables all other sources of randomness, [1] so that only
the new jitter entropy feeds the pool, and ran quick tests on the
output of getrandom(0).
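
For reference, a sample like this can be collected with a small helper
along these lines (a sketch; any equivalent way of writing getrandom(0)
output to a file works just as well):

/* Dump 500000 bytes of getrandom(0) output into "rand-file". */
#include <errno.h>
#include <stdio.h>
#include <sys/random.h>
#include <sys/types.h>

int main(void)
{
        const size_t total = 500000;
        unsigned char buf[4096];
        size_t done = 0;
        FILE *out = fopen("rand-file", "wb");

        if (!out) {
                perror("fopen");
                return 1;
        }

        while (done < total) {
                size_t want = total - done;
                ssize_t got;

                if (want > sizeof(buf))
                        want = sizeof(buf);

                /* flags = 0: block until the CRNG is initialized */
                got = getrandom(buf, want, 0);
                if (got < 0) {
                        if (errno == EINTR)
                                continue;
                        perror("getrandom");
                        return 1;
                }
                fwrite(buf, 1, (size_t)got, out);
                done += (size_t)got;
        }

        fclose(out);
        return 0;
}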

Using the "ent" tool, [2] also used to test randomness in the Stephen
Müller LRNG paper, on a 500000-byte file, produced the following
results:

$ ent rand-file

Entropy = 7.999625 bits per byte.

Optimum compression would reduce the size of this 500000 byte file
by 0 percent.

Chi square distribution for 500000 samples is 259.43, and randomly
would exceed this value 41.11 percent of the times.

Arithmetic mean value of data bytes is 127.4085 (127.5 = random).

Monte Carlo value for Pi is 3.148476594 (error 0.22 percent).

Serial correlation coefficient is 0.001740 (totally uncorrelated = 0.0).

As can be seen above, everything looks random, and almost all of the
statistical randomness results matched those of the same kernel
without the "jitter + schedule()" patch applied (after getting that
kernel un-stuck).

Thanks!

[1] Nullified add_{device,timer,input,interrupt,disk,.*}_randomness()
[2] http://www.fourmilab.ch/random/

--
Ahmed Darwish