Re: [PATCH] x86/entry/64: randomize kernel stack offset upon syscall

From: Peter Zijlstra
Date: Tue Apr 16 2019 - 08:08:53 EST


On Tue, Apr 16, 2019 at 11:10:16AM +0000, Reshetova, Elena wrote:
> >
> > The kernel can execute millions of syscalls per second, I'm pretty sure
> > there's a statistical attack against:
> >
> > * This is a maximally equidistributed combined Tausworthe generator
> > * based on code from GNU Scientific Library 1.5 (30 Jun 2004)
> > *
> > * lfsr113 version:
> > *
> > * x_n = (s1_n ^ s2_n ^ s3_n ^ s4_n)
> > *
> > * s1_{n+1} = (((s1_n & 4294967294) << 18) ^ (((s1_n << 6) ^ s1_n) >> 13))
> > * s2_{n+1} = (((s2_n & 4294967288) << 2) ^ (((s2_n << 2) ^ s2_n) >> 27))
> > * s3_{n+1} = (((s3_n & 4294967280) << 7) ^ (((s3_n << 13) ^ s3_n) >> 21))
> > * s4_{n+1} = (((s4_n & 4294967168) << 13) ^ (((s4_n << 3) ^ s4_n) >> 12))
> > *
> > * The period of this generator is about 2^113 (see erratum paper).
> >
> > ... which recovers the real PRNG state much faster than the ~60 seconds
> > seeding interval and allows the prediction of the next stack offset?
>
> I hope Theodore can comment on bounds here. How many syscalls do we
> need to issue, assuming that each leaks 5 pseudorandom bits out of the
> 32-bit pseudorandom number produced by the PRNG, before we can predict
> the PRNG output?

So the argument against using TSC directly was that it might be easy to
guess most of the TSC bits in a timing attack. But IIRC there is fairly
solid evidence that the lowest TSC bits are very hard to guess and might
in fact be a very good random source.

So what one could do is, for each invocation, mix the low (2?) bits of
the TSC into a per-cpu/task PRNG state. By always adding some fresh
entropy it would become very hard indeed to predict the outcome, even
for otherwise 'trivial' PRNGs.
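
A rough userspace sketch of that idea, not the patch itself. It reuses
the lfsr113 step from the comment quoted above and XORs two fresh bits
into the state before every step. The names rnd_state, lfsr113_next()
and lfsr113_next_mixed() are made up for illustration, the caller is
assumed to pass in the low TSC bits (e.g. from rdtsc), and the choice of
folding them into the top of s1 is only one possible way to mix.

```c
#include <stdint.h>

/* One Tausworthe component step, matching the lfsr113 comment quoted above. */
#define TAUSWORTHE(s, a, b, c, d) \
	((((s) & (c)) << (d)) ^ ((((s) << (a)) ^ (s)) >> (b)))

/* Illustrative state layout: the four lfsr113 components. */
struct rnd_state {
	uint32_t s1, s2, s3, s4;
};

/* Advance all four components and return x_n = s1 ^ s2 ^ s3 ^ s4. */
static uint32_t lfsr113_next(struct rnd_state *st)
{
	st->s1 = TAUSWORTHE(st->s1,  6U, 13U, 4294967294U, 18U);
	st->s2 = TAUSWORTHE(st->s2,  2U, 27U, 4294967288U,  2U);
	st->s3 = TAUSWORTHE(st->s3, 13U, 21U, 4294967280U,  7U);
	st->s4 = TAUSWORTHE(st->s4,  3U, 12U, 4294967168U, 13U);

	return st->s1 ^ st->s2 ^ st->s3 ^ st->s4;
}

/*
 * Hypothetical helper: fold the low 2 bits of 'fresh' into the state,
 * then step the generator.
 *
 * Bit 0 of s1 is discarded by the step (masked off and shifted out), so
 * the fresh bits are XORed into bits 30-31, where they do influence the
 * next state.
 */
static uint32_t lfsr113_next_mixed(struct rnd_state *st, uint32_t fresh)
{
	st->s1 ^= (fresh & 0x3U) << 30;

	/* lfsr113 requires s1 >= 2; nudge it if the XOR broke that. */
	if (st->s1 < 2U)
		st->s1 += 2U;

	return lfsr113_next(st);
}
```

The caller would keep one rnd_state per cpu or task and mask the return
value down to however many offset bits it wants. Whether 2 bits per
syscall is enough to defeat state recovery is exactly the open question
above; the sketch only shows the mechanics.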