Re: [PATCH 1/1] arm64: syscall: Direct PRNG kstack randomization

From: Arnd Bergmann
Date: Wed Mar 06 2024 - 15:46:36 EST


On Wed, Mar 6, 2024, at 00:33, Kees Cook wrote:
> On Tue, Mar 05, 2024 at 04:18:24PM -0600, Jeremy Linton wrote:
>> The existing arm64 stack randomization uses the kernel rng to acquire
>> 5 bits of address space randomization. This is problematic because it
>> creates non determinism in the syscall path when the rng needs to be
>> generated or reseeded. This shows up as large tail latencies in some
>> benchmarks and directly affects the minimum RT latencies as seen by
>> cyclictest.
>>
>> Other architectures are using timers/cycle counters for this function,
>> which is sketchy from a randomization perspective because it should be
>> possible to estimate this value from knowledge of the syscall return
>> time, and from reading the current value of the timer/counters.

As I commented on the previous version, I don't want to see
a change like this that only addresses one architecture. If you
are convinced that using a cycle counter is a mistake, then we
should make the same change on the other architectures that
currently use a cycle counter.

>> +#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
>> +DEFINE_PER_CPU(struct rnd_state, kstackrng);
>> +
>> +static u16 kstack_rng(void)
>> +{
>> +	u32 rng = prandom_u32_state(this_cpu_ptr(&kstackrng));
>> +
>> +	return rng & 0x1ff;
>> +}
>> +
>> +/* Should we reseed? */
>> +static int kstack_rng_setup(unsigned int cpu)
>> +{
>> +	u32 rng_seed;
>> +
>> +	/* zero should be avoided as a seed */
>> +	do {
>> +		rng_seed = get_random_u32();
>> +	} while (!rng_seed);
>> +	prandom_seed_state(this_cpu_ptr(&kstackrng), rng_seed);
>> +	return 0;
>> +}
>> +
>> +static int kstack_init(void)
>> +{
>> +	int ret;
>> +
>> +	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "arm64/cpuinfo:kstackrandomize",
>> +				kstack_rng_setup, NULL);
>
> This will run initial seeding, but don't we need to reseed this with
> some kind of frequency?

Won't that defeat the purpose of the patch, which was intended
to make the syscall latency more predictable? At least the
simpler approach of reseeding from the kstack_rng() function
itself would have this problem, and deferring the reseed to
another context comes with a separate set of problems.
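
To illustrate the first concern, here is a rough userspace sketch
(not kernel code, and not what the patch does) of what reseeding
from inside the fast path might look like. Every name in it is
hypothetical: sketch_slow_entropy() only stands in for
get_random_u32(), the generator is a plain xorshift32 rather than
the kernel's prandom_u32_state(), and the reseed interval is an
arbitrary number picked for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reseed interval, chosen only for illustration. */
#define SKETCH_RESEED_INTERVAL 1024u

struct sketch_rng {
	uint32_t state;   /* xorshift32 state, must never be zero */
	uint32_t calls;   /* calls since the last (re)seed */
	uint32_t reseeds; /* number of times the slow path was taken */
};

/*
 * Stand-in for get_random_u32(). In the kernel this is the call
 * whose cost can vary; here it is just a cheap nonzero sequence.
 */
static uint32_t sketch_slow_entropy(void)
{
	static uint32_t counter = 0x9e3779b9u;

	counter += 0x9e3779b9u;
	return counter ? counter : 1u;
}

static void sketch_seed(struct sketch_rng *r, uint32_t seed)
{
	assert(seed != 0); /* xorshift32 stays at zero forever */
	r->state = seed;
	r->calls = 0;
}

/* Plain xorshift32 step, swapped in for prandom_u32_state(). */
static uint32_t sketch_xorshift32(struct sketch_rng *r)
{
	uint32_t x = r->state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	r->state = x;
	return x;
}

/*
 * Analogue of kstack_rng() with the reseed folded into the fast
 * path: one call in every SKETCH_RESEED_INTERVAL pays for the
 * slow entropy source before returning its 9-bit value.
 */
static uint16_t sketch_kstack_rng(struct sketch_rng *r)
{
	if (++r->calls >= SKETCH_RESEED_INTERVAL) {
		sketch_seed(r, sketch_slow_entropy());
		r->reseeds++;
	}
	return (uint16_t)(sketch_xorshift32(r) & 0x1ff);
}
```

A caller that seeds once and then calls sketch_kstack_rng() in a
loop takes the slow branch on every 1024th call. That periodic
outlier lands in the syscall path, which is the same kind of
latency tail the patch description complains about.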

Arnd