Re: [PATCH 1/1] arm64: syscall: Direct PRNG kstack randomization

From: Kees Cook
Date: Tue Mar 05 2024 - 18:34:03 EST


On Tue, Mar 05, 2024 at 04:18:24PM -0600, Jeremy Linton wrote:
> The existing arm64 stack randomization uses the kernel RNG to acquire
> 5 bits of address space randomization. This is problematic because it
> creates nondeterminism in the syscall path when the RNG output needs to
> be generated or reseeded. This shows up as large tail latencies in some
> benchmarks and directly affects the minimum RT latencies as seen by
> cyclictest.
>
> Other architectures are using timers/cycle counters for this function,
> which is sketchy from a randomization perspective because it should be
> possible to estimate this value from knowledge of the syscall return
> time, and from reading the current value of the timer/counters.
>
> So, a poor RNG should be better than the cycle counter if it is hard
> to extract the stack offsets sufficiently to be able to detect the
> PRNG's period. Let's downgrade from get_random_u16() to
> prandom_u32_state() under the theory that the danger of someone
> guessing the 1 in 32 per call offset is larger than that of being
> able to extract sufficient history to accurately predict future
> offsets. Further, it should be safer to run with prandom_u32_state()
> than to disable stack randomization for that subset of applications
> where the difference in latency is on the order of ~5X worse.
>
> Reported-by: James Yang <james.yang@xxxxxxx>
> Reported-by: Shiyou Huang <shiyou.huang@xxxxxxx>
> Signed-off-by: Jeremy Linton <jeremy.linton@xxxxxxx>
> ---
> arch/arm64/kernel/syscall.c | 42 ++++++++++++++++++++++++++++++++++++-
> 1 file changed, 41 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> index 9a70d9746b66..33b3ea4adff8 100644
> --- a/arch/arm64/kernel/syscall.c
> +++ b/arch/arm64/kernel/syscall.c
> @@ -5,6 +5,7 @@
> #include <linux/errno.h>
> #include <linux/nospec.h>
> #include <linux/ptrace.h>
> +#include <linux/prandom.h>
> #include <linux/randomize_kstack.h>
> #include <linux/syscalls.h>
>
> @@ -37,6 +38,45 @@ static long __invoke_syscall(struct pt_regs *regs, syscall_fn_t syscall_fn)
> return syscall_fn(regs);
> }
>
> +#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
> +DEFINE_PER_CPU(struct rnd_state, kstackrng);
> +
> +static u16 kstack_rng(void)
> +{
> + u32 rng = prandom_u32_state(this_cpu_ptr(&kstackrng));
> +
> + return rng & 0x1ff;
> +}
> +
> +/* Should we reseed? */
> +static int kstack_rng_setup(unsigned int cpu)
> +{
> + u32 rng_seed;
> +
> + /* zero should be avoided as a seed */
> + do {
> + rng_seed = get_random_u32();
> + } while (!rng_seed);
> + prandom_seed_state(this_cpu_ptr(&kstackrng), rng_seed);
> + return 0;
> +}
> +
> +static int kstack_init(void)
> +{
> + int ret;
> +
> + ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "arm64/cpuinfo:kstackrandomize",
> + kstack_rng_setup, NULL);

This will run the initial seeding, but don't we need to reseed this at
some kind of regular interval?
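
To make the question concrete, here is a rough userspace model of one
way a draw-count-based reseed could look. It is only a sketch under
stated assumptions, not a proposal for the actual patch: the xorshift32
step stands in for prandom_u32_state(), example_seed_source() stands in
for get_random_u32(), and kstack_rng_model, kstack_model_seed(),
kstack_model_next() and KSTACK_RESEED_INTERVAL (including its value) are
all hypothetical names invented for illustration.

```c
#include <stdint.h>

/* Assumed interval: number of draws between reseeds (illustrative only). */
#define KSTACK_RESEED_INTERVAL 4096u

/* Hypothetical per-CPU state: PRNG state plus a reseed countdown. */
struct kstack_rng_model {
	uint32_t state;      /* xorshift32 state, must be nonzero */
	uint32_t draws_left; /* draws remaining before the next reseed */
};

typedef uint32_t (*seed_source_fn)(void);

/* Counts how often the seed source was consulted (for illustration). */
static unsigned int example_seed_calls;

/* Deterministic stand-in for get_random_u32(); NOT a real entropy source. */
static uint32_t example_seed_source(void)
{
	return 0x9e3779b9u * ++example_seed_calls;
}

/* Seed the model, rejecting zero as the quoted patch does. */
static void kstack_model_seed(struct kstack_rng_model *rng,
			      seed_source_fn seed_source)
{
	uint32_t seed;

	do {
		seed = seed_source();
	} while (!seed);

	rng->state = seed;
	rng->draws_left = KSTACK_RESEED_INTERVAL;
}

/* Return a 9-bit value, reseeding once the countdown reaches zero. */
static uint16_t kstack_model_next(struct kstack_rng_model *rng,
				  seed_source_fn seed_source)
{
	/* A zero-initialized model is seeded on its first draw. */
	if (rng->draws_left == 0)
		kstack_model_seed(rng, seed_source);
	rng->draws_left--;

	/* xorshift32 step, a stand-in for prandom_u32_state() */
	rng->state ^= rng->state << 13;
	rng->state ^= rng->state >> 17;
	rng->state ^= rng->state << 5;

	return (uint16_t)(rng->state & 0x1ff);
}
```

In this model the slow seed source is only consulted once every
KSTACK_RESEED_INTERVAL draws, so most calls stay on the cheap path.
Whether a draw counter, a timer, or something else is the right trigger
for the real code is exactly the open question.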

Otherwise, seems fine to me.

--
Kees Cook