Re: [PATCH 5.10 001/452] arm64: Initialize jump labels before setup_machine_fdt()

From: Ard Biesheuvel
Date: Wed Jun 08 2022 - 03:28:49 EST


On Tue, 7 Jun 2022 at 19:28, Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
>
> Hi Greg,
>
> On Tue, Jun 07, 2022 at 06:57:38PM +0200, Greg Kroah-Hartman wrote:
> > From: Stephen Boyd <swboyd@xxxxxxxxxxxx>
> >
> > commit 73e2d827a501d48dceeb5b9b267a4cd283d6b1ae upstream.
> >
> > A static key warning splat appears during early boot on arm64 systems
> > that credit randomness from devicetrees that contain an "rng-seed"
> > property. This is because setup_machine_fdt() is called before
> > jump_label_init() during setup_arch(). Let's swap the order of these two
> > calls so that jump labels are initialized before the devicetree is
> > unflattened and the rng seed is credited.
> >
> > static_key_enable_cpuslocked(): static key '0xffffffe51c6fcfc0' used before call to jump_label_init()
> > WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:166 static_key_enable_cpuslocked+0xb0/0xb8
> > Modules linked in:
> > CPU: 0 PID: 0 Comm: swapper Not tainted 5.18.0+ #224 44b43e377bfc84bc99bb5ab885ff694984ee09ff
> > pstate: 600001c9 (nZCv dAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > pc : static_key_enable_cpuslocked+0xb0/0xb8
> > lr : static_key_enable_cpuslocked+0xb0/0xb8
> > sp : ffffffe51c393cf0
> > x29: ffffffe51c393cf0 x28: 000000008185054c x27: 00000000f1042f10
> > x26: 0000000000000000 x25: 00000000f10302b2 x24: 0000002513200000
> > x23: 0000002513200000 x22: ffffffe51c1c9000 x21: fffffffdfdc00000
> > x20: ffffffe51c2f0831 x19: ffffffe51c6fcfc0 x18: 00000000ffff1020
> > x17: 00000000e1e2ac90 x16: 00000000000000e0 x15: ffffffe51b710708
> > x14: 0000000000000066 x13: 0000000000000018 x12: 0000000000000000
> > x11: 0000000000000000 x10: 00000000ffffffff x9 : 0000000000000000
> > x8 : 0000000000000000 x7 : 61632065726f6665 x6 : 6220646573752027
> > x5 : ffffffe51c641d25 x4 : ffffffe51c13142c x3 : ffff0a00ffffff05
> > x2 : 40000000ffffe003 x1 : 00000000000001c0 x0 : 0000000000000065
> > Call trace:
> > static_key_enable_cpuslocked+0xb0/0xb8
> > static_key_enable+0x2c/0x40
> > crng_set_ready+0x24/0x30
> > execute_in_process_context+0x80/0x90
> > _credit_init_bits+0x100/0x154
> > add_bootloader_randomness+0x64/0x78
> > early_init_dt_scan_chosen+0x140/0x184
> > early_init_dt_scan_nodes+0x28/0x4c
> > early_init_dt_scan+0x40/0x44
> > setup_machine_fdt+0x7c/0x120
> > setup_arch+0x74/0x1d8
> > start_kernel+0x84/0x44c
> > __primary_switched+0xc0/0xc8
> > ---[ end trace 0000000000000000 ]---
> > random: crng init done
> > Machine model: Google Lazor (rev1 - 2) with LTE
> >
> > Cc: Hsin-Yi Wang <hsinyi@xxxxxxxxxxxx>
> > Cc: Douglas Anderson <dianders@xxxxxxxxxxxx>
> > Cc: Ard Biesheuvel <ardb@xxxxxxxxxx>
> > Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> > Cc: Jason A. Donenfeld <Jason@xxxxxxxxx>
> > Cc: Dominik Brodowski <linux@xxxxxxxxxxxxxxxxxxxx>
> > Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
> > Signed-off-by: Stephen Boyd <swboyd@xxxxxxxxxxxx>
> > Reviewed-by: Jason A. Donenfeld <Jason@xxxxxxxxx>
> > Link: https://lore.kernel.org/r/20220602022109.780348-1-swboyd@xxxxxxxxxxxx
> > Signed-off-by: Catalin Marinas <catalin.marinas@xxxxxxx>
> > Signed-off-by: Jason A. Donenfeld <Jason@xxxxxxxxx>
> > Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
>
> Since Jason asked for the fixed commit (f5bda35fba61) to be reverted in
> stable, please don't push this arm64 patch either. Given the risks of
> breakage as on arm32 (it doesn't look like but you never know), I'm
> tempted to revert it from mainline as well if Jason finds a better
> solution for the early crng_reseed() call.
>

So as I understand it, there are basically three options:
0. don't use a static branch in the RNG code
1. use a static branch but don't patch it extremely early
2. fix all the arch code so that it is safe to patch static branches
extremely early.

We dug into the ARM code yesterday, and identified that
RISC-V needs to be fixed as well. In fact, every arch that calls
early_init_dt_scan() from setup_arch() will need to be vetted to
ensure that jump label patching is possible this early.

Jason already proposed an implementation of option 1 here:

https://lore.kernel.org/lkml/20220607100210.683136-1-Jason@xxxxxxxxx/

which seems to me to be the most suitable approach by far, given that
it removes the need to fiddle with very early boot code on many
different architectures. That would also allow the arm64 version of
/this/ patch to be reverted from mainline.
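
To make option 1 concrete, here is a rough user-space model of the idea.
It is not Jason's patch and it uses no kernel APIs: every model_* name
and every flag below is made up for illustration. The assumption is
simply that the seed can be credited at any time, while the static
branch is only patched once jump labels are known to be initialized,
with a later init step catching up on a skipped enable.

```c
/* User-space model only; none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>

static bool jump_labels_ready;    /* models "jump_label_init() has run" */
static bool key_enabled;          /* models the patched static branch */
static bool crng_seeded;          /* plain flag, valid at any point in boot */
static int early_patch_attempts;  /* counts patching before init (the splat case) */

/* Models patching the branch; doing so before init is the warning case. */
static void model_static_key_enable(void)
{
	if (!jump_labels_ready) {
		early_patch_attempts++;
		return;
	}
	key_enabled = true;
}

/* Option 1: record the seed now, patch the branch only when it is safe. */
static void model_credit_seed(void)
{
	crng_seeded = true;
	if (jump_labels_ready)
		model_static_key_enable();
}

/* Fast path once the branch is patched, plain flag otherwise. */
static bool model_crng_ready(void)
{
	return key_enabled || crng_seeded;
}

/* Models the point in setup_arch() where jump labels become usable. */
static void model_jump_label_init(void)
{
	jump_labels_ready = true;
}

/* Late init hook: catch up on an enable that was skipped during early boot. */
static void model_late_init(void)
{
	assert(jump_labels_ready);
	if (crng_seeded && !key_enabled)
		model_static_key_enable();
}
```

Calling model_credit_seed() before model_jump_label_init() leaves the
branch unpatched without ever attempting an early patch, and
model_late_init() enables it afterwards. The arch boot order no longer
matters in this model, which is the property that makes option 1
attractive.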