Re: [PATCH v4 2/2] random: defer fast pool mixing to worker

From: Sebastian Andrzej Siewior
Date: Fri Feb 11 2022 - 09:18:29 EST


On 2022-02-11 09:25:00 [+0100], Dominik Brodowski wrote:
> That's a hw rng (such as a tpm chip or the virtio-rng driver) providing some
> entropy; if it's 64 bytes of input, crng_init progresses to 1, and
> crng_fast_load() should never be called again.[*] I'm a bit suprised that the
> hw_rng input occurred so early (it's only at device_initcall() level), and
> earlier than 64 interrupts. But that may differ from system to system.
>
> Note that crng_fast_load() will also never be called from
> add_interrupt_randomness() if
>
> EFI, DT or kexec provides bootloader entropy of at least 64 bytes,
> and CONFIG_RANDOM_TRUST_BOOTLOADER is set
>
> and/or CONFIG_RANDOM_TRUST_CPU is set and the RDRAND/RDSEED instructions do
> not fail.
>
> If neither of these three conditions (hw_rng is run early, bootloader or CPU
> randomness) are met, the initial and early seeding of the base_crng depends
> on add_interrupt_randomness(), and should happen rather quickly.

Right.

> > I did move that crng_fast_load() into the worker and did made some
> > numbers:
> > <idle>-0 [000] d..h1.. 2.069924: add_interrupt_randomness: Tick
> >
> > first interrupt
> > …
> > swapper/0-1 [000] d..h.11 2.341938: add_interrupt_randomness: Tick
> > swapper/0-1 [000] d..h.11 2.341938: add_interrupt_randomness: work
> >
> > the 64th interrupt, scheduling the worker.
> >
> > swapper/0-1 [000] d..h.11 2.345937: add_interrupt_randomness: Tick
> > swapper/0-1 [000] d..h111 2.349938: add_interrupt_randomness: Tick
> > swapper/0-1 [000] d..h.11 2.353939: add_interrupt_randomness: Tick
> > swapper/0-1 [000] d..h.11 2.357940: add_interrupt_randomness: Tick
> > swapper/0-1 [000] d..h111 2.361939: add_interrupt_randomness: Tick
> > swapper/0-1 [000] d..h111 2.365939: add_interrupt_randomness: Tick
> > swapper/0-1 [000] d..h.11 2.369941: add_interrupt_randomness: Tick
> > kworker/0:0H-6 [000] ....... 2.384714: mix_interrupt_randomness: load
> > kworker/0:0H-6 [000] ....... 2.384715: crng_fast_load: 16
> > <idle>-0 [001] dn.h1.. 3.205766: add_interrupt_randomness: Tick
> > <idle>-0 [019] dn.h1.. 6.771047: add_interrupt_randomness: Tick
> >
> > 7 interrupts got lost before the worker could run & load first 16 bytes.
> > The workqueue core gets initialized at that point and spawns first
> > worker.
>
> So the reason for the longer delay here is that the workqueue core had not
> been initialized beforehand?

Kind of, yes. workqueue_init_early() happens quite early so we don't
have to worry about NULL pointer for system_highpri_wq. The worker
started after workqueue_init() populated the worker pools.

> > After that the interrupts took a break.
> > And then the work-to-load delay was quite low:
> >
> > <idle>-0 [019] dn.h1.. 7.586234: add_interrupt_randomness: Tick
> > <idle>-0 [019] dn.h1.. 7.586234: add_interrupt_randomness: work
> > kworker/19:0H-175 [019] ....... 7.586504: mix_interrupt_randomness: load
> > kworker/19:0H-175 [019] ....... 7.586507: crng_fast_load: 16
> > <idle>-0 [020] dn.h1.. 7.614649: add_interrupt_randomness: Tick
> > <idle>-0 [020] dn.h1.. 7.614651: add_interrupt_randomness: work
> > <idle>-0 [020] dn.h1.. 7.614736: add_interrupt_randomness: Tick
> > kworker/20:0H-183 [020] dn.h... 7.614859: add_interrupt_randomness: Tick
> > kworker/20:0H-183 [020] ....... 7.614871: mix_interrupt_randomness: load
> > kworker/20:0H-183 [020] ....... 7.614872: crng_fast_load: 16
> > <idle>-0 [018] dn.h1.. 8.352423: add_interrupt_randomness: Tick
> > <idle>-0 [018] dn.h1.. 8.352423: add_interrupt_randomness: work
> > kworker/18:0H-167 [018] dn.h1.. 8.352438: add_interrupt_randomness: Tick
> > kworker/18:0H-167 [018] dn.h1.. 8.352448: add_interrupt_randomness: Tick
> > kworker/18:0H-167 [018] dn.h1.. 8.352459: add_interrupt_randomness: Tick
> > kworker/18:0H-167 [018] dn.h1.. 8.352491: add_interrupt_randomness: Tick
> > kworker/18:0H-167 [018] ....... 8.352505: mix_interrupt_randomness: load
> > kworker/18:0H-167 [018] ....... 8.352506: crng_fast_load: 16
> >
> > In total we lost 13 ticks.
>
> Was this still way before the initramfs was up and running?

After unpacked initramfs. From current boot:

| [ 5.901462] Unpacking initramfs...
| [ 6.758747] sd 1:0:0:0: [sda] Attached SCSI disk
| [ 7.886651] Freeing initrd memory: 9532K
| [ 7.893753] Freeing unused kernel image (initmem) memory: 2184K
| [ 7.963519] Write protecting the kernel read-only data: 20480k
| [ 7.971465] Freeing unused kernel image (text/rodata gap) memory: 2032K
| [ 7.979711] Freeing unused kernel image (rodata/data gap) memory: 1980K
| [ 7.987132] Run /init as init process
| Loading, please wait...
|
| Starting version 250.3-2
| [ 8.157529] igb 0000:07:00.0 eno0: renamed from eth0
| [ 8.203836] igb 0000:07:00.1 enp7s0f1: renamed from eth1
| [ 8.219489] random: fast init done
| Begin: Loading essential drivers ... done.
| Begin: Running /scripts/init-premount ... done.
| Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
| Begin: Running /scripts/local-premount ... done.
| Warning: fsck not present, so skipping root file[ 8.337554] XFS (sda2): Mounting V5 Filesystem
| [ 8.392151] XFS (sda2): Ending clean mount
| done.
| Begin: Running /scripts/local-bottom ... done.
| Begin: Running /scripts/init-bottom ... done.
| [ 8.540708] systemd[1]: systemd 250.3-2 running in system mode …
| [ 12.207227] random: crng init done

I have to note that on this system there is a initramfs but all drivers
are built-in (you see sd, net _after_ "Unpacking initramfs" but there
are not drivers to load).

> Thanks,
> Dominik

Sebastian