Re: [PATCH v5] random: clear fast pool, crng, and batches in cpuhp bring up

From: Sebastian Andrzej Siewior
Date: Thu Feb 17 2022 - 12:44:45 EST


On 2022-02-17 13:27:29 [+0100], Jason A. Donenfeld wrote:
> Sebastian - this v5 finally follows your suggestion about what
> operations to do at which phase. The only deviation from your exact
> suggestion is that I'm not checking for MIX_INFLIGHT in the online
> handler, and instead just unconditionally zero it out. I think that's an
> acceptable tradeoff to make for simplicity, and it just means we'll
> accumulate even more entropy, which is fine. Hopefully this is an easy
> ack and has no more pitfalls! -Jason

So I think this is the latest, right?

What do you think about this small comment update? :)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 373af789da7ab..3fb14ac1074c5 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -698,10 +698,6 @@ u32 get_random_u32(void)
EXPORT_SYMBOL(get_random_u32);

#ifdef CONFIG_SMP
-/*
- * This function is called by the cpuhp system, wired up via the large
- * static array in kernel/cpu.c, with the entry CPUHP_RANDOM_PREPARE.
- */
int random_prepare_cpu(unsigned int cpu)
{
/*
@@ -1238,17 +1234,18 @@ static void fast_mix(u32 pool[4])
static DEFINE_PER_CPU(struct fast_pool, irq_randomness);

#ifdef CONFIG_SMP
-/*
- * This function is called by the cpuhp system, wired up via the large
- * static array in kernel/cpu.c, with the entry CPUHP_AP_RANDOM_ONLINE.
- */
int random_online_cpu(unsigned int cpu)
{
/*
- * Set irq randomness count to zero so that new accumulated
- * irqs are fresh, and more importantly, so that its worker
- * is permitted to schedule again when it comes back online,
- * since the MIX_INFLIGHT flag will be cleared.
+	 * This function is invoked while the CPU is coming up, after the
+	 * workqueue subsystem has been fully initialized for this CPU.
+	 *
+	 * During CPU bring up and shutdown we may schedule a worker
+	 * (and set MIX_INFLIGHT) which will run on another CPU because
+	 * workqueues were not yet ready for this CPU. In that case the worker
+	 * will recognize that it runs on the wrong CPU and do nothing.
+	 * Therefore count is set to zero unconditionally, which permits
+	 * scheduling a new worker soon.
*/
per_cpu_ptr(&irq_randomness, cpu)->count = 0;
return 0;


> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -697,6 +697,25 @@ u32 get_random_u32(void)
> }
> EXPORT_SYMBOL(get_random_u32);
>
> +#ifdef CONFIG_SMP
> +/*
> + * This function is called by the cpuhp system, wired up via the large
> + * static array in kernel/cpu.c, with the entry CPUHP_RANDOM_PREPARE.
> + */
> +int random_prepare_cpu(unsigned int cpu)
> +{
> + /*
> + * When the cpu comes back online, immediately invalidate both
> + * the per-cpu crng and all batches, so that we serve fresh
> + * randomness.
> + */
> + per_cpu_ptr(&crngs, cpu)->generation = ULONG_MAX;
> + per_cpu_ptr(&batched_entropy_u32, cpu)->position = UINT_MAX;
> + per_cpu_ptr(&batched_entropy_u64, cpu)->position = UINT_MAX;

This runs before the CPU is up. Could you do the initialisation right
now?
My problem here is that if get_random_u32() is used between
CPUHP_AP_IDLE_DEAD and CPUHP_TEARDOWN_CPU, then the initialisation will
happen on the target CPU with interrupts disabled, and will acquire a
sleeping lock (batched_entropy_u32.lock).

You could perform the initialization cross CPU without the lock because
the CPU itself isn't ready yet. Something like
batch = per_cpu_ptr(&batched_entropy_u32, cpu);
_get_random_bytes(batch->entropy_u32, sizeof(batch->entropy_u32));
batch->position = 0;
batch->generation = next_gen;

should do the job. The same applies to the u64 batch. I haven't figured
out the generation part yet, but I suspect that a sleeping lock is
involved there as well.

I did not check whether this is a problem on PREEMPT_RT yet, but it may
become one later (if we get a user in the hotplug path).

Sebastian