Re: BUG: KCSAN: data-race in add_device_randomness+0x20d/0x290

From: Jason A. Donenfeld
Date: Mon Feb 07 2022 - 16:57:58 EST


On Mon, Feb 7, 2022 at 10:50 PM Jann Horn <jannh@xxxxxxxxxx> wrote:
> > But the "lfsr" variable is never accessed outside the part of this
> > method that holds a global spinlock. So that can't really be it,
> > right?

Oh, hm, yea. Seems likely.

>
> There is a data race in crng_ready(), it just loads from "crng_init"
> without READ_ONCE()... maybe that's what KCSAN is noticing?

There are lots of data races in crng_ready(), which Dominik has been
fixing up gradually (which is why I CC'd him). However, crng_init is 4
bytes, and that crng_init is read first with a cmpl that will read the
whole thing. Maybe KCSAN's reporting is wrong? But it also says 0x00
-> 0x43, which isn't one of the assigned values of crng_init. Also,
0x20d seems quite far into the body of add_device_randomness, whereas
crng_ready() is checked sort of early on.
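
For anyone following along, here is a rough userspace sketch of the difference between the plain load and a marked one. It assumes the check boils down to comparing crng_init against a small constant; READ_ONCE_SKETCH and crng_init_sketch are made-up stand-ins, not the kernel's definitions, and __typeof__ is a GCC/Clang extension:

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace approximation of a marked load: one volatile read of x. */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for crng_init (a 4-byte int in this sketch). */
static int crng_init_sketch;

/* Plain load: the compiler is free to tear, fuse, or repeat it. */
static bool crng_ready_plain(void)
{
	return crng_init_sketch > 1;
}

/* Marked load: the race is still there, but the access is explicit. */
static bool crng_ready_marked(void)
{
	return READ_ONCE_SKETCH(crng_init_sketch) > 1;
}
```

Both variants return the same answer in a single thread; the marking only matters once a concurrent writer is involved.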

Hopefully Paul will send us his vmlinux.

Another possibility is that this is happening on the u8 read of the
*input* buffer, i.e. what release_task() -> __exit_signal() passes in:

    add_device_randomness((const void*) &tsk->se.sum_exec_runtime,
                          sizeof(unsigned long long));

I haven't yet looked at the locking around tsk->se.sum_exec_runtime.
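
To illustrate why a byte-wise reader would make this show up as a one-byte value change, here is a small userspace sketch. It only assumes that the input is consumed one u8 at a time; mix_bytes_sketch and changed_bytes_sketch are hypothetical helpers, and the rotate-and-xor is a placeholder, not the real mixing function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mixer: walks the buffer with one u8 load per byte. */
static uint8_t mix_bytes_sketch(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint8_t acc = 0;

	while (len--)
		acc = (uint8_t)((acc << 1 | acc >> 7) ^ *p++); /* rotate left by 1, then xor */
	return acc;
}

/* Counts how many byte positions differ between two buffers. */
static size_t changed_bytes_sketch(const void *a, const void *b, size_t len)
{
	const uint8_t *pa = a, *pb = b;
	size_t n = 0;

	while (len--)
		n += (*pa++ != *pb++);
	return n;
}
```

If an 8-byte counter goes from 0 to 0x43 while such a loop is running, exactly one of its bytes changes, so a race detector watching single-byte loads would see one byte go from 0x00 to 0x43 even though the writer stored the whole 64-bit value.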