Re: [PATCH v2] x86, random: Fix get_random_bytes() warning in x86 start_kernel

From: Theodore Y. Ts'o
Date: Fri Feb 08 2019 - 12:44:01 EST


On Fri, Feb 08, 2019 at 08:14:45AM -0500, Prarit Bhargava wrote:
>
> Yes, that's exactly the case. During early boot we initialize the boot cpu's
> stack canary at arch/x86/include/asm/stackprotector.h:75 which is well before
> the random driver is initialized. The same code is called for all other cpus,
> so perhaps not calling get_random_bytes() for the boot cpu is another option.

The other thing we can do is to build some method of seeding the
random number generator into the boot loader. This is what OpenBSD
does, and the advantage of solving this problem in that way it also
means that we have good randomness for things like kASLR.

There are two ways this can be done. One is read entropy seed from
some kind of secure store (or at least, trusted store). This couldt t
just be teaching grub to read from /var/local/lib/random-state. (But
if we do this, as soon as possible, we should overwrite random-state
with new randomness the moment the CRNG is initialized.) This is
basically what OpenBSD does.

The other thing that the bootloader might do is to get secure
cryptographic randomness from the boot environment. For example, UEFI
has such a resource that can be tapped.

Either in the bootloader, or in the kernel, we could create stub
device drivers that try to connect to a TPM or some equivalent secure
chip (e.g., such as Gotogle Titan chip on certain Google hardware
platforms) that doesn't require bringing up full the kernel
infrastructure (especially if it's in the kernel, it can't depend on
kmalloc, or workqueues, etc.) This is because it needs to run before
we relocate the kernel (since kASLR needs secure randomness). So
effectively even if it's in the kernel, so it can more easily updated
to support new hardware, these stub drivers needs to be able to
function in a very similar limited set of runtime infrastructure, much
like something that might be in a bootloader.

> Perhaps I'm confusing you by changing the comment. I'm not changing any
> behaviour wrt TSC. The current code "relies" on the TSC as much as it did
> before (all the way back through the git history).

If we are only relying on TSC, we *must* not silence the warning. To
the extent that your commit description talking about silencing
warnings, that's just raised all sorts of red flags. Maybe I
misinterpreted your intent, but talking about relying on TSC is just
the wrong attitude if it's going to silence warnings.

Warnings are not necessarily bad. Warnings can spur people to make
the kernel better, and in the case of the random driver, it may mean
doing work so we can improve the quality of the entropy that we
provide, especially at early boot.

So it's not about silencing warnigs. The warning is a hint that we
should strive to do better. It's much like pain. Pain is gooid; it
tells you that something is wrong. If someone has broken their leg,
the focus should not be on silencing the pain with morphine. Maybe we
do need to do that, but if morphine is given without also setting the
broken leg and fixing the underlying problem that triggered the pain,
the painkillers can be worse the useless.

Sorry for really harping on this, but the philosophical bent which is
implied by the commit description is something really wrong, and it's
especially scary if it drives how we modify the code. Please, let's
focus on improving the quality and the security of the randomness we
provide to the kernel --- not silencing warnings.

Thanks,

- Ted