RE: [PATCH v3] x86/coco: Require seeding RNG with RDRAND on CoCo systems

From: Reshetova, Elena
Date: Wed Feb 21 2024 - 09:35:09 EST



> There are few uses of CoCo that don't rely on working cryptography and
> hence a working RNG. Unfortunately, the CoCo threat model means that the
> VM host cannot be trusted and may actively work against guests to
> extract secrets or manipulate computation. Since a malicious host can
> modify or observe nearly all inputs to guests, the only remaining source
> of entropy for CoCo guests is RDRAND.
>
> If RDRAND is broken -- due to CPU hardware fault -- the RNG as a whole
> is meant to gracefully continue on gathering entropy from other sources,
> but since there aren't other sources on CoCo, this is catastrophic.
> This is mostly a concern at boot time when initially seeding the RNG, as
> after that the consequences of a broken RDRAND are much more
> theoretical.
>
> So, try at boot to seed the RNG using 256 bits of RDRAND output. If this
> fails, panic(). This will also trigger if the system is booted without
> RDRAND, as RDRAND is essential for a safe CoCo boot.
>
> This patch is deliberately written to be "just a CoCo x86 driver
> feature" and not part of the RNG itself. Many device drivers and
> platforms have some desire to contribute something to the RNG, and
> add_device_randomness() is specifically meant for this purpose. Any
> driver can call this with seed data of any quality, or even garbage
> quality, and it can only possibly make the quality of the RNG better or
> have no effect, but can never make it worse. Rather than trying to
> build something into the core of the RNG, this patch interprets the
> particular CoCo issue as just a CoCo issue, and therefore separates this
> all out into driver (well, arch/platform) code.
>
> Cc: Borislav Petkov <bp@xxxxxxxxx>
> Cc: Daniel P. Berrangé <berrange@xxxxxxxxxx>
> Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> Cc: Elena Reshetova <elena.reshetova@xxxxxxxxx>
> Cc: H. Peter Anvin <hpa@xxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> Cc: Theodore Ts'o <tytso@xxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Signed-off-by: Jason A. Donenfeld <Jason@xxxxxxxxx>

Reviewed-by: Elena Reshetova <elena.reshetova@xxxxxxxxx>

> ---
> Changes v2->v3:
> - Remove patch that handled generic RDRAND failures, because that
>   doesn't really have any implication for the RNG, since it's supposed
>   to run fine on systems without RDRAND anyway, and CoCo is a weird
>   special case. If people still want an extra generic RDRAND failure
>   handler, that's standalone anyway, so we can do that disconnected from
>   this patch. No need to make it a series.
> - Update comments and commit message to reflect this.
>
> Changes v1->v2:
> - panic() instead of BUG_ON(), as suggested by Andi Kleen.
> - Update comments, now that we have info from AMD and Intel.
>
> arch/x86/coco/core.c | 36 ++++++++++++++++++++++++++++++++++++
> arch/x86/include/asm/coco.h | 2 ++
> arch/x86/kernel/setup.c | 2 ++
> 3 files changed, 40 insertions(+)
>
> diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
> index eeec9986570e..0a5d59966d6d 100644
> --- a/arch/x86/coco/core.c
> +++ b/arch/x86/coco/core.c
> @@ -3,13 +3,16 @@
> * Confidential Computing Platform Capability checks
> *
> * Copyright (C) 2021 Advanced Micro Devices, Inc.
> + * Copyright (C) 2024 Jason A. Donenfeld <Jason@xxxxxxxxx>. All Rights Reserved.
> *
> * Author: Tom Lendacky <thomas.lendacky@xxxxxxx>
> */
>
> #include <linux/export.h>
> #include <linux/cc_platform.h>
> +#include <linux/random.h>
>
> +#include <asm/archrandom.h>
> #include <asm/coco.h>
> #include <asm/processor.h>
>
> @@ -153,3 +156,36 @@ __init void cc_set_mask(u64 mask)
> {
> cc_mask = mask;
> }
> +
> +__init void cc_random_init(void)
> +{
> +	unsigned long rng_seed[32 / sizeof(long)];
> +	size_t i, longs;
> +
> +	if (cc_vendor == CC_VENDOR_NONE)
> +		return;
> +
> +	/*
> +	 * Since the CoCo threat model includes the host, the only reliable
> +	 * source of entropy that can be neither observed nor manipulated is
> +	 * RDRAND. Usually, RDRAND failure is considered tolerable, but since
> +	 * CoCo guests have no other unobservable source of entropy, it's
> +	 * important to at least ensure the RNG gets some initial random seeds.
> +	 */
> +	for (i = 0; i < ARRAY_SIZE(rng_seed); i += longs) {
> +		longs = arch_get_random_longs(&rng_seed[i], ARRAY_SIZE(rng_seed) - i);
> +
> +		/*
> +		 * A zero return value means that the guest doesn't have RDRAND
> +		 * or the CPU is physically broken, and in both cases that
> +		 * means most crypto inside of the CoCo instance will be
> +		 * broken, defeating the purpose of CoCo in the first place. So
> +		 * just panic here because it's absolutely unsafe to continue
> +		 * executing.
> +		 */
> +		if (longs == 0)
> +			panic("RDRAND is defective.");
> +	}
> +	add_device_randomness(rng_seed, sizeof(rng_seed));
> +	memzero_explicit(rng_seed, sizeof(rng_seed));
> +}
> diff --git a/arch/x86/include/asm/coco.h b/arch/x86/include/asm/coco.h
> index 76c310b19b11..e9d059449885 100644
> --- a/arch/x86/include/asm/coco.h
> +++ b/arch/x86/include/asm/coco.h
> @@ -15,6 +15,7 @@ extern enum cc_vendor cc_vendor;
> void cc_set_mask(u64 mask);
> u64 cc_mkenc(u64 val);
> u64 cc_mkdec(u64 val);
> +void cc_random_init(void);
> #else
> #define cc_vendor (CC_VENDOR_NONE)
>
> @@ -27,6 +28,7 @@ static inline u64 cc_mkdec(u64 val)
> {
> return val;
> }
> +static inline void cc_random_init(void) { }
> #endif
>
> #endif /* _ASM_X86_COCO_H */
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 84201071dfac..30a653cfc7d2 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -36,6 +36,7 @@
> #include <asm/bios_ebda.h>
> #include <asm/bugs.h>
> #include <asm/cacheinfo.h>
> +#include <asm/coco.h>
> #include <asm/cpu.h>
> #include <asm/efi.h>
> #include <asm/gart.h>
> @@ -994,6 +995,7 @@ void __init setup_arch(char **cmdline_p)
> * memory size.
> */
> mem_encrypt_setup_arch();
> + cc_random_init();
>
> efi_fake_memmap();
> efi_find_mirror();
> --
> 2.43.0
>
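
For anyone who wants to poke at the fill loop in cc_random_init() outside
the kernel, here is a minimal userspace sketch of the same control flow.
It is an illustration only, not part of the patch: fill_seed(),
get_longs_fn and the two mock_* callbacks are hypothetical names, the
callback merely stands in for arch_get_random_longs(), and a -1 return
stands in for the panic() path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 256 bits of seed material, expressed in longs as in the patch. */
#define SEED_LONGS (32 / sizeof(unsigned long))

/*
 * Hypothetical stand-in for arch_get_random_longs(): writes up to
 * 'max' longs into 'buf' and returns how many it wrote (0 = failure).
 */
typedef size_t (*get_longs_fn)(unsigned long *buf, size_t max);

/*
 * Fill 'seed' with 'n' longs, tolerating partial progress per call.
 * Returns 0 on success, or -1 at the point where the kernel code
 * would panic() because the source produced nothing.
 */
static int fill_seed(unsigned long *seed, size_t n, get_longs_fn get)
{
	size_t i, longs;

	assert(seed != NULL && get != NULL);

	for (i = 0; i < n; i += longs) {
		longs = get(&seed[i], n - i);
		if (longs == 0)
			return -1;
	}
	return 0;
}

/* Mock source that yields one long per call (partial progress). */
static size_t mock_one_at_a_time(unsigned long *buf, size_t max)
{
	if (max == 0)
		return 0;
	buf[0] = 0xA5A5A5A5UL; /* fixed pattern, NOT random; test data only */
	return 1;
}

/* Mock source that always fails, like a missing or broken RDRAND. */
static size_t mock_broken(unsigned long *buf, size_t max)
{
	(void)buf;
	(void)max;
	return 0;
}
```

With mock_one_at_a_time the loop needs SEED_LONGS iterations and
succeeds; with mock_broken it bails out on the first iteration, which is
the case the patch turns into a panic().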