Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests

From: Kevin Loughlin
Date: Fri Mar 08 2024 - 15:50:22 EST


On Fri, Mar 8, 2024 at 5:31 AM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>
> On Thu, 22 Feb 2024 at 21:25, Kevin Loughlin <kevinloughlin@xxxxxxxxxx> wrote:
> >
> > SEV-SNP requires encrypted memory to be validated before access.
> > Because the ROM memory range is not part of the e820 table, it is not
> > pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
> > to access this range, the guest must first validate the range.
> >
> > The current SEV-SNP code does indeed scan the ROM range during early
> > boot and thus attempts to validate the ROM range in probe_roms().
> > However, this behavior is neither necessary nor sufficient.
> >
> > With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and
> > CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
> > attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> > falls in the ROM range) prior to validation. The specific problematic
> > call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> > in a crash during boot if SEV-SNP is enabled under these conditions.
> >
> > With regards to necessity, SEV-SNP guests currently read garbage (which
> > changes across boots) from the ROM range, meaning these scans are
> > unnecessary. The guest reads garbage because the legacy ROM range
> > is unencrypted data but is accessed via an encrypted PMD during early
> > boot (where the PMD is marked as encrypted due to potentially mapping
> > actually-encrypted data in other PMD-contained ranges).
> >
> > While one solution would be to overhaul the early PMD mapping to treat
> > the ROM region of the PMD as unencrypted, SEV-SNP guests do not rely on
> > data from the legacy ROM region during early boot (nor can they
> > currently, since the data would be garbage that changes across boots).
> > As such, this patch opts for the simpler approach of skipping the ROM
> > range scans (and the otherwise-necessary range validation) during
> > SEV-SNP guest early boot.
> >
> > Ultimatly, the potential SEV-SNP guest crash due to lack of ROM range
> > validation is avoided by simply not accessing the ROM range.
> >
> > Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
> > Signed-off-by: Kevin Loughlin <kevinloughlin@xxxxxxxxxx>
> > ---
> > arch/x86/include/asm/sev.h | 2 --
> > arch/x86/kernel/mpparse.c | 7 +++++++
> > arch/x86/kernel/probe_roms.c | 11 ++++-------
> > arch/x86/kernel/sev.c | 15 ---------------
> > drivers/firmware/dmi_scan.c | 7 ++++++-
> > 5 files changed, 17 insertions(+), 25 deletions(-)
> >
>
> Agree with the analysis and the conclusion. However, this will need to
> be split into generic and x86 specific changes, given that the DMI
> code is shared between all architectures, and explicitly checking for
> SEV-SNP support in generic code is not appropriate.
>
> So what we will need is:
> - a generic change that implements a static inline wrapper around
> IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK), and wires it up
> in drivers/firmware/dmi_scan.c;
> - a x86 specific change that overrides this DMI helper in terms of
> cc_platform_has(CC_ATTR_GUEST_SEV_SNP);
> - x86 specific changes that deal with the other scanning
>
> Note that this means that Oak based platforms will lose DMI reporting
> and DMI-based quirks, but I think this is reasonable.

Agreed. However, upon further review, I think we can get away with
only modifying arch/x86/ code.

Besides the DMI case, all other needed changes are already contained
in arch/x86/, and we can replace the relevant init functions for
SEV-SNP guests with empty stubs as Boris and you mention in our
discussion.

For the DMI case, we can add an x86-init function pointer to
dmi_setup() that defaults to the generic dmi_setup function(), which
would be modified to point to snp_dmi_setup() on SNP-enabled guests
during initialization (where the fallback scan would be skipped for
SNP guests). This way, we would both leave multi-arch code alone and
avoid spreading cc_platform_has() scans around as Boris mentioned.

I plan to implement this behavior in v3 unless you have a preference
for something different.

>
> More feedback below.
>
>
> > diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> > index 5b4a1ce3d368..474c24ba0f6f 100644
> > --- a/arch/x86/include/asm/sev.h
> > +++ b/arch/x86/include/asm/sev.h
> > @@ -203,7 +203,6 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
> > unsigned long npages);
> > void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
> > unsigned long npages);
> > -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
> > void snp_set_memory_shared(unsigned long vaddr, unsigned long npages);
> > void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
> > void snp_set_wakeup_secondary_cpu(void);
> > @@ -227,7 +226,6 @@ static inline void __init
> > early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> > static inline void __init
> > early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> > -static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
> > static inline void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) { }
> > static inline void snp_set_memory_private(unsigned long vaddr, unsigned long npages) { }
> > static inline void snp_set_wakeup_secondary_cpu(void) { }
> > diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
> > index b223922248e9..39ea771e2d4c 100644
> > --- a/arch/x86/kernel/mpparse.c
> > +++ b/arch/x86/kernel/mpparse.c
> > @@ -553,6 +553,13 @@ static int __init smp_scan_config(unsigned long base, unsigned long length)
> > base, base + length - 1);
> > BUILD_BUG_ON(sizeof(*mpf) != 16);
> >
> > + /*
> > + * Skip scan in SEV-SNP guest if it would touch the legacy ROM region,
> > + * as this memory is not pre-validated and would thus cause a crash.
> > + */
> > + if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && base < 0x100000 && base + length >= 0xC0000)
> > + return 0;
> > +
>
> Please don't use magic numbers like this, and use memory_intersects()
> [unless there is a reason to avoid it which I missed]

Yes, memory_intersects() is better, as are macros here. Thanks.

>
> Also, really?!? Does modern x86 still rely on scanning arbitrary
> regions of memory for magic numbers? Or is this only for those who
> prefer vintage boot protocols?
>
> If so, I suppose we might need a generic helper
>
> static inline bool platform_allows_memory_probing(void)
>
> [modulo bikeshedding over the name] where the generic implementation
> returns false, and the x86 implementation could take
> cc_platform_has(CC_ATTR_GUEST_SEV_SNP) into account, and return true
> otherwise.
>
> (On ARM based systems, memory probing is never ok, because the memory
> map is not architected, and so probing random addresses might bring
> down the machine)

Roughly-speaking, the x86 memory probes are generally performed to
support legacy devices/reserved regions/boot sequences that assume
these hardcoded addresses. Given the ability to point probe_roms() and
similar x86_init functions to empty stubs (and the fact that x86_init
functions are, by definition, x86-specific), we should be able to
avoid needing a "platform_allows_memory_probing()" function in these
cases.

As for the DMI probing behavior in dmi_scan_machine(), the probing
only currently occurs if both (a) the config tables are not provided
by EFI [i.e., `efi_enabled(EFI_CONFIG_TABLES)` is false] and (b)
DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set [which is not selected on
ARM, consistent with memory probing on ARM being disallowed]. As such,
DMI_SCAN_MACHINE_NON_EFI_FALLBACK effectively provides the
"platform_allows_memory_probing" functionality for this singular use
case.