Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests

From: Michael Roth
Date: Fri Mar 08 2024 - 15:44:41 EST


On Fri, Mar 08, 2024 at 11:10:43AM -0500, Kevin Loughlin wrote:
> On Mon, Feb 26, 2024 at 2:16 PM Mike Stunes <mike.stunes@xxxxxxxxxxxx> wrote:
> >
> > Hi,
> >
> > > On Feb 22, 2024, at 12:24 PM, Kevin Loughlin <kevinloughlin@xxxxxxxxxx> wrote:
> > >
> > > SEV-SNP requires encrypted memory to be validated before access.
> > > Because the ROM memory range is not part of the e820 table, it is not
> > > pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
> > > to access this range, the guest must first validate the range.
> > >
> > > The current SEV-SNP code does indeed scan the ROM range during early
> > > boot and thus attempts to validate the ROM range in probe_roms().
> > > However, this behavior is neither necessary nor sufficient.
> > >
> > > With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and
> > > CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
> > > attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> > > falls in the ROM range) prior to validation. The specific problematic
> > > call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> > > in a crash during boot if SEV-SNP is enabled under these conditions.
> > >
> > > With regards to necessity, SEV-SNP guests currently read garbage (which
> > > changes across boots) from the ROM range, meaning these scans are
> > > unnecessary. The guest reads garbage because the legacy ROM range
> > > is unencrypted data but is accessed via an encrypted PMD during early
> > > boot (where the PMD is marked as encrypted due to potentially mapping
> > > actually-encrypted data in other PMD-contained ranges).
> > >
> > > While one solution would be to overhaul the early PMD mapping to treat
> > > the ROM region of the PMD as unencrypted, SEV-SNP guests do not rely on
> > > data from the legacy ROM region during early boot (nor can they
> > > currently, since the data would be garbage that changes across boots).
> > > As such, this patch opts for the simpler approach of skipping the ROM
> > > range scans (and the otherwise-necessary range validation) during
> > > SEV-SNP guest early boot.
> > >
> > > Ultimately, the potential SEV-SNP guest crash due to lack of ROM range
> > > validation is avoided by simply not accessing the ROM range.
> > >
> > > Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
> > > Signed-off-by: Kevin Loughlin <kevinloughlin@xxxxxxxxxx>
> > > ---
> > > arch/x86/include/asm/sev.h | 2 --
> > > arch/x86/kernel/mpparse.c | 7 +++++++
> > > arch/x86/kernel/probe_roms.c | 11 ++++-------
> > > arch/x86/kernel/sev.c | 15 ---------------
> > > drivers/firmware/dmi_scan.c | 7 ++++++-
> > > 5 files changed, 17 insertions(+), 25 deletions(-)
> > >
> > > diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> > > index 5b4a1ce3d368..474c24ba0f6f 100644
> > > --- a/arch/x86/include/asm/sev.h
> > > +++ b/arch/x86/include/asm/sev.h
> > > @@ -203,7 +203,6 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
> > > unsigned long npages);
> > > void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
> > > unsigned long npages);
> > > -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
> > > void snp_set_memory_shared(unsigned long vaddr, unsigned long npages);
> > > void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
> > > void snp_set_wakeup_secondary_cpu(void);
> > > @@ -227,7 +226,6 @@ static inline void __init
> > > early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> > > static inline void __init
> > > early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> > > -static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
> > > static inline void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) { }
> > > static inline void snp_set_memory_private(unsigned long vaddr, unsigned long npages) { }
> > > static inline void snp_set_wakeup_secondary_cpu(void) { }
> > > diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
> > > index b223922248e9..39ea771e2d4c 100644
> > > --- a/arch/x86/kernel/mpparse.c
> > > +++ b/arch/x86/kernel/mpparse.c
> > > @@ -553,6 +553,13 @@ static int __init smp_scan_config(unsigned long base, unsigned long length)
> > > base, base + length - 1);
> > > BUILD_BUG_ON(sizeof(*mpf) != 16);
> > >
> > > + /*
> > > + * Skip scan in SEV-SNP guest if it would touch the legacy ROM region,
> > > + * as this memory is not pre-validated and would thus cause a crash.
> > > + */
> > > + if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && base < 0x100000 && base + length >= 0xC0000)
> > > + return 0;
> > > +
> > > while (length > 0) {
> > > bp = early_memremap(base, length);
> > > mpf = (struct mpf_intel *)bp;
> > > diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
> > > index 319fef37d9dc..84ff4b052fc1 100644
> > > --- a/arch/x86/kernel/probe_roms.c
> > > +++ b/arch/x86/kernel/probe_roms.c
> > > @@ -204,14 +204,11 @@ void __init probe_roms(void)
> > > int i;
> > >
> > > /*
> > > - * The ROM memory range is not part of the e820 table and is therefore not
> > > - * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
> > > - * memory, and SNP requires encrypted memory to be validated before access.
> > > - * Do that here.
> > > + * These probes are skipped in SEV-SNP guests because the ROM range
> > > + * is not pre-validated, meaning access would cause a crash.
> > > */
> > > - snp_prep_memory(video_rom_resource.start,
> > > - ((system_rom_resource.end + 1) - video_rom_resource.start),
> > > - SNP_PAGE_STATE_PRIVATE);
> > > + if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> > > + return;
> > >
> > > /* video rom */
> > > upper = adapter_rom_resources[0].start;
> > > diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> > > index c67285824e82..d2362631da91 100644
> > > --- a/arch/x86/kernel/sev.c
> > > +++ b/arch/x86/kernel/sev.c
> > > @@ -774,21 +774,6 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
> > > early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED);
> > > }
> > >
> > > -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
> > > -{
> > > - unsigned long vaddr, npages;
> > > -
> > > - vaddr = (unsigned long)__va(paddr);
> > > - npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
> > > -
> > > - if (op == SNP_PAGE_STATE_PRIVATE)
> > > - early_snp_set_memory_private(vaddr, paddr, npages);
> > > - else if (op == SNP_PAGE_STATE_SHARED)
> > > - early_snp_set_memory_shared(vaddr, paddr, npages);
> > > - else
> > > - WARN(1, "invalid memory op %d\n", op);
> > > -}
> > > -
> > > static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
> > > unsigned long vaddr_end, int op)
> > > {
> > > diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
> > > index 015c95a825d3..22e27087eb5b 100644
> > > --- a/drivers/firmware/dmi_scan.c
> > > +++ b/drivers/firmware/dmi_scan.c
> > > @@ -703,7 +703,12 @@ static void __init dmi_scan_machine(void)
> > > dmi_available = 1;
> > > return;
> > > }
> > > - } else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK)) {
> > > + } else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK) &&
> > > + !cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
> > > + /*
> > > + * This scan is skipped in SEV-SNP guests because the ROM range
> > > + * is not pre-validated, meaning access would cause a crash.
> > > + */
> > > p = dmi_early_remap(SMBIOS_ENTRY_POINT_SCAN_START, 0x10000);
> > > if (p == NULL)
> > > goto error;
> > > --
> > > 2.44.0.rc0.258.g7320e95886-goog
> > >
> > >
> >
> > In addition to these changes, I also had to skip pirq_find_routing_table if SEV-SNP is active.
>
> Thanks. I will update this in v3.

There's also another access a bit later in boot:

static __init int eisa_bus_probe(void)
{
...
ioremap(0x0FFFD9, 4);
}

This time it's via ioremap() with the encryption bit *unset*, so it
won't necessarily cause a crash but it's inconsistent with the early
page table having that region set as encrypted.
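
To make the overlap concrete, here is a minimal userspace sketch (not
kernel code) that checks whether an access range touches the legacy ROM
window. The bounds are taken from the constants in the smp_scan_config()
hunk quoted above; the helper names touches_legacy_rom() and
legacy_rom_selftest() are hypothetical, and the 0xF0000 value for
SMBIOS_ENTRY_POINT_SCAN_START is stated as an assumption. Note it uses a
strict half-open overlap test, which is slightly narrower than the ">="
comparison in the hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed bounds of the legacy ROM window, copied from the constants in
 * the smp_scan_config() hunk (0xC0000 and 0x100000). END is exclusive.
 */
#define LEGACY_ROM_START 0xC0000ULL
#define LEGACY_ROM_END   0x100000ULL

/* Hypothetical helper: true if [base, base + length) overlaps the window. */
static bool touches_legacy_rom(uint64_t base, uint64_t length)
{
	if (length == 0)
		return false;
	return base < LEGACY_ROM_END && base + length > LEGACY_ROM_START;
}

/* Hypothetical self-test covering the accesses mentioned in this thread. */
static void legacy_rom_selftest(void)
{
	/* eisa_bus_probe(): 4 bytes at 0x0FFFD9. */
	assert(touches_legacy_rom(0x0FFFD9, 4));
	/* DMI fallback scan: 0x10000 bytes at 0xF0000 (assumed start value). */
	assert(touches_legacy_rom(0xF0000, 0x10000));
	/* A range ending exactly at the window start does not overlap. */
	assert(!touches_legacy_rom(0xB0000, 0x10000));
	/* Anything at or above 1 MiB is outside the window. */
	assert(!touches_legacy_rom(0x100000, 0x1000));
}
```

So the EISA signature read lands in the same window that the patch is
steering SEV-SNP guests away from.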

We discussed unsetting the encryption bit in the early page table with
security folks, and the general consensus was that *if* any VMM/firmware
ever comes along that does want to make use of the legacy region for any
reason (such as providing DMI/SMBIOS info), it would be safest to require
that they encrypt the data in the region before handing off to the guest
kernel. So it makes sense to patch away unencrypted accesses to the legacy
region so they don't cause problems down the road (like causing an implicit
page state change from private->shared and throwing away data in the region
later in boot).

-Mike