Re: [PATCH v2] x86/kernel: skip ROM range scans and validation for SEV-SNP guests

From: Michael Roth
Date: Fri Mar 08 2024 - 18:01:36 EST


On Fri, Mar 08, 2024 at 04:30:55PM -0500, Kevin Loughlin wrote:
> On Fri, Mar 8, 2024 at 3:44 PM Michael Roth <michael.roth@xxxxxxx> wrote:
> >
> > On Fri, Mar 08, 2024 at 11:10:43AM -0500, Kevin Loughlin wrote:
> > > On Mon, Feb 26, 2024 at 2:16 PM Mike Stunes <mike.stunes@xxxxxxxxxxxx> wrote:
> > > >
> > > > Hi,
> > > >
> > > > > On Feb 22, 2024, at 12:24 PM, Kevin Loughlin <kevinloughlin@xxxxxxxxxx> wrote:
> > > > >
> > > > > SEV-SNP requires encrypted memory to be validated before access.
> > > > > Because the ROM memory range is not part of the e820 table, it is not
> > > > > pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
> > > > > to access this range, the guest must first validate the range.
> > > > >
> > > > > The current SEV-SNP code does indeed scan the ROM range during early
> > > > > boot and thus attempts to validate the ROM range in probe_roms().
> > > > > However, this behavior is neither necessary nor sufficient.
> > > > >
> > > > > > With regard to sufficiency, if EFI_CONFIG_TABLES are not enabled and
> > > > > CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
> > > > > attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
> > > > > falls in the ROM range) prior to validation. The specific problematic
> > > > > call chain occurs during dmi_setup() -> dmi_scan_machine() and results
> > > > > in a crash during boot if SEV-SNP is enabled under these conditions.
> > > > >
> > > > > > With regard to necessity, SEV-SNP guests currently read garbage (which
> > > > > changes across boots) from the ROM range, meaning these scans are
> > > > > unnecessary. The guest reads garbage because the legacy ROM range
> > > > > is unencrypted data but is accessed via an encrypted PMD during early
> > > > > boot (where the PMD is marked as encrypted due to potentially mapping
> > > > > actually-encrypted data in other PMD-contained ranges).
> > > > >
> > > > > While one solution would be to overhaul the early PMD mapping to treat
> > > > > the ROM region of the PMD as unencrypted, SEV-SNP guests do not rely on
> > > > > data from the legacy ROM region during early boot (nor can they
> > > > > currently, since the data would be garbage that changes across boots).
> > > > > As such, this patch opts for the simpler approach of skipping the ROM
> > > > > range scans (and the otherwise-necessary range validation) during
> > > > > SEV-SNP guest early boot.
> > > > >
> > > > > > Ultimately, the potential SEV-SNP guest crash due to lack of ROM range
> > > > > validation is avoided by simply not accessing the ROM range.
> > > > >
> > > > > Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
> > > > > Signed-off-by: Kevin Loughlin <kevinloughlin@xxxxxxxxxx>
> > > > > ---
> > > > > arch/x86/include/asm/sev.h | 2 --
> > > > > arch/x86/kernel/mpparse.c | 7 +++++++
> > > > > arch/x86/kernel/probe_roms.c | 11 ++++-------
> > > > > arch/x86/kernel/sev.c | 15 ---------------
> > > > > drivers/firmware/dmi_scan.c | 7 ++++++-
> > > > > 5 files changed, 17 insertions(+), 25 deletions(-)
> > > > >
> > > > > diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> > > > > index 5b4a1ce3d368..474c24ba0f6f 100644
> > > > > --- a/arch/x86/include/asm/sev.h
> > > > > +++ b/arch/x86/include/asm/sev.h
> > > > > @@ -203,7 +203,6 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
> > > > > unsigned long npages);
> > > > > void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
> > > > > unsigned long npages);
> > > > > -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
> > > > > void snp_set_memory_shared(unsigned long vaddr, unsigned long npages);
> > > > > void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
> > > > > void snp_set_wakeup_secondary_cpu(void);
> > > > > @@ -227,7 +226,6 @@ static inline void __init
> > > > > early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> > > > > static inline void __init
> > > > > early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
> > > > > -static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
> > > > > static inline void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) { }
> > > > > static inline void snp_set_memory_private(unsigned long vaddr, unsigned long npages) { }
> > > > > static inline void snp_set_wakeup_secondary_cpu(void) { }
> > > > > diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
> > > > > index b223922248e9..39ea771e2d4c 100644
> > > > > --- a/arch/x86/kernel/mpparse.c
> > > > > +++ b/arch/x86/kernel/mpparse.c
> > > > > @@ -553,6 +553,13 @@ static int __init smp_scan_config(unsigned long base, unsigned long length)
> > > > > base, base + length - 1);
> > > > > BUILD_BUG_ON(sizeof(*mpf) != 16);
> > > > >
> > > > > + /*
> > > > > + * Skip scan in SEV-SNP guest if it would touch the legacy ROM region,
> > > > > + * as this memory is not pre-validated and would thus cause a crash.
> > > > > + */
> > > > > + if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && base < 0x100000 && base + length >= 0xC0000)
> > > > > + return 0;
> > > > > +
> > > > > while (length > 0) {
> > > > > bp = early_memremap(base, length);
> > > > > mpf = (struct mpf_intel *)bp;
> > > > > diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
> > > > > index 319fef37d9dc..84ff4b052fc1 100644
> > > > > --- a/arch/x86/kernel/probe_roms.c
> > > > > +++ b/arch/x86/kernel/probe_roms.c
> > > > > @@ -204,14 +204,11 @@ void __init probe_roms(void)
> > > > > int i;
> > > > >
> > > > > /*
> > > > > - * The ROM memory range is not part of the e820 table and is therefore not
> > > > > - * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
> > > > > - * memory, and SNP requires encrypted memory to be validated before access.
> > > > > - * Do that here.
> > > > > + * These probes are skipped in SEV-SNP guests because the ROM range
> > > > > + * is not pre-validated, meaning access would cause a crash.
> > > > > */
> > > > > - snp_prep_memory(video_rom_resource.start,
> > > > > - ((system_rom_resource.end + 1) - video_rom_resource.start),
> > > > > - SNP_PAGE_STATE_PRIVATE);
> > > > > + if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> > > > > + return;
> > > > >
> > > > > /* video rom */
> > > > > upper = adapter_rom_resources[0].start;
> > > > > diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> > > > > index c67285824e82..d2362631da91 100644
> > > > > --- a/arch/x86/kernel/sev.c
> > > > > +++ b/arch/x86/kernel/sev.c
> > > > > @@ -774,21 +774,6 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
> > > > > early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED);
> > > > > }
> > > > >
> > > > > -void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
> > > > > -{
> > > > > - unsigned long vaddr, npages;
> > > > > -
> > > > > - vaddr = (unsigned long)__va(paddr);
> > > > > - npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
> > > > > -
> > > > > - if (op == SNP_PAGE_STATE_PRIVATE)
> > > > > - early_snp_set_memory_private(vaddr, paddr, npages);
> > > > > - else if (op == SNP_PAGE_STATE_SHARED)
> > > > > - early_snp_set_memory_shared(vaddr, paddr, npages);
> > > > > - else
> > > > > - WARN(1, "invalid memory op %d\n", op);
> > > > > -}
> > > > > -
> > > > > static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
> > > > > unsigned long vaddr_end, int op)
> > > > > {
> > > > > diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
> > > > > index 015c95a825d3..22e27087eb5b 100644
> > > > > --- a/drivers/firmware/dmi_scan.c
> > > > > +++ b/drivers/firmware/dmi_scan.c
> > > > > @@ -703,7 +703,12 @@ static void __init dmi_scan_machine(void)
> > > > > dmi_available = 1;
> > > > > return;
> > > > > }
> > > > > - } else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK)) {
> > > > > + } else if (IS_ENABLED(CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK) &&
> > > > > + !cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
> > > > > + /*
> > > > > + * This scan is skipped in SEV-SNP guests because the ROM range
> > > > > + * is not pre-validated, meaning access would cause a crash.
> > > > > + */
> > > > > p = dmi_early_remap(SMBIOS_ENTRY_POINT_SCAN_START, 0x10000);
> > > > > if (p == NULL)
> > > > > goto error;
> > > > > --
> > > > > 2.44.0.rc0.258.g7320e95886-goog
> > > > >
> > > > >
> > > >
> > > > In addition to these changes, I also had to skip pirq_find_routing_table if SEV-SNP is active.
> > >
> > > Thanks. I will update this in v3.
> >
> > There's also another access a bit later in boot:
> >
> >   static __init int eisa_bus_probe(void)
> >   {
> >           ...
> >           ioremap(0x0FFFD9, 4);
> >   }
> >
> > This time it's via ioremap() with the encryption bit *unset*, so it
> > won't necessarily cause a crash but it's inconsistent with the early
> > page table having that region set as encrypted.
> >
> > We discussed unsetting the encryption bit in the early page table with
> > security folks, and the general consensus was that *if* any VMM/firmware
> > ever came along that does want to make use of the legacy region for any
> > reason (such as providing DMI/SMBIOS info), it would be safest to require
> > that they encrypt the data in the region before handing off to the guest
> > kernel, so it
> > makes sense to patch away unencrypted accesses to the legacy region so they
> > don't cause problems down the road (like causing implicit page state
> > change from private->shared and throwing away data in the region later
> > in boot).
>
> Sounds good, thanks. Since this one won't cause crashes, I will place
> it in a separate patch in the series to separate (current) functional
> fixes from cleanup, especially since there may be similar legacy
> probes to cleanup in various types of guests. Please let me know if
> you feel differently or have additional thoughts.

I think it could still be argued that it's a fix. It's just that the
main set of fixes avoids reading garbage for VMMs/firmware that *don't*
encrypt these regions, whereas this additional fix handles the case of
VMMs/firmware that *do* encrypt these regions. It's possible they exist
in the case of SEV (though I don't know of any). It might still make
sense to distinguish the 2 cases since the latter is more theoretical,
but both still address the kernel modifying its behavior based on
scanning random garbage for strings.
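
Since several of these probes (the MP table scan, the DMI fallback scan
at SMBIOS_ENTRY_POINT_SCAN_START, and the 4-byte EISA probe at 0x0FFFD9)
come down to asking "does this access touch the legacy ROM window?", a
shared range check might be one way to express the guard. The following
is only a rough userspace sketch of that check, not code from the patch:
touches_legacy_rom() and the two LEGACY_ROM_* macros are hypothetical
names, and the window bounds are simply the 0xC0000 and 0x100000
constants from the mpparse.c hunk. It treats ranges as half-open, so
unlike the ">=" comparison in that hunk, a scan that ends exactly at
0xC0000 is not counted as touching the window.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bounds assumed from the constants in the quoted mpparse.c hunk. */
#define LEGACY_ROM_START 0xC0000ULL
#define LEGACY_ROM_END   0x100000ULL	/* exclusive */

/*
 * Hypothetical helper (not an existing kernel function): returns true
 * if the half-open range [base, base + length) intersects the legacy
 * ROM window [LEGACY_ROM_START, LEGACY_ROM_END).
 */
static bool touches_legacy_rom(uint64_t base, uint64_t length)
{
	/* An empty range touches nothing. */
	if (length == 0)
		return false;

	/* Range starts at or above the end of the window. */
	if (base >= LEGACY_ROM_END)
		return false;

	/* Range starts inside the window. */
	if (base >= LEGACY_ROM_START)
		return true;

	/*
	 * Range starts below the window: it overlaps only if it is long
	 * enough to reach past LEGACY_ROM_START. Comparing against the
	 * distance avoids overflow in base + length.
	 */
	return length > LEGACY_ROM_START - base;
}
```

In a guest-side guard this would presumably be combined with the
cc_platform_has(CC_ATTR_GUEST_SEV_SNP) test already used in the patch,
so that non-SNP systems keep scanning as before.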

-Mike