Re: [PATCH v10 06/50] x86/sev: Add the host SEV-SNP initialization support

From: Kalra, Ashish
Date: Tue Nov 07 2023 - 14:00:11 EST


Hello Boris,

Addressing of some of the remaining comments:

On 11/7/2023 10:31 AM, Borislav Petkov wrote:
On Mon, Oct 16, 2023 at 08:27:35AM -0500, Michael Roth wrote:
+static bool early_rmptable_check(void)
+{
+ u64 rmp_base, rmp_size;
+
+ /*
+ * For early BSP initialization, max_pfn won't be set up yet, wait until
+ * it is set before performing the RMP table calculations.
+ */
+ if (!max_pfn)
+ return true;

This already says that this is called at the wrong point during init.

Right now we have

early_identify_cpu -> early_init_amd -> early_detect_mem_encrypt

which runs only on the BSP but then early_init_amd() is called in
init_amd() too so that it takes care of the APs too.

Which ends up doing a lot of unnecessary work on each AP in
early_detect_mem_encrypt() like calculating the RMP size on each AP
unnecessarily where this needs to happen exactly once.

Is there any reason why this function cannot be moved to init_amd()
where it'll do the normal, per-AP init?

And the stuff that needs to happen once, needs to be called once too.

+
+ return snp_get_rmptable_info(&rmp_base, &rmp_size);
+}
+
static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
{
u64 msr;
@@ -659,6 +674,9 @@ static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
if (!(msr & MSR_K7_HWCR_SMMLOCK))
goto clear_sev;
+ if (cpu_has(c, X86_FEATURE_SEV_SNP) && !early_rmptable_check())
+ goto clear_snp;
+
return;
clear_all:
@@ -666,6 +684,7 @@ static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
clear_sev:
setup_clear_cpu_cap(X86_FEATURE_SEV);
setup_clear_cpu_cap(X86_FEATURE_SEV_ES);
+clear_snp:
setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
}
}

...

+bool snp_get_rmptable_info(u64 *start, u64 *len)
+{
+ u64 max_rmp_pfn, calc_rmp_sz, rmp_sz, rmp_base, rmp_end;
+
+ rdmsrl(MSR_AMD64_RMP_BASE, rmp_base);
+ rdmsrl(MSR_AMD64_RMP_END, rmp_end);
+
+ if (!(rmp_base & RMP_ADDR_MASK) || !(rmp_end & RMP_ADDR_MASK)) {
+ pr_err("Memory for the RMP table has not been reserved by BIOS\n");
+ return false;
+ }

If you're masking off bits 0-12 above...

+
+ if (rmp_base > rmp_end) {

... why aren't you using the masked out vars further on?

I know, the hw will say, yeah, those bits are 0 but still. IOW, do:

rmp_base &= RMP_ADDR_MASK;
rmp_end &= RMP_ADDR_MASK;

after reading them.

+ pr_err("RMP configuration not valid: base=%#llx, end=%#llx\n", rmp_base, rmp_end);
+ return false;
+ }
+
+ rmp_sz = rmp_end - rmp_base + 1;
+
+ /*
+ * Calculate the amount the memory that must be reserved by the BIOS to
+ * address the whole RAM, including the bookkeeping area. The RMP itself
+ * must also be covered.
+ */
+ max_rmp_pfn = max_pfn;
+ if (PHYS_PFN(rmp_end) > max_pfn)
+ max_rmp_pfn = PHYS_PFN(rmp_end);
+
+ calc_rmp_sz = (max_rmp_pfn << 4) + RMPTABLE_CPU_BOOKKEEPING_SZ;
+
+ if (calc_rmp_sz > rmp_sz) {
+ pr_err("Memory reserved for the RMP table does not cover full system RAM (expected 0x%llx got 0x%llx)\n",
+ calc_rmp_sz, rmp_sz);
+ return false;
+ }
+
+ *start = rmp_base;
+ *len = rmp_sz;
+
+ return true;
+}
+
+static __init int __snp_rmptable_init(void)
+{
+ u64 rmp_base, rmp_size;
+ void *rmp_start;
+ u64 val;
+
+ if (!snp_get_rmptable_info(&rmp_base, &rmp_size))
+ return 1;
+
+ pr_info("RMP table physical address [0x%016llx - 0x%016llx]\n",

That's "RMP table physical range"

+ rmp_base, rmp_base + rmp_size - 1);
+
+ rmp_start = memremap(rmp_base, rmp_size, MEMREMAP_WB);
+ if (!rmp_start) {
+ pr_err("Failed to map RMP table addr 0x%llx size 0x%llx\n", rmp_base, rmp_size);

No need to dump rmp_base and rmp_size again here - you're dumping them
above.

+ return 1;
+ }
+
+ /*
+ * Check if SEV-SNP is already enabled, this can happen in case of
+ * kexec boot.
+ */
+ rdmsrl(MSR_AMD64_SYSCFG, val);
+ if (val & MSR_AMD64_SYSCFG_SNP_EN)
+ goto skip_enable;
+
+ /* Initialize the RMP table to zero */

Again: useless comment.

+ memset(rmp_start, 0, rmp_size);
+
+ /* Flush the caches to ensure that data is written before SNP is enabled. */
+ wbinvd_on_all_cpus();
+
+ /* MFDM must be enabled on all the CPUs prior to enabling SNP. */

First of all, use the APM bit name here pls: MtrrFixDramModEn.

And then, for the life of me, I can't find any mention in the APM why
this bit is needed. Neither in "15.36.2 Enabling SEV-SNP" nor in
"15.34.3 Enabling SEV".

Looking at the bit defintions of WrMem an RdMem - read and write
requests get directed to system memory instead of MMIO so I guess you
don't want to be able to write MMIO for certain physical ranges when SNP
is enabled but it'll be good to have this properly explained instead of
a "this must happen" information-less sentence.

This is a per-requisite for SNP_INIT as per the SNP Firmware ABI specifications, section 8.8.2:

From the SNP FW ABI specs:

If INIT_RMP is 1, then the firmware ensures the following system requirements are met:
• SYSCFG[MemoryEncryptionModEn] must be set to 1 across all cores. (SEV must be
enabled.)
• SYSCFG[SecureNestedPagingEn] must be set to 1 across all cores.
• SYSCFG[VMPLEn] must be set to 1 across all cores.
• SYSCFG[MFDM] must be set to 1 across all cores.
• VM_HSAVE_PA (MSR C001_0117) must be set to 0h across all cores.
• HWCR[SmmLock] (MSR C001_0015) must be set to 1 across all cores.

So, this platform enabling code for SNP needs to ensure that these conditions are met before SNP_INIT is called.


+ on_each_cpu(mfd_enable, NULL, 1);
+
+ /* Enable SNP on all CPUs. */

Useless comment.

+ on_each_cpu(snp_enable, NULL, 1);
+
+skip_enable:
+ rmp_start += RMPTABLE_CPU_BOOKKEEPING_SZ;
+ rmp_size -= RMPTABLE_CPU_BOOKKEEPING_SZ;
+
+ rmptable_start = (struct rmpentry *)rmp_start;
+ rmptable_max_pfn = rmp_size / sizeof(struct rmpentry) - 1;
+
+ return 0;
+}
+
+static int __init snp_rmptable_init(void)
+{
+ int family, model;
+
+ if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+ return 0;
+
+ family = boot_cpu_data.x86;
+ model = boot_cpu_data.x86_model;

Looks useless - just use boot_cpu_data directly below.

As mentioned here already https://lore.kernel.org/all/Y9ubi0i4Z750gdMm@xxxxxxx/

And I already mentioned that for v9:

https://lore.kernel.org/r/20230621094236.GZZJLGDAicp1guNPvD@fat_crate.local

Next time I'm NAKing this patch until you incorporate all review
comments or you give a technical reason why you disagree with them.

+ /*
+ * RMP table entry format is not architectural and it can vary by processor and
+ * is defined by the per-processor PPR. Restrict SNP support on the known CPU
+ * model and family for which the RMP table entry format is currently defined for.
+ */
+ if (family != 0x19 || model > 0xaf)
+ goto nosnp;
+
+ if (amd_iommu_snp_enable())
+ goto nosnp;
+
+ if (__snp_rmptable_init())
+ goto nosnp;
+
+ cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/rmptable_init:online", __snp_enable, NULL);
+
+ return 0;
+
+nosnp:
+ setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
+ return -ENOSYS;
+}
+
+/*
+ * This must be called after the PCI subsystem. This is because amd_iommu_snp_enable()
+ * is called to ensure the IOMMU supports the SEV-SNP feature, which can only be
+ * called after subsys_initcall().
+ *
+ * NOTE: IOMMU is enforced by SNP to ensure that hypervisor cannot program DMA
+ * directly into guest private memory. In case of SNP, the IOMMU ensures that
+ * the page(s) used for DMA are hypervisor owned.
+ */
+fs_initcall(snp_rmptable_init);

This looks backwards. AFAICT, the IOMMU code should call arch code to
enable SNP at the right time, not the other way around - arch code
calling driver code.

Especially if the SNP table enablement depends on some exact IOMMU
init_state:

if (init_state > IOMMU_ENABLED) {
pr_err("SNP: Too late to enable SNP for IOMMU.\n");



This is again as per SNP_INIT requirements, that SNP support is enabled in the IOMMU before SNP_INIT is done. The above function snp_rmptable_init() only calls the IOMMU driver to enable SNP support when it has detected and enabled platform support for SNP.

It is not that IOMMU driver has to call the arch code to enable SNP at the right time but it is the other way around that once platform support for SNP is enabled then the IOMMU driver has to be called to enable the same for the IOMMU and this needs to be done before the CCP driver is loaded and does SNP_INIT.

Thanks,
Ashish