Re: [PATCH v10 23/50] KVM: SEV: Make AVIC backing, VMSA and VMCB memory allocation SNP safe

From: Kalra, Ashish
Date: Mon Dec 11 2023 - 19:00:31 EST


Hello Vlastimil,

On 12/11/2023 7:24 AM, Vlastimil Babka wrote:
On 10/16/23 15:27, Michael Roth wrote:
From: Brijesh Singh <brijesh.singh@xxxxxxx>

Implement a workaround for an SNP erratum where the CPU will incorrectly
signal an RMP violation #PF if a hugepage (2mb or 1gb) collides with the
RMP entry of a VMCB, VMSA or AVIC backing page.

When SEV-SNP is globally enabled, the CPU marks the VMCB, VMSA, and AVIC
backing pages as "in-use" via a reserved bit in the corresponding RMP
entry after a successful VMRUN. This is done for _all_ VMs, not just
SNP-Active VMs.

If the hypervisor accesses an in-use page through a writable
translation, the CPU will throw an RMP violation #PF. On early SNP
hardware, if an in-use page is 2mb aligned and software accesses any
part of the associated 2mb region with a hupage, the CPU will
incorrectly treat the entire 2mb region as in-use and signal a spurious
RMP violation #PF.

The recommended is to not use the hugepage for the VMCB, VMSA or
AVIC backing page for similar reasons. Add a generic allocator that will
ensure that the page returns is not hugepage (2mb or 1gb) and is safe to

This is a bit confusing wording as we are not avoiding "using a
hugepage" but AFAIU, avoiding using a (4k) page that has a hugepage
aligned physical address, right?

Yes.


be used when SEV-SNP is enabled. Also implement similar handling for the
VMCB/VMSA pages of nested guests.

Co-developed-by: Marc Orr <marcorr@xxxxxxxxxx>
Signed-off-by: Marc Orr <marcorr@xxxxxxxxxx>
Reported-by: Alper Gun <alpergun@xxxxxxxxxx> # for nested VMSA case
Co-developed-by: Ashish Kalra <ashish.kalra@xxxxxxx>
Signed-off-by: Ashish Kalra <ashish.kalra@xxxxxxx>
Signed-off-by: Brijesh Singh <brijesh.singh@xxxxxxx>
[mdr: squash in nested guest handling from Ashish]
Signed-off-by: Michael Roth <michael.roth@xxxxxxx>
---

<snip>

+
+struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
+{
+ unsigned long pfn;
+ struct page *p;
+
+ if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+ return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+
+ /*
+ * Allocate an SNP safe page to workaround the SNP erratum where
+ * the CPU will incorrectly signal an RMP violation #PF if a
+ * hugepage (2mb or 1gb) collides with the RMP entry of VMCB, VMSA
+ * or AVIC backing page. The recommeded workaround is to not use the
+ * hugepage.

Same here "not use the hugepage"

+ *
+ * Allocate one extra page, use a page which is not 2mb aligned
+ * and free the other.

This makes more sense.

+ */
+ p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
+ if (!p)
+ return NULL;
+
+ split_page(p, 1);
> Yeah I think that's a sensible use of split_page(), as we don't have
support for forcefully non-aligned allocations or specific "page
coloring" in the page allocator.

Yes, using split_page() allows us to free the additionally allocated page individually.

Thanks,
Ashish

So even with my wording concerns:

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>