Re: [PATCH v2 5/8] KVM: x86/mmu: Use separate namespaces for guest PTEs and shadow PTEs

From: Paolo Bonzini
Date: Wed Jun 15 2022 - 10:02:05 EST


On 6/15/22 01:33, Sean Christopherson wrote:
> Separate the macros for KVM's shadow PTEs (SPTE) from guest 64-bit PTEs
> (PT64). SPTE and PT64 are _mostly_ the same, but the few differences are
> quite critical, e.g. *_BASE_ADDR_MASK must differentiate between host and
> guest physical address spaces, and SPTE_PERM_MASK (was PT64_PERM_MASK) is
> very much specific to SPTEs.
>
> Opportunistically (and temporarily) move most guest macros into paging.h
> to clearly associate them with shadow paging, and to ensure that they're
> not used as of this commit. A future patch will eliminate them entirely.
>
> Sadly, PT32_LEVEL_BITS is left behind in mmu_internal.h because it's
> needed for the quadrant calculation in kvm_mmu_get_page(). The quadrant
> calculation is hot enough (when using shadow paging with 32-bit guests)
> that adding a per-context helper is undesirable, and burying the
> computation in paging_tmpl.h with a forward declaration isn't exactly an
> improvement.
>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
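
To spell out the *_BASE_ADDR_MASK difference, since it's the subtle one:
the split ends up as roughly

  /* shadow PTEs hold host physical addresses, so honor physical_mask */
  #define SPTE_BASE_ADDR_MASK	(physical_mask & ~(u64)(PAGE_SIZE-1))

  /* guest 64-bit PTEs hold guest physical addresses, architecturally 52 bits */
  #define PT64_BASE_ADDR_MASK	(((1ULL << 52) - 1) & ~(u64)(PAGE_SIZE-1))

(quoting from memory, so take the exact definitions with a grain of salt
and refer to the patch itself).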

A better try:

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 54b3e39d07b3..cd561b49cc84 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2011,8 +2011,21 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 	role.direct = direct;
 	role.access = access;
 	if (role.has_4_byte_gpte) {
+		/*
+		 * The "quadrant" value corresponds to those bits of the address
+		 * that have already been used by the 8-byte shadow page table
+		 * lookup, but not yet in the 4-byte guest page tables. Having
+		 * the quadrant as part of the role ensures that each upper sPTE
+		 * points to the correct portion of the guest page table
+		 * structure.
+		 *
+		 * For example, a 4-byte PDE consumes bits 31:22 and an 8-byte PDE
+		 * consumes bits 29:21. Each guest PD must be expanded into four
+		 * shadow PDs, one for each value of bits 31:30, and the PDPEs
+		 * will use the four quadrants in round-robin fashion.
+		 */
 		quadrant = gaddr >> (PAGE_SHIFT + (SPTE_LEVEL_BITS * level));
-		quadrant &= (1 << ((PT32_LEVEL_BITS - SPTE_LEVEL_BITS) * level)) - 1;
+		quadrant &= (1 << level) - 1;
 		role.quadrant = quadrant;
 	}
 	if (level <= vcpu->arch.mmu->cpu_role.base.level)
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index cb9d4d358335..5e1e3c8f8aaa 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -20,9 +20,6 @@ extern bool dbg;
 #define MMU_WARN_ON(x) do { } while (0)
 #endif
 
-/* The number of bits for 32-bit PTEs is to needed compute the quandrant. */
-#define PT32_LEVEL_BITS 10
-
 /* Page table builder macros common to shadow (host) PTEs and guest PTEs. */
 #define __PT_LEVEL_SHIFT(level, bits_per_level)	\
 	(PAGE_SHIFT + ((level) - 1) * (bits_per_level))
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 6c29aef4092b..e4655056e651 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -38,7 +38,7 @@
 	#define pt_element_t u32
 	#define guest_walker guest_walker32
 	#define FNAME(name) paging##32_##name
-	#define PT_LEVEL_BITS PT32_LEVEL_BITS
+	#define PT_LEVEL_BITS 10
 	#define PT_MAX_FULL_LEVELS 2
 	#define PT_GUEST_DIRTY_SHIFT PT_DIRTY_SHIFT
 	#define PT_GUEST_ACCESSED_SHIFT PT_ACCESSED_SHIFT
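
For what it's worth, the simplified masking works because
PT32_LEVEL_BITS - SPTE_LEVEL_BITS == 10 - 9 == 1, so the mask width is
simply "level". Sanity checking with an arbitrary address, say
gaddr = 0xc0800000:

  level == 2 (shadow PD): 0xc0800000 >> (12 + 9*2) == 3, and
	3 & ((1 << 2) - 1) == 3			/* bits 31:30 */
  level == 1 (shadow PT): 0xc0800000 >> (12 + 9*1) == 0x604, and
	0x604 & ((1 << 1) - 1) == 0		/* bit 21 */

i.e. bits 31:30 pick one of the four shadow PDs and bit 21 picks one of
the two shadow PTs that back each 4-byte guest PT, matching the comment.
The bit ranges in the comment also line up with __PT_LEVEL_SHIFT: with
10 bits per level the PDE shift is 12 + 10 == 22 (bits 31:22), with 9
bits it is 12 + 9 == 21 (bits 29:21).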

Paolo