[PATCH 1/4] KVM: x86/mmu: Track the number of entries in a pte_list_desc with a ulong

From: Sean Christopherson
Date: Fri Jun 24 2022 - 19:27:45 EST


Use an "unsigned long" instead of a "u64" to track the number of entries
in a pte_list_desc's sptes array. Both sizes are overkill as the number
of entries would easily fit into a u8, the goal is purely to get sptes[]
aligned and to size the struct as a whole to be a multiple of a cache
line (64 bytes).

Using a u64 on 32-bit kernels fails on both counts, as "more" is only 4
bytes: sptes[] is no longer naturally aligned, and the struct as a whole
is not a multiple of 64 bytes. Dropping "spte_count" to 4 bytes on
32-bit kernels fixes both the alignment and the overall size.
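
To illustrate the arithmetic, a standalone userspace sketch (not part of
the patch; the struct names are made up, "u64" is spelled uint64_t,
PTE_LIST_EXT is 14 upstream, and the 32-bit numbers assume the i386 ABI,
where a u64 member is only 4-byte aligned). Build with "gcc -m64" and
"gcc -m32" to see both layouts:

  /* Layout sketch only; mirrors the KVM struct, not shared with it. */
  #include <stddef.h>
  #include <stdint.h>
  #include <stdio.h>

  #define PTE_LIST_EXT 14

  struct desc_u64_count {                 /* before this patch */
          struct desc_u64_count *more;
          uint64_t spte_count;
          uint64_t *sptes[PTE_LIST_EXT];
  };

  struct desc_ulong_count {               /* after this patch */
          struct desc_ulong_count *more;
          unsigned long spte_count;
          uint64_t *sptes[PTE_LIST_EXT];
  };

  int main(void)
  {
          /*
           * 64-bit: both structs are 128 bytes (two cache lines), with
           * sptes[] at offset 16.
           * 32-bit (i386): u64 count => 68 bytes, sptes[] at offset 12;
           * unsigned long count => 64 bytes (exactly one cache line),
           * sptes[] at offset 8.
           */
          printf("u64 count:   size %3zu, sptes[] at offset %2zu\n",
                 sizeof(struct desc_u64_count),
                 offsetof(struct desc_u64_count, sptes));
          printf("ulong count: size %3zu, sptes[] at offset %2zu\n",
                 sizeof(struct desc_ulong_count),
                 offsetof(struct desc_ulong_count, sptes));
          return 0;
  }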

Add a compile-time assert to ensure the size of pte_list_desc stays a
multiple of the cache line size on modern CPUs (64 is hardcoded because
L1_CACHE_BYTES is configurable via CONFIG_X86_L1_CACHE_SHIFT and thus
isn't guaranteed to be 64).
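
For context, and not part of the patch: if memory serves,
arch/x86/include/asm/cache.h derives L1_CACHE_BYTES as
(1 << CONFIG_X86_L1_CACHE_SHIFT), and the shift can be as high as 7,
i.e. 128 bytes, e.g. for a 32-bit MPENTIUM4 build. A hypothetical assert
written against L1_CACHE_BYTES instead of a hardcoded 64 would then fire
on such a 32-bit build even though the struct still packs cleanly into
64-byte lines:

  /*
   * Hypothetical alternative, NOT what this patch adds: with
   * L1_CACHE_BYTES == 128 and sizeof(struct pte_list_desc) == 64 on
   * 32-bit, 64 % 128 != 0 and this would (spuriously) trip.
   */
  BUILD_BUG_ON_MSG(sizeof(struct pte_list_desc) % L1_CACHE_BYTES,
                   "pte_list_desc is not a multiple of L1_CACHE_BYTES");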

Fixes: 13236e25ebab ("KVM: X86: Optimize pte_list_desc with per-array counter")
Cc: Peter Xu <peterx@xxxxxxxxxx>
Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
arch/x86/kvm/mmu/mmu.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index bd74a287b54a..17ac30b9e22c 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -117,15 +117,17 @@ module_param(dbg, bool, 0644);
 /*
  * Slight optimization of cacheline layout, by putting `more' and `spte_count'
  * at the start; then accessing it will only use one single cacheline for
- * either full (entries==PTE_LIST_EXT) case or entries<=6.
+ * either full (entries==PTE_LIST_EXT) case or entries<=6. On 32-bit kernels,
+ * the entire struct fits in a single cacheline.
  */
 struct pte_list_desc {
 	struct pte_list_desc *more;
 	/*
-	 * Stores number of entries stored in the pte_list_desc. No need to be
-	 * u64 but just for easier alignment. When PTE_LIST_EXT, means full.
+	 * The number of valid entries in sptes[]. Use an unsigned long to
+	 * naturally align sptes[] (a u8 for the count would suffice). When
+	 * equal to PTE_LIST_EXT, this particular list is full.
 	 */
-	u64 spte_count;
+	unsigned long spte_count;
 	u64 *sptes[PTE_LIST_EXT];
 };

@@ -5640,6 +5642,9 @@ void kvm_configure_mmu(bool enable_tdp, int tdp_forced_root_level,
 	tdp_root_level = tdp_forced_root_level;
 	max_tdp_level = tdp_max_root_level;
 
+	BUILD_BUG_ON_MSG((sizeof(struct pte_list_desc) % 64),
+			 "pte_list_desc is not a multiple of cache line size (on modern CPUs)");
+
 	/*
 	 * max_huge_page_level reflects KVM's MMU capabilities irrespective
 	 * of kernel support, e.g. KVM may be capable of using 1GB pages when
--
2.37.0.rc0.161.g10f37bed90-goog