Re: [GIT PULL] KVM: x86: MMU changes for 6.9

From: Paolo Bonzini
Date: Mon Mar 11 2024 - 10:31:13 EST


On 3/8/24 23:36, Sean Christopherson wrote:
The bulk of the changes are TDP MMU improvements related to memslot deletion
(ChromeOS has a use case that "requires" frequent deletion of a GPU buffer).
The other highlight is allocating the write-tracking metadata on-demand, e.g.
so that distro kernels pay the memory cost of the arrays if and only if KVM
or KVMGT actually needs to shadow guest page tables.
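As a rough illustration of the on-demand idea described above, the following userspace sketch defers allocating per-page write-tracking counters until something first asks to track a page. It is not KVM code: `struct slot`, `struct vm`, `vm_enable_write_tracking()` and `vm_track_page()` are hypothetical names invented for this example, and the locking that a real implementation would need is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memslot; not KVM's real data structure. */
struct slot {
	size_t npages;
	unsigned short *write_track;	/* NULL until tracking is first needed */
};

/* Hypothetical stand-in for a VM with a small, fixed set of slots. */
struct vm {
	struct slot slots[4];
	size_t nslots;
	bool tracking_enabled;
};

/*
 * Allocate the per-page counters for every slot the first time tracking
 * is needed.  Returns 0 on success, -1 if an allocation fails.  Arrays
 * allocated before a failure are kept, so a later retry skips them.
 */
static int vm_enable_write_tracking(struct vm *vm)
{
	if (vm->tracking_enabled)
		return 0;

	for (size_t i = 0; i < vm->nslots; i++) {
		struct slot *s = &vm->slots[i];

		if (s->write_track)
			continue;

		s->write_track = calloc(s->npages, sizeof(*s->write_track));
		if (!s->write_track)
			return -1;
	}

	vm->tracking_enabled = true;
	return 0;
}

/* Bump the write-tracking count for one page, allocating lazily. */
static int vm_track_page(struct vm *vm, size_t slot, size_t page)
{
	if (vm_enable_write_tracking(vm))
		return -1;

	assert(slot < vm->nslots && page < vm->slots[slot].npages);
	vm->slots[slot].write_track[page]++;
	return 0;
}
```

In this sketch a VM that never calls `vm_track_page()` never pays for the arrays, which is the property the paragraph above describes for kernels that do not end up shadowing guest page tables.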

The following changes since commit 41bccc98fb7931d63d03f326a746ac4d429c1dd3:

Linux 6.8-rc2 (2024-01-28 17:01:12 -0800)

are available in the Git repository at:

https://github.com/kvm-x86/linux.git tags/kvm-x86-mmu-6.9

for you to fetch changes up to a364c014a2c1ad6e011bc5fdb8afb9d4ba316956:

kvm/x86: allocate the write-tracking metadata on-demand (2024-02-27 11:49:54 -0800)

Pulled, thanks.

Paolo

----------------------------------------------------------------
KVM x86 MMU changes for 6.9:

- Clean up code related to unprotecting shadow pages when retrying a guest
instruction after failed #PF-induced emulation.

- Zap TDP MMU roots at 4KiB granularity to minimize the delay in yielding if
a reschedule is needed, e.g. if a high priority task needs to run. Because
KVM doesn't support yielding in the middle of processing a zapped non-leaf
SPTE, zapping at 1GiB granularity can result in multi-millisecond lag when
attempting to schedule in a high priority task.

- Rework TDP MMU root unload, free, and alloc to run with mmu_lock held for
read, e.g. to avoid serializing vCPUs when userspace deletes a memslot.

- Allocate write-tracking metadata on-demand to avoid the memory overhead when
a kernel is built with KVMGT support (external write-tracking enabled) but
the workload uses neither nested virtualization (shadow paging) nor KVMGT.
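To make the granularity argument in the second item concrete, here is a toy model (not the TDP MMU code) in which a range is "zapped" in fixed-size chunks and a reschedule check is only possible between chunks. `zap_range()`, `PAGES_PER_4K` and `PAGES_PER_1G` are names made up for this sketch; the only assumption carried over from the text is that work inside one chunk cannot be interrupted.

```c
#include <assert.h>
#include <stddef.h>

#define PAGES_PER_4K	1UL
#define PAGES_PER_1G	(1UL << 18)	/* 1GiB / 4KiB */

/*
 * Zap total_pages in chunks of step pages.  A yield check happens only
 * after each chunk completes.  Returns the largest number of pages
 * processed between two consecutive yield checks and stores the number
 * of yield checks in *yield_checks.
 */
static size_t zap_range(size_t total_pages, size_t step, size_t *yield_checks)
{
	size_t max_uninterrupted = 0;

	assert(step > 0);
	*yield_checks = 0;

	for (size_t done = 0; done < total_pages; ) {
		size_t left = total_pages - done;
		size_t chunk = left < step ? left : step;

		/* "Zap" the chunk; in this model it cannot stop partway. */
		done += chunk;
		if (chunk > max_uninterrupted)
			max_uninterrupted = chunk;

		/* Only here could a pending reschedule be honored. */
		(*yield_checks)++;
	}

	return max_uninterrupted;
}
```

For a 2GiB range, stepping by `PAGES_PER_1G` gives two yield opportunities with 262144 pages of uninterruptible work before each, while stepping by `PAGES_PER_4K` gives a yield opportunity after every page. The total work is the same; only the worst-case delay before a yield changes.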

----------------------------------------------------------------
Andrei Vagin (1):
kvm/x86: allocate the write-tracking metadata on-demand

Kunwu Chan (1):
KVM: x86/mmu: Use KMEM_CACHE instead of kmem_cache_create()

Mingwei Zhang (1):
KVM: x86/mmu: Don't acquire mmu_lock when using indirect_shadow_pages as a heuristic

Sean Christopherson (10):
KVM: x86: Drop dedicated logic for direct MMUs in reexecute_instruction()
KVM: x86: Drop superfluous check on direct MMU vs. WRITE_PF_TO_SP flag
KVM: x86/mmu: Zap invalidated TDP MMU roots at 4KiB granularity
KVM: x86/mmu: Don't do TLB flush when zappings SPTEs in invalid roots
KVM: x86/mmu: Allow passing '-1' for "all" as_id for TDP MMU iterators
KVM: x86/mmu: Skip invalid roots when zapping leaf SPTEs for GFN range
KVM: x86/mmu: Skip invalid TDP MMU roots when write-protecting SPTEs
KVM: x86/mmu: Check for usable TDP MMU root while holding mmu_lock for read
KVM: x86/mmu: Alloc TDP MMU roots while holding mmu_lock for read
KVM: x86/mmu: Free TDP MMU roots while holding mmy_lock for read

 arch/x86/include/asm/kvm_host.h |   9 +++
 arch/x86/kvm/mmu/mmu.c          |  37 +++++++-----
 arch/x86/kvm/mmu/page_track.c   |  68 +++++++++++++++++++++-
 arch/x86/kvm/mmu/tdp_mmu.c      | 124 ++++++++++++++++++++++++++++------------
 arch/x86/kvm/mmu/tdp_mmu.h      |   2 +-
 arch/x86/kvm/x86.c              |  35 +++++-------
 6 files changed, 201 insertions(+), 74 deletions(-)