Re: [PATCH v2] kvm/x86: allocate the write-tracking metadata on-demand

From: Andrei Vagin
Date: Tue Feb 13 2024 - 14:32:43 EST


On Tue, Feb 13, 2024 at 9:13 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Tue, Feb 06, 2024, Andrei Vagin wrote:
> > The write-track is used externally only by the gpu/drm/i915 driver.
> > Currently, it is always enabled, if a kernel has been compiled with this
> > driver.
> >
> > Enabling the write-track mechanism adds a two-byte overhead per page across
> > all memory slots. It isn't significant for regular VMs. However in gVisor,
> > where the entire process virtual address space is mapped into the VM, even
> > with a 39-bit address space, the overhead amounts to 256MB.
> >
> > This change rework the write-tracking mechanism to enable it on-demand
> > in kvm_page_track_register_notifier.
>
> Don't use "this change", "this patch", or any other variant of "this blah" that
> you come up with. :-) Simply phrase the changelog as a command.

ok:)

>
> > Here is Sean's comment about the locking scheme:
> >
> > The only potential hiccup would be if taking slots_arch_lock would
> > deadlock, but it should be impossible for slots_arch_lock to be taken in
> > any other path that involves VFIO and/or KVMGT *and* can be coincident.
> > Except for kvm_arch_destroy_vm() (which deletes KVM's internal
> > memslots), slots_arch_lock is taken only through KVM ioctls(), and the
> > caller of kvm_page_track_register_notifier() *must* hold a reference to
> > the VM.
> >
> > Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > Cc: Sean Christopherson <seanjc@xxxxxxxxxx>
> > Cc: Zhenyu Wang <zhenyuw@xxxxxxxxxxxxxxx>
> > Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> > Signed-off-by: Andrei Vagin <avagin@xxxxxxxxxx>
> > ---
> > v1: https://lore.kernel.org/lkml/ZcErs9rPqT09nNge@xxxxxxxxxx/T/
> > v2: allocate the write-tracking metadata on-demand
> >
> > arch/x86/include/asm/kvm_host.h | 2 +
> > arch/x86/kvm/mmu/mmu.c | 24 +++++------
> > arch/x86/kvm/mmu/page_track.c | 74 ++++++++++++++++++++++++++++-----
> > arch/x86/kvm/mmu/page_track.h | 3 +-
> > 4 files changed, 78 insertions(+), 25 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index d271ba20a0b2..c35641add93c 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1503,6 +1503,8 @@ struct kvm_arch {
> > */
> > #define SPLIT_DESC_CACHE_MIN_NR_OBJECTS (SPTE_ENT_PER_PAGE + 1)
> > struct kvm_mmu_memory_cache split_desc_cache;
> > +
> > + bool page_write_tracking_enabled;
>
> Rather than a generic page_write_tracking_enabled, I think it makes sense to
> explicitly track if there are *external* write tracking users. One could argue
> it makes the total tracking *too* fine grained, but I think it would be helpful
> for readers to when KVM itself is using write tracking (shadow paging) versus
> when KVM has write tracking enabled, but has not allocated rmaps (external write
> tracking user).
>
> That way, kernels with CONFIG_KVM_EXTERNAL_WRITE_TRACKING=n don't need to check
> the bool (though they'll still check kvm_shadow_root_allocated()). And as a
> bonus, the diff is quite a bit smaller.
>

Your patch looks good to me. I ran kvm and gvisor tests and didn't
find any issues. I sent it as v3:
https://lkml.org/lkml/2024/2/13/1181

I didn't do any changes, so feel free to change the author.

Thanks for the help.