Re: [PATCH v2 1/3] KVM: x86: add a new page track hook track_remove_slot

From: Yan Zhao
Date: Fri Nov 11 2022 - 19:26:22 EST


On Fri, Nov 11, 2022 at 06:19:15PM +0000, Sean Christopherson wrote:
> TL;DR: I'm going to try to add more aggressive patches for this into my series to
> clean up the KVM side of things, along with many more patches to clean up the page
> track APIs.
>
> I'll post patches next week if things go well (fingers crossed), and if not I'll
> give an update.
>
> On Fri, Nov 11, 2022, Yan Zhao wrote:
> > Page track hook track_remove_slot is used to notify users that a slot
> > has been removed and is called when a slot DELETE/MOVE is about to be
> > completed.
>
> Phrase this as a command, and explain _why_ the new hook is being added, e.g.
>
> Add a new page track hook, track_remove_slot(), that is called when a
> memslot DELETE/MOVE operation is about to be committed. The "remove"
> hook will be used by KVMGT and will effectively replace the existing
> track_flush_slot() altogether now that KVM itself doesn't rely on the
> "flush" hook either.
>
> The "flush" hook is flawed as it's invoked before the memslot operation
> is guaranteed, i.e. KVM might ultimately keep the existing memslot without
> notifying external page track users, a.k.a. KVMGT.
>
> > Users of this hook can drop write protections in the removed slot.
>
> Hmm, actually, on second thought, after thinking about what KVMGT is doing in
> response to the memslot update, I think we should be more aggressive and actively
> prevent MOVE if there are external page trackers, i.e. if KVMGT is attached.
>
> Dropping write protections when a memslot is being deleted is a waste of cycles.
> The memslot and thus gfn_track is literally being deleted immediately after invoking
> the hook, so updating gfn_track from KVMGT is completely unnecessary.
>
> I.e. if we kill off the MOVE path, then KVMGT just needs to delete its hash table
> entry.
>
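As a rough user-space illustration of the DELETE-only idea above: if MOVE is
rejected, slot removal only has to forget the tracked gfns that fall inside
the removed slot. This is a sketch, not KVMGT code. The names tracked_gfns,
tracked_add(), tracked_contains() and tracked_remove_slot() are hypothetical,
and a flat array stands in for the real hash table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gfn_t;

#define TRACKED_MAX 16	/* hypothetical fixed capacity, for the sketch only */

/* Hypothetical stand-in for the tracker's table of write-protected gfns. */
struct tracked_gfns {
	gfn_t gfn[TRACKED_MAX];
	bool used[TRACKED_MAX];
};

/* Record a gfn as tracked; returns false if the table is full. */
static bool tracked_add(struct tracked_gfns *t, gfn_t gfn)
{
	for (size_t i = 0; i < TRACKED_MAX; i++) {
		if (!t->used[i]) {
			t->gfn[i] = gfn;
			t->used[i] = true;
			return true;
		}
	}
	return false;
}

static bool tracked_contains(const struct tracked_gfns *t, gfn_t gfn)
{
	for (size_t i = 0; i < TRACKED_MAX; i++) {
		if (t->used[i] && t->gfn[i] == gfn)
			return true;
	}
	return false;
}

/*
 * Slot removal: forget every tracked gfn in [base_gfn, base_gfn + npages).
 * Only the tracker's own table is touched; the per-slot tracking state is
 * assumed to be freed together with the slot. Returns the number of entries
 * dropped.
 */
static size_t tracked_remove_slot(struct tracked_gfns *t, gfn_t base_gfn,
				  uint64_t npages)
{
	size_t dropped = 0;

	for (size_t i = 0; i < TRACKED_MAX; i++) {
		if (t->used[i] && t->gfn[i] >= base_gfn &&
		    t->gfn[i] - base_gfn < npages) {
			t->used[i] = false;
			dropped++;
		}
	}
	return dropped;
}
```

The range check subtracts before comparing so that base_gfn + npages cannot
overflow.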
> Oooh! Looking at this code again made me realize that the larger page track cleanup
> that I want to do might actually work. Long story short, I want to stop forcing
> KVMGT to poke into KVM internals, but I thought there was a lock inversion problem.
>
> But AFAICT, there is no such problem. And the cleanup I want to do will actually
> fix an existing KVMGT bug: kvmgt_page_track_write() invokes kvmgt_gfn_is_write_protected()
> without holding mmu_lock, and thus could consume garbage when walking the hash
> table.
>
> static void kvmgt_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa,
> 				   const u8 *val, int len,
> 				   struct kvm_page_track_notifier_node *node)
> {
> 	struct intel_vgpu *info =
> 		container_of(node, struct intel_vgpu, track_node);
>
> 	if (kvmgt_gfn_is_write_protected(info, gpa_to_gfn(gpa)))
> 		intel_vgpu_page_track_handler(info, gpa,
> 					      (void *)val, len);
> }
>
> Acquiring mmu_lock isn't an option as intel_vgpu_page_track_handler() might sleep,
> e.g. when acquiring vgpu_lock.
>
I totally agree with you, and actually had the same feeling when I
examined the code yesterday. But I thought I'd better send this series
first and do the cleanup later :)
I'm also not sure whether slots_arch_lock is required for
kvm_slot_page_track_add_page() and kvm_slot_page_track_remove_page().
(It doesn't matter for a slot that is being removed, and we don't actually
need to call kvm_slot_page_track_remove_page() in response to a slot
removal, but the two interfaces are still required elsewhere.)


> Let me see if the clean up I have in mind will actually work. If it does, I think
> the end result will be quite nice for both KVM and KVMGT.
Yes, it would be great.

Thanks
Yan