Re: [PATCH 1/4] KVM: Always flush async #PF workqueue when vCPU is being destroyed

From: Xu Yilun
Date: Mon Feb 19 2024 - 22:06:35 EST


On Mon, Feb 19, 2024 at 07:51:24AM -0800, Sean Christopherson wrote:
> On Mon, Feb 19, 2024, Xu Yilun wrote:
> > > void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
> > > @@ -114,7 +132,6 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
> > >  #else
> > >                  if (cancel_work_sync(&work->work)) {
> > >                          mmput(work->mm);
> > > -                        kvm_put_kvm(vcpu->kvm); /* == work->vcpu->kvm */
> > >                          kmem_cache_free(async_pf_cache, work);
> > >                  }
> > >  #endif
> > > @@ -126,7 +143,18 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
> > >                          list_first_entry(&vcpu->async_pf.done,
> > >                                           typeof(*work), link);
> > >                  list_del(&work->link);
> > > -                kmem_cache_free(async_pf_cache, work);
> > > +
> > > +                spin_unlock(&vcpu->async_pf.lock);
> > > +
> > > +                /*
> > > +                 * The async #PF is "done", but KVM must wait for the work item
> > > +                 * itself, i.e. async_pf_execute(), to run to completion. If
> > > +                 * KVM is a module, KVM must ensure *no* code owned by the KVM
> > > +                 * (the module) can be run after the last call to module_put(),
> > > +                 * i.e. after the last reference to the last vCPU's file is put.
> > > +                 */
> > > +                kvm_flush_and_free_async_pf_work(work);
> >
> > I have a new concern when I re-visit this patchset.
> >
> > From kvm_check_async_pf_completion(), I see async_pf.queue is always a
> > superset of async_pf.done (except wake-all work, which is not a concern
> > here). And done work would be skipped from sync (cancel_work_sync()) by:
> >
> >         if (!work->vcpu)
> >                 continue;
> >
> > But now with this patch we also sync done work. How about we just sync all
> > queued work instead?
>
> Hmm, IIUC, I think we can simply revert commit 22583f0d9c85 ("KVM: async_pf: avoid
> recursive flushing of work items").

Ah, yes. This also clears up the history of the confusing spin_lock for me.
Reverting sounds good to me.
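
To spell out how I read the post-revert behaviour, here is a userspace toy
model only. Nothing below is kernel code: struct toy_work, toy_clear() and the
flags are made-up stand-ins, and "synced" just marks where flush_work() or
cancel_work_sync() would be called. As a simplification, an item that is
queued but not done is treated as cancelled before it ran.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct kvm_async_pf; not the kernel layout. */
struct toy_work {
    bool on_queue;    /* linked on the async_pf.queue stand-in */
    bool on_done;     /* linked on the async_pf.done stand-in */
    bool synced;      /* the work item was flushed or cancelled */
    int  free_count;  /* must end up exactly 1 for every item */
};

static void toy_free(struct toy_work *w)
{
    assert(w->free_count == 0);   /* catch a double free in the model */
    w->free_count++;
}

/* Model of clearing the completion queue without the "already done" skip. */
static void toy_clear(struct toy_work *w, int n)
{
    /* Pass 1: sync every queued item, whether or not it is also done. */
    for (int i = 0; i < n; i++) {
        if (!w[i].on_queue)
            continue;
        w[i].on_queue = false;
        w[i].synced = true;
        if (!w[i].on_done)
            toy_free(&w[i]);      /* cancelled before running: free here */
    }

    /* Pass 2: items on done were synced above, so only freeing is left. */
    for (int i = 0; i < n; i++) {
        if (!w[i].on_done)
            continue;
        w[i].on_done = false;
        toy_free(&w[i]);
    }
}
```

With one pending item, one done item that is still queued, and one wake-all
item that is only on done, pass 1 syncs the first two and every item is freed
exactly once. The wake-all item is never synced, which matches the exception
noted above.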

Thanks,
Yilun