Re: [PATCH 1/4] KVM: Always flush async #PF workqueue when vCPU is being destroyed

From: Xu Yilun
Date: Mon Feb 19 2024 - 09:03:11 EST


> void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
> @@ -114,7 +132,6 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
> #else
> if (cancel_work_sync(&work->work)) {
> mmput(work->mm);
> - kvm_put_kvm(vcpu->kvm); /* == work->vcpu->kvm */
> kmem_cache_free(async_pf_cache, work);
> }
> #endif
> @@ -126,7 +143,18 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
> list_first_entry(&vcpu->async_pf.done,
> typeof(*work), link);
> list_del(&work->link);
> - kmem_cache_free(async_pf_cache, work);
> +
> + spin_unlock(&vcpu->async_pf.lock);
> +
> + /*
> + * The async #PF is "done", but KVM must wait for the work item
> + * itself, i.e. async_pf_execute(), to run to completion. If
> + * KVM is a module, KVM must ensure *no* code owned by the KVM
> + * (the module) can be run after the last call to module_put(),
> + * i.e. after the last reference to the last vCPU's file is put.
> + */
> + kvm_flush_and_free_async_pf_work(work);

I have a new concern when I re-visit this patchset.

Form kvm_check_async_pf_completion(), I see async_pf.queue is always a
superset of async_pf.done (except wake-all work, which is not within
concern). And done work would be skipped from sync (cancel_work_sync()) by:

if (!work->vcpu)
continue;

But now with this patch we also sync done works, how about we just sync all
queued work instead.

diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c
index e033c79d528e..2268f16a36c2 100644
--- a/virt/kvm/async_pf.c
+++ b/virt/kvm/async_pf.c
@@ -71,7 +71,6 @@ static void async_pf_execute(struct work_struct *work)
spin_lock(&vcpu->async_pf.lock);
first = list_empty(&vcpu->async_pf.done);
list_add_tail(&apf->link, &vcpu->async_pf.done);
- apf->vcpu = NULL;
spin_unlock(&vcpu->async_pf.lock);

if (!IS_ENABLED(CONFIG_KVM_ASYNC_PF_SYNC) && first)
@@ -101,13 +100,6 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
typeof(*work), queue);
list_del(&work->queue);

- /*
- * We know it's present in vcpu->async_pf.done, do
- * nothing here.
- */
- if (!work->vcpu)
- continue;
-
spin_unlock(&vcpu->async_pf.lock);
#ifdef CONFIG_KVM_ASYNC_PF_SYNC
flush_work(&work->work);


This way we don't have to sync twice for the same purpose, also we could
avoid reusing work->vcpu as a "done" flag which confused me a bit.

Thanks,
Yilun

> + spin_lock(&vcpu->async_pf.lock);
> }
> spin_unlock(&vcpu->async_pf.lock);
>