Re: async_pf.c && use_mm() (Was: mm,vmacache: also flush cache for VM_CLONE)

From: Oleg Nesterov
Date: Fri Mar 14 2014 - 14:25:00 EST


On 03/13, Linus Torvalds wrote:
>
> Ok, no longer on my phone, and no, it clearly does the reference count with a
>
> atomic_inc(&work->mm->mm_count);
>
> separately. The use_mm/unuse_mm seems entirely specious.

Yes, it really looks as if we can simply remove it.

But once again, with or without use_mm(), it seems that the refcounting
is buggy. ->mm_count pins only the mm_struct itself, not the address
space, so get_user_pages() is simply wrong if ->mm_users == 0 and
exit_mmap() etc. has already been called (or is in progress).

So I think we need something like the patch below, but I can't test this
change or audit the other (potential) users of kvm_async_pf->mm.

Perhaps this is not a bug, and it is somehow guaranteed that, say,
kvm_clear_async_pf_completion_queue() must always be called before the
caller of kvm_setup_async_pf() can exit? I don't know, but in that case
we do not need any refcounting at all, and this should be documented.

Gleb, what do you think?

Oleg.

--- x/virt/kvm/async_pf.c
+++ x/virt/kvm/async_pf.c
@@ -65,11 +65,9 @@ static void async_pf_execute(struct work_struct *work)

 	might_sleep();

-	use_mm(mm);
 	down_read(&mm->mmap_sem);
 	get_user_pages(current, mm, addr, 1, 1, 0, NULL, NULL);
 	up_read(&mm->mmap_sem);
-	unuse_mm(mm);

 	spin_lock(&vcpu->async_pf.lock);
 	list_add_tail(&apf->link, &vcpu->async_pf.done);
@@ -85,7 +83,7 @@ static void async_pf_execute(struct work_struct *work)
 	if (waitqueue_active(&vcpu->wq))
 		wake_up_interruptible(&vcpu->wq);

-	mmdrop(mm);
+	mmput(mm);
 	kvm_put_kvm(vcpu->kvm);
 }

@@ -98,7 +96,7 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
 				   typeof(*work), queue);
 		list_del(&work->queue);
 		if (cancel_work_sync(&work->work)) {
-			mmdrop(work->mm);
+			mmput(work->mm);
 			kvm_put_kvm(vcpu->kvm); /* == work->vcpu->kvm */
 			kmem_cache_free(async_pf_cache, work);
 		}
@@ -162,7 +160,7 @@ int kvm_setup_async_pf(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn,
 	work->addr = gfn_to_hva(vcpu->kvm, gfn);
 	work->arch = *arch;
 	work->mm = current->mm;
-	atomic_inc(&work->mm->mm_count);
+	atomic_inc(&work->mm->mm_users);
 	kvm_get_kvm(work->vcpu->kvm);

 	/* this can't really happen otherwise gfn_to_pfn_async
@@ -180,7 +178,7 @@ int kvm_setup_async_pf(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn,
 	return 1;
 retry_sync:
 	kvm_put_kvm(work->vcpu->kvm);
-	mmdrop(work->mm);
+	mmput(work->mm);
 	kmem_cache_free(async_pf_cache, work);
 	return 0;
 }
