Re: [RFC PATCH] kvm: Use huge pages for DAX-backed files

From: Paolo Bonzini
Date: Wed Oct 31 2018 - 04:50:03 EST


On 30/10/2018 20:45, Barret Rhoden wrote:
> On 2018-10-29 at 20:10 Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>> The property of DAX pages that requires special coordination is the
>> fact that the device hosting the pages can be disabled at will. The
>> get_dev_pagemap() api is the interface to pin a device-pfn so that you
>> can safely perform a pfn_to_page() operation.
>>
>> Have the pages that kvm uses in this path already been pinned by vfio?

No, VFIO is not involved here.

The pages that KVM uses are never pinned. Soon after we grab them and
build KVM's page tables, we do a put_page in mmu_set_spte (via
kvm_release_pfn_clean). From that point on, the MMU notifier takes
care of invalidating the shadow page table entries before the page
disappears from the mm's page table.
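
As a rough illustration of that lifecycle, here is a userspace toy model.
It is only a sketch, not kernel code: every toy_* name is made up for
the example, and the refcount and "SPTE" are plain struct fields. The
point is just that the reference taken on the fault path is dropped
right after the mapping is installed, and that only the notifier-style
callback clears the mapping before the page goes away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins (hypothetical); none of these are the kernel's real types. */
struct toy_page {
	int refcount;		/* references currently held on the page */
	bool mapped_in_mm;	/* still present in the host (mm) page table? */
};

struct toy_spte {
	struct toy_page *page;	/* NULL means "not present" */
};

static void toy_get_page(struct toy_page *p)
{
	p->refcount++;
}

static void toy_put_page(struct toy_page *p)
{
	assert(p->refcount > 0);
	p->refcount--;
}

/* Fault path: temporary reference, install the mapping, drop the reference. */
static void toy_map_page(struct toy_spte *spte, struct toy_page *p)
{
	toy_get_page(p);	/* models grabbing the page for the fault */
	spte->page = p;		/* models building KVM's page table entry */
	toy_put_page(p);	/* models kvm_release_pfn_clean(): no pin is kept */
}

/* Notifier-style callback: zap the mapping if it points at this page. */
static void toy_mmu_notifier_invalidate(struct toy_spte *spte,
					struct toy_page *p)
{
	if (spte->page == p)
		spte->page = NULL;
}

/* Host-side unmap: the notifier runs first, then the page leaves the mm. */
static void toy_host_unmap(struct toy_spte *spte, struct toy_page *p)
{
	toy_mmu_notifier_invalidate(spte, p);
	p->mapped_in_mm = false;
	toy_put_page(p);	/* drop the mm's own reference */
}
```

In this model, after toy_map_page() the page's refcount is back to
whatever the mm held, so the mapping alone does not keep the page alive;
toy_host_unmap() is the only thing that makes the toy SPTE go away in
time.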

> One usage of kvm_is_reserved_pfn() in KVM code is like this:
>
> static struct page *kvm_pfn_to_page(kvm_pfn_t pfn)
> {
> 	if (is_error_noslot_pfn(pfn))
> 		return KVM_ERR_PTR_BAD_PAGE;
>
> 	if (kvm_is_reserved_pfn(pfn)) {
> 		WARN_ON(1);
> 		return KVM_ERR_PTR_BAD_PAGE;
> 	}
>
> 	return pfn_to_page(pfn);
> }
>
> I think there's no guarantee the kvm->mmu_lock is held in the generic
> case.

Indeed, it's not.

> There are probably other rules related to gfn_to_page that keep the
> page alive, maybe just during interrupt/vmexit context? Whatever keeps
> those pages alive for normal memory might grab that devmap reference
> under the hood for DAX mappings.

Nothing keeps the page alive except for the MMU notifier (which of
course cannot run in atomic context, since its callers take the mmap_sem).

Paolo