Re: [PATCH] vfio/pci: take mmap write lock for io_remap_pfn_range

From: Cédric Le Goater
Date: Thu May 11 2023 - 03:39:47 EST


On 5/11/23 08:56, Yan Zhao wrote:
> On Wed, May 10, 2023 at 05:41:06PM -0300, Jason Gunthorpe wrote:
>> On Mon, May 08, 2023 at 02:57:15PM -0600, Alex Williamson wrote:
>>
>>> We already try to set the flags in advance, but there are some
>>> architectural flags like VM_PAT that make that tricky. Cedric has been
>>> looking at inserting individual pages with vmf_insert_pfn(), but that
>>> incurs a lot more faults and therefore latency vs remapping the entire
>>> vma on fault. I'm not convinced that we shouldn't just attempt to
>>> remove the fault handler entirely, but I haven't tried it yet to know
>>> what gotchas are down that path. Thanks,
>>
>> I thought we did it like this because there were races otherwise with
>> PTE insertion and zapping? I don't remember well anymore.
>>
>> I vaguely remember the address_space conversion might help remove the
>> fault handler?
>
> What about calling vmf_insert_pfn() in bulk as below?

This works too, it is slightly slower than the io_remap_pfn_range() call
but doesn't have the lockdep issues.
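
For reference, the address-to-PFN arithmetic behind a bulk insert can be
modelled in plain userspace C. This is only a sketch, not kernel code:
struct model_vma, MODEL_PAGE_SHIFT, model_pfn_for_addr() and
model_insert_all() are hypothetical names, a 4 KiB page size is assumed, and
the array write stands in for the per-page vmf_insert_pfn() call. The model
assumes the page at vm_start is backed by PFN vm_pgoff, which is the mapping
the io_remap_pfn_range() call sets up.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages; the kernel uses PAGE_SHIFT/PAGE_SIZE instead. */
#define MODEL_PAGE_SHIFT 12UL
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)

/* Hypothetical stand-in for the vm_area_struct fields the loop reads. */
struct model_vma {
	unsigned long vm_start;	/* first user address of the mapping */
	unsigned long vm_end;	/* one past the last user address */
	unsigned long vm_pgoff;	/* PFN backing the page at vm_start */
};

/* PFN for the page containing addr: vm_pgoff plus the page index. */
unsigned long model_pfn_for_addr(const struct model_vma *vma,
				 unsigned long addr)
{
	assert(addr >= vma->vm_start && addr < vma->vm_end);
	return vma->vm_pgoff + ((addr - vma->vm_start) >> MODEL_PAGE_SHIFT);
}

/*
 * Walk the whole vma one page at a time, as a bulk insert would, and
 * record the PFN chosen for each page. Returns the number of pages
 * recorded, capped at cap entries.
 */
size_t model_insert_all(const struct model_vma *vma,
			unsigned long *pfns, size_t cap)
{
	unsigned long addr;
	size_t n = 0;

	for (addr = vma->vm_start; addr < vma->vm_end;
	     addr += MODEL_PAGE_SIZE) {
		if (n == cap)
			break;
		pfns[n++] = model_pfn_for_addr(vma, addr);
	}
	return n;
}
```

In this model the PFN chosen for a page depends only on its offset from
vm_start, not on which address took the fault.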

Thanks,

C.

> And what is address_space conversion?
>
>
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index a5ab416cf476..1476e537f593 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1686,6 +1686,7 @@ static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf)
>  	struct vfio_pci_core_device *vdev = vma->vm_private_data;
>  	struct vfio_pci_mmap_vma *mmap_vma;
>  	vm_fault_t ret = VM_FAULT_NOPAGE;
> +	unsigned long base_pfn, offset, i;
>
>  	mutex_lock(&vdev->vma_lock);
>  	down_read(&vdev->memory_lock);
> @@ -1710,12 +1711,15 @@ static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf)
>  		goto up_out;
>  	}
>
> -	if (io_remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
> -			       vma->vm_end - vma->vm_start,
> -			       vma->vm_page_prot)) {
> -		ret = VM_FAULT_SIGBUS;
> -		zap_vma_ptes(vma, vma->vm_start, vma->vm_end - vma->vm_start);
> -		goto up_out;
> +	base_pfn = (vmf->address - vma->vm_start) >> PAGE_SHIFT;
> +	base_pfn += vma->vm_pgoff;
> +	for (i = vma->vm_start; i < vma->vm_end; i += PAGE_SIZE) {
> +		offset = (i - vma->vm_start) >> PAGE_SHIFT;
> +		ret = vmf_insert_pfn(vma, i, base_pfn + offset);
> +		if (ret != VM_FAULT_NOPAGE) {
> +			zap_vma_ptes(vma, vma->vm_start, vma->vm_end - vma->vm_start);
> +			goto up_out;
> +		}
>  	}
>
>  	if (__vfio_pci_add_vma(vdev, vma)) {