Re: [PATCH] vfio/pci: take mmap write lock for io_remap_pfn_range

From: Yan Zhao
Date: Thu May 11 2023 - 03:22:41 EST


On Wed, May 10, 2023 at 05:41:06PM -0300, Jason Gunthorpe wrote:
> On Mon, May 08, 2023 at 02:57:15PM -0600, Alex Williamson wrote:
>
> > We already try to set the flags in advance, but there are some
> > architectural flags like VM_PAT that make that tricky. Cedric has been
> > looking at inserting individual pages with vmf_insert_pfn(), but that
> > incurs a lot more faults and therefore latency vs remapping the entire
> > vma on fault. I'm not convinced that we shouldn't just attempt to
> > remove the fault handler entirely, but I haven't tried it yet to know
> > what gotchas are down that path. Thanks,
>
> I thought we did it like this because there were races otherwise with
> PTE insertion and zapping? I don't remember well anymore.
>
> I vaguely remember the address_space conversion might help remove the
> fault handler?
>
What about calling vmf_insert_pfn() in bulk, as below?
Also, what is the address_space conversion?


diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index a5ab416cf476..1476e537f593 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1686,6 +1686,7 @@ static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf)
struct vfio_pci_core_device *vdev = vma->vm_private_data;
struct vfio_pci_mmap_vma *mmap_vma;
vm_fault_t ret = VM_FAULT_NOPAGE;
+ unsigned long base_pfn, offset, i;

mutex_lock(&vdev->vma_lock);
down_read(&vdev->memory_lock);
@@ -1710,12 +1711,15 @@ static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf)
goto up_out;
}

- if (io_remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
- vma->vm_end - vma->vm_start,
- vma->vm_page_prot)) {
- ret = VM_FAULT_SIGBUS;
- zap_vma_ptes(vma, vma->vm_start, vma->vm_end - vma->vm_start);
- goto up_out;
+ /* vm_pgoff already holds the PFN backing vma->vm_start */
+ base_pfn = vma->vm_pgoff;
+ for (i = vma->vm_start; i < vma->vm_end; i += PAGE_SIZE) {
+ offset = (i - vma->vm_start) >> PAGE_SHIFT;
+ ret = vmf_insert_pfn(vma, i, base_pfn + offset);
+ if (ret != VM_FAULT_NOPAGE) {
+ zap_vma_ptes(vma, vma->vm_start, vma->vm_end - vma->vm_start);
+ goto up_out;
+ }
}

if (__vfio_pci_add_vma(vdev, vma)) {
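
To make the address-to-PFN arithmetic of the loop above easier to check, here
is a rough userspace model. It is only a sketch, not kernel code: struct
vma_model, pfn_for_addr(), fill_pfns() and the MODEL_* constants are made up
for illustration. It assumes vm_pgoff holds the PFN backing vm_start, which is
how the removed io_remap_pfn_range() call used it, so the PFN for a page
depends only on its distance from vm_start and not on the faulting address.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12UL                      /* assumption: 4 KiB pages */
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)

/* Hypothetical stand-in for the few vm_area_struct fields used here. */
struct vma_model {
	unsigned long vm_start;  /* first user address of the mapping */
	unsigned long vm_end;    /* one past the last user address */
	unsigned long vm_pgoff;  /* assumed: PFN backing vm_start */
};

/* PFN that should back user address addr, wherever the fault happened. */
static unsigned long pfn_for_addr(const struct vma_model *vma,
				  unsigned long addr)
{
	return vma->vm_pgoff + ((addr - vma->vm_start) >> MODEL_PAGE_SHIFT);
}

/*
 * Walk the whole range page by page, as a bulk insert would, and record
 * the PFN chosen for each page. Returns the number of entries written,
 * capped at max.
 */
static size_t fill_pfns(const struct vma_model *vma, unsigned long *out,
			size_t max)
{
	size_t n = 0;
	unsigned long addr;

	for (addr = vma->vm_start; addr < vma->vm_end && n < max;
	     addr += MODEL_PAGE_SIZE)
		out[n++] = pfn_for_addr(vma, addr);

	return n;
}
```

For a four-page vma whose vm_pgoff is P, fill_pfns() yields P, P+1, P+2 and
P+3, i.e. the same layout a single io_remap_pfn_range() over the vma would
produce.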