Re: [PATCH v7 4/4] vfio: convey kvm that the vfio-pci device is wc safe

From: Jason Gunthorpe
Date: Mon Feb 12 2024 - 12:20:25 EST


On Mon, Feb 12, 2024 at 10:05:02AM -0700, Alex Williamson wrote:

> > --- a/drivers/vfio/pci/vfio_pci_core.c
> > +++ b/drivers/vfio/pci/vfio_pci_core.c
> > @@ -1862,8 +1862,12 @@ int vfio_pci_core_mmap(struct vfio_device *core_vdev, struct vm_area_struct *vma
> > /*
> > * See remap_pfn_range(), called from vfio_pci_fault() but we can't
> > * change vm_flags within the fault handler. Set them now.
> > + *
> > + * Set an additional flag VM_ALLOW_ANY_UNCACHED to convey kvm that
> > + * the device is wc safe.
> > */
>
> That's a pretty superficial comment. Check that this is accurate, but
> maybe something like:
>
> The VM_ALLOW_ANY_UNCACHED flag is implemented for ARM64,
> allowing stage 2 device mapping attributes to use Normal-NC
^^^^

> rather than DEVICE_nGnRE, which allows guest mappings
> supporting combining attributes (WC). This attribute has
> potential risks with the GICv2 VCPU interface, but is expected
> to be safe for vfio-pci use cases.

Sure, if you want to elaborate more:

The VM_ALLOW_ANY_UNCACHED flag is implemented for ARM64,
allowing KVM stage 2 device mapping attributes to use Normal-NC
rather than DEVICE_nGnRE, which allows guest mappings
supporting combining attributes (WC). ARM does not architecturally
guarantee this is safe, and indeed some MMIO regions like the GICv2
VCPU interface can trigger uncontained faults if Normal-NC is used.

Even worse, we expect there are platforms where even DEVICE_nGnRE can
allow uncontained faults in corner cases. Unfortunately existing ARM
IP requires platform integration to take responsibility to prevent
this.

To safely use VFIO in KVM the platform must guarantee full safety
in the guest where no action taken against an MMIO mapping can
trigger an uncontained failure. We believe that most VFIO PCI
platforms support this for both mapping types, at least in common
flows, based on some expectations of how PCI IP is integrated. This
can be enabled more broadly, for instance into vfio-platform
drivers, but only after the platform vendor completes auditing for
safety.

> And specifically, I think these other devices that may be problematic
> as described in the cover letter is a warning against use for
> vfio-platform, is that correct?

Maybe more like "we have a general consensus that vfio-pci is likely
safe due to how PCI IP is typically integrated, but it is much less
obvious for other VFIO bus types. As there is no known WC user for
vfio-platform drivers, be conservative and do not enable it."

Jason