[PATCH v3 0/9] vfio: Fix release ordering races and use driver_override

From: Alex Williamson
Date: Tue Jun 20 2017 - 11:47:42 EST


v3:

* Fix Alexey's nit in 2/, which becomes a bug in 3/. I posted the
intended correction for this, but 0-day builds broke on it and I'd
like to be sure we get all the automated testing possible, so v3.
Added Alexey's Rb.

Thanks,

Alex

v2:

* Added received acks and reviews, thanks!
* Rebased and resolved conflict in patch 2/, dropped reviews due
to changes and added Alexey to cc as spapr code is moved too
* Added stable tag for patches 1-3
* Resolved comment typo Eric noted in patch 1
* Split AMBA out to patches 8 & 9 as Eric noted amba_bustype is
not exported. These can be separate follow-up patches if delayed

Please re-ack/review patch 2. Eric, I'm happy to add your Tested-by
to the whole series if appropriate as well. Thanks,

Alex


v1:

VM hotplug testing reveals a number of races in the vfio device,
group, and container shutdown path. Some are attributable to libvirt's
ask/take unplug behavior; others are long-standing and stem from groups
potentially composed of multiple devices, where each device can be
independently bound to drivers. Libvirt's ask/take behavior is a result
of the asynchronous nature of PCI hotplug: libvirt registers a
hot-unplug request (ask), which is acknowledged almost immediately, and
then proceeds to try to unbind the device from the vfio bus driver
(take). This sets us off on racing paths where we allow the device to
be released from the group, much as would happen in groups with
multiple devices, while the group and container are torn down
separately. These races are addressed in the first 3 patches of this
series.

The long-standing issue with removing devices from in-use groups is
that we feel that the system is compromised if we allow user and host
devices within the same non-isolated group. This triggers a BUG_ON
when we detect this condition after the rogue driver binding. Since
that code was put in place we've added driver_override support for
all of the physical buses supported by vfio, giving us a way to block
binding to such compromising drivers. We finally enable that in the
latter 4 patches of this series, minding that we need to allow
re-binding to non-compromising drivers, and also noting that a small
synchronization stall is effective in eliminating the need for this
blocking in the more common singleton device group case.
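
As a rough illustration of the driver_override idea described above,
here is a small userspace model. It is not code from this series. It
relies on one known rule: on buses that support driver_override, a
device with an override set only matches a driver of exactly that
name. Everything else is an assumption made for the sketch: struct
model_dev, the safe_drivers list, the BLOCK_OVERRIDE placeholder name,
and the avert_compromising_bind() helper are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a device; only the fields this sketch needs. */
struct model_dev {
	const char *driver_override;	/* NULL means no override is set */
	int group_in_use;		/* nonzero if a user holds the group */
};

/* Illustrative (assumed) list of drivers treated as non-compromising. */
static const char *const safe_drivers[] = {
	"vfio-pci", "vfio-platform", "vfio-amba",
};

/* Placeholder override name that no driver is assumed to carry. */
#define BLOCK_OVERRIDE "model-no-bind"

/* Return 1 if the named driver is on the illustrative safe list. */
static int driver_is_safe(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(safe_drivers) / sizeof(safe_drivers[0]); i++)
		if (strcmp(name, safe_drivers[i]) == 0)
			return 1;
	return 0;
}

/*
 * Match rule with driver_override: with no override any driver may
 * match; with an override only a driver of that exact name matches.
 */
static int override_allows(const struct model_dev *dev, const char *drv_name)
{
	if (!dev->driver_override || !dev->driver_override[0])
		return 1;
	return strcmp(dev->driver_override, drv_name) == 0;
}

/*
 * Assumed policy: if the group is in use and the candidate driver is
 * not on the safe list, set an override that no driver matches, so
 * the bind is averted. Safe drivers and idle groups are left alone.
 */
static void avert_compromising_bind(struct model_dev *dev, const char *drv_name)
{
	if (dev->group_in_use && !driver_is_safe(drv_name))
		dev->driver_override = BLOCK_OVERRIDE;
}
```

In this model, a device in an in-use group that is about to bind to a
host driver gets the placeholder override and no longer matches that
driver, while binding to a driver on the safe list, or binding in an
idle group, proceeds as before.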

Reviews, comments, and acks appreciated. Thanks,

Alex

---

Alex Williamson (9):
  vfio: Fix group release deadlock
  kvm-vfio: Decouple only when we match a group
  vfio: New external user group/file match
  iommu: Add driver-not-bound notification
  vfio: Create interface for vfio bus drivers to register
  vfio: Register pci, platform, amba, and mdev bus drivers
  vfio: Use driver_override to avert binding to compromising drivers
  amba: Export amba_bustype
  vfio: Add AMBA driver_override support


 drivers/amba/bus.c                    |   1
 drivers/iommu/iommu.c                 |   3
 drivers/vfio/mdev/vfio_mdev.c         |  13 ++
 drivers/vfio/pci/vfio_pci.c           |   7 +
 drivers/vfio/platform/vfio_amba.c     |  24 +++
 drivers/vfio/platform/vfio_platform.c |  24 +++
 drivers/vfio/vfio.c                   | 252 ++++++++++++++++++++++++++++++++-
 include/linux/iommu.h                 |   1
 include/linux/vfio.h                  |   5 +
 virt/kvm/vfio.c                       |  40 +++--
 10 files changed, 344 insertions(+), 26 deletions(-)