Re: [PATCH 0/2] pci/iov: avoid device_lock() when reading sriov_numvfs

From: Bjorn Helgaas
Date: Thu Feb 08 2024 - 19:30:10 EST


[+cc Pierre, author of 35ff867b7657 ("PCI/IOV: Serialize sysfs
sriov_numvfs reads vs writes")]

On Wed, Dec 20, 2023 at 10:58:12PM +0000, Jim Harris wrote:
> If SR-IOV enabled device is held by vfio, and device is removed,
> vfio will hold device lock and notify userspace of the removal. If
> userspace reads sriov_numvfs sysfs entry, that thread will be
> blocked since sriov_numvfs_show() also tries to acquire the device
> lock. If that same thread is responsible for releasing the device to
> vfio, it results in a deadlock.
>
> One patch was proposed to add a separate mutex, specifically for
> struct pci_sriov, to synchronize access to sriov_numvfs in the sysfs
> paths (replacing use of the device_lock()). Leon instead suggested
> just reverting the commit 35ff867b765 which introduced device_lock()
> in the store path. This also led to a small fix around ordering on
> the kobject_uevent() when sriov_numvfs is updated.
>
> Ref: https://lore.kernel.org/linux-pci/ZXJI5+f8bUelVXqu@ubuntu/

1) Cc author of the commit being reverted (Pierre) so he has a chance
to chime in and make sure the proposed fix works for him as well.

2) The revert commit log needs to justify the revert, not merely say
what the proper way is. The Ref: above suggests that the current code
(pre-revert) leads to a deadlock in some cases, so the revert commit
log should detail that.

It's ideal if we never regress, not even between the revert and the
second patch, so it's possible that they should be squashed into a
single patch. But if you keep it as two patches, it's trivial for me
to squash them if we decide that's best.

3) Follow subject line convention for drivers/pci (use "git log
--oneline drivers/pci" to learn it).

I did 1) here and could do 3) for you, but it would be better if you
could update and repost the series with 2) addressed.

In the meantime you may notice that I pushed these to a
pci/virtualization branch just to get the 0-day bot to build-test
them. I propose to replace that branch with an updated series, since
the code changes themselves will probably stay the same.

> ---
>
> Jim Harris (2):
>       Revert "PCI/IOV: Serialize sysfs sriov_numvfs reads vs writes"
>       pci/iov: fix kobject_uevent() ordering in sriov_enable()
>
>
>  drivers/pci/iov.c | 10 ++--------
>  1 file changed, 2 insertions(+), 8 deletions(-)
>
> --