Re: [PATCH v2] pci: fix device presence detection for VFs

From: Bjorn Helgaas
Date: Wed Nov 09 2022 - 12:30:38 EST


On Wed, Nov 09, 2022 at 02:10:30AM -0500, Michael S. Tsirkin wrote:
> On Tue, Nov 08, 2022 at 11:12:34PM -0600, Bjorn Helgaas wrote:
> > On Wed, Nov 09, 2022 at 04:36:17AM +0000, Wei Gong wrote:
> > > On Tue, Nov 08, 2022 at 01:02:35PM -0500, Michael S. Tsirkin wrote:
> > > > On Tue, Nov 08, 2022 at 11:58:53AM -0600, Bjorn Helgaas wrote:
> > > > > On Tue, Nov 08, 2022 at 10:19:07AM -0500, Michael S. Tsirkin wrote:
> > > > > > On Tue, Nov 08, 2022 at 09:02:28AM -0600, Bjorn Helgaas wrote:
> > > > > > > On Tue, Nov 08, 2022 at 08:53:00AM -0600, Bjorn Helgaas wrote:
> > > > > > > > On Wed, Oct 26, 2022 at 02:11:21AM -0400, Michael S. Tsirkin wrote:
> > > > > > > > > virtio uses the same driver for VFs and PFs.
> > > > > > > > > Accordingly, pci_device_is_present is used to detect
> > > > > > > > > device presence. This function isn't currently working
> > > > > > > > > properly for VFs since it attempts reading device and
> > > > > > > > > vendor ID.
> > > > > > > >
> > > > > > > > > As VFs are present if and only if PF is present,
> > > > > > > > > just return the value for that device.
> > > > > > > >
> > > > > > > > VFs are only present when the PF is present *and* the PF
> > > > > > > > has VF Enable set. Do you care about the possibility that
> > > > > > > > VF Enable has been cleared?
> > > > >
> > > > > I think you missed this question.
> > > >
> > > > I was hoping Wei will answer that, I don't have the hardware.
> > >
> > > In my case I don't care that VF Enable has been cleared.
> >
> > OK, let me rephrase that :)
> >
> > I think pci_device_is_present(VF) should return "false" if the PF is
> > present but VFs are disabled.
> >
> > If you think it should return "true" when the PF is present and VFs
> > are disabled, we should explain why.
> >
> > We would also need to fix the commit log, because "VFs are present if
> > and only if PF is present" is not actually true. "VFs are present
> > only if PF is present" is true, but "VFs are present if PF is present"
> > is not.
>
> Bjorn, I don't really understand the question.
>
> How does one get a vf pointer without enabling sriov?
> They are only created by sriov_add_vfs after calling
> pcibios_sriov_enable.

Oh, I think I see where you're coming from. The fact that we have a
VF pointer means VFs were enabled in the past, and as long as the PF
is still present, the VFs should still be enabled.

Since the continued existence of the VF device depends on VF Enable, I
guess my question is whether we need to worry about VF Enable being
cleared, e.g., via sysfs reset or a buggy PF driver.
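
To make the two conditions concrete, here is a small user-space model,
not kernel code. The struct and function names are hypothetical; only
the PCI_SRIOV_CTRL_VFE bit value is taken from the kernel's pci_regs.h.
It treats a VF as present only when the PF responds *and* VF Enable is
still set in the PF's SR-IOV Control register:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* VF Enable bit in the SR-IOV Control register (value as in pci_regs.h). */
#define PCI_SRIOV_CTRL_VFE 0x0001

/* Hypothetical stand-in for a PF; this is NOT struct pci_dev. */
struct model_pf {
	bool present;        /* would a config read of the PF succeed? */
	uint16_t sriov_ctrl; /* current SR-IOV Control register contents */
};

/*
 * Model of "is the VF present?": the PF must respond and VF Enable
 * must still be set. A PF that is present with VF Enable cleared
 * yields false.
 */
bool model_vf_is_present(const struct model_pf *pf)
{
	assert(pf != NULL);

	if (!pf->present)
		return false;

	return (pf->sriov_ctrl & PCI_SRIOV_CTRL_VFE) != 0;
}
```

In this model, a PF that is present but has had VF Enable cleared
reports its VFs as absent, which is the case the "if and only if"
wording in the commit log does not cover.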

Taking a step back, I don't understand the
"if (!pci_device_is_present()) virtio_break_device()" strategy because
checking for device presence is always unreliable. I assume the
consumer of vq->broken, e.g., virtnet_send_command(), would see a
failed PCI read that probably returns ~0 data. Could it not check for
that and then figure out whether that's valid data or an error
indication?
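
A rough user-space sketch of that "check the data, then decide" idea
follows. Everything here is hypothetical: the model_* names, the struct
layout, and in particular "sanity_reg", which stands for any register
the driver knows can never legitimately read as all ones. (For a PF the
Vendor ID would qualify; for a VF it would not, since VF Vendor ID reads
return 0xffff anyway.) The point is only that the all-ones check happens
after the read that matters, instead of a presence check before it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum model_read_status { MODEL_READ_OK, MODEL_READ_DEVICE_GONE };

/* Hypothetical device model; NOT a kernel structure. */
struct model_dev {
	bool present;        /* false: every read completes with all ones */
	uint32_t data_reg;   /* register the driver actually wants */
	uint32_t sanity_reg; /* assumed never to hold all ones when present */
};

static uint32_t model_read(const struct model_dev *dev, uint32_t value)
{
	/* A failed PCI read typically shows up as ~0 data. */
	return dev->present ? value : UINT32_MAX;
}

/*
 * Read data_reg. If the result is all ones, it may be valid data or an
 * error indication, so re-read a register that cannot be all ones on a
 * working device to tell the two apart.
 */
enum model_read_status model_checked_read(const struct model_dev *dev,
					  uint32_t *val)
{
	assert(dev != NULL && val != NULL);

	*val = model_read(dev, dev->data_reg);
	if (*val != UINT32_MAX)
		return MODEL_READ_OK;

	if (model_read(dev, dev->sanity_reg) == UINT32_MAX)
		return MODEL_READ_DEVICE_GONE;

	return MODEL_READ_OK; /* all ones was real data */
}
```

This is still not airtight, since the device can disappear between the
two reads, but it only ever errs toward reporting stale data as valid,
and the next read would catch it.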

It looks like today, virtnet_send_command() might sit in that "while"
loop calling virtqueue_get_buf() repeatedly until virtio_pci_remove()
notices the device is gone and marks it broken. Something must be
failing in virtqueue_get_buf() in that interval between the device
disappearing and virtio_pci_remove() noticing it.
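
As a simplified user-space model of that interval (hypothetical names
throughout; the real loop lives in virtnet_send_command() and the real
"broken" flag is set from the remove path, not from the poll itself):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical virtqueue model; NOT struct virtqueue. */
struct model_vq {
	bool broken;                /* set once removal notices the loss */
	bool device_gone;           /* device has disappeared */
	unsigned int polls;         /* how many times we polled for a buffer */
	unsigned int notice_at;     /* poll count at which removal notices */
};

/*
 * Model of polling for a completed buffer. A device that is gone never
 * completes anything. For simplicity the "remove path" is folded in
 * here: after notice_at polls it marks the queue broken.
 */
bool model_get_buf(struct model_vq *vq)
{
	assert(vq != NULL);

	vq->polls++;
	if (vq->device_gone && vq->polls >= vq->notice_at)
		vq->broken = true;

	return !vq->device_gone;
}

/* Spin until a buffer comes back or the queue is marked broken. */
unsigned int model_send_command(struct model_vq *vq)
{
	while (!model_get_buf(vq) && !vq->broken) {
		/* the kernel loop would cpu_relax() here */
	}

	return vq->polls;
}
```

With device_gone set and notice_at = 5, the model spins for five polls
before giving up. Nothing in the loop itself detects the loss; it
depends entirely on something else setting "broken".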

Bjorn