Re: [PATCH v2 1/2] PCI/VPD: Add runtime power management to sysfs interface

From: Alex Williamson
Date: Fri Aug 11 2023 - 15:57:47 EST


On Fri, 11 Aug 2023 14:25:43 -0500
Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:

> On Thu, Aug 03, 2023 at 11:12:32AM -0600, Alex Williamson wrote:
> > Unlike default access to config space through sysfs, the vpd read and
> > write function don't actively manage the runtime power management state
> > of the device during access. Since commit 7ab5e10eda02 ("vfio/pci: Move
> > the unused device into low power state with runtime PM"), the vfio-pci
> > driver will use runtime power management and release unused devices to
> > make use of low power states. Attempting to access VPD information in
> > this low power state can result in incorrect information or kernel
> > crashes depending on the system behavior.
>
> I guess this specifically refers to D3cold, right? VPD is accessed
> via config space, which should work in all D0-D3hot states, but not in
> D3cold. I don't see anything in the spec about needing to be in D0 to
> access VPD.
>
> I assume there's no public problem report we could cite here? I
> suppose the behavior in D3cold is however the system handles a UR
> error.

Yes, Eric tested that pcie_port_pm=off is a viable workaround resolving
both the VPD and PME accesses, so I think the issue is actually that
the root port is in D3cold. This aligns with commit 7ab5e10eda02
above, since prior to that we were only manipulating the endpoint power
state.

The oops indicates an "Internal error: synchronous external abort",
with a stack trace ending in pci_generic_config_read, so I suspect this
is a UR.

Unfortunately the bz is not currently public :-\ Thanks,

Alex

> > Wrap the vpd read/write bin attribute handlers in runtime PM and take
> > into account the potential quirk to select the correct device to wake.
> >
> > Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > ---
> > drivers/pci/vpd.c | 34 ++++++++++++++++++++++++++++++++--
> > 1 file changed, 32 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/pci/vpd.c b/drivers/pci/vpd.c
> > index a4fc4d0690fe..81217dd4789f 100644
> > --- a/drivers/pci/vpd.c
> > +++ b/drivers/pci/vpd.c
> > @@ -275,8 +275,23 @@ static ssize_t vpd_read(struct file *filp, struct kobject *kobj,
> > size_t count)
> > {
> > struct pci_dev *dev = to_pci_dev(kobj_to_dev(kobj));
> > + struct pci_dev *vpd_dev = dev;
> > + ssize_t ret;
> > +
> > + if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0) {
> > + vpd_dev = pci_get_func0_dev(dev);
> > + if (!vpd_dev)
> > + return -ENODEV;
> > + }
> > +
> > + pci_config_pm_runtime_get(vpd_dev);
> > + ret = pci_read_vpd(vpd_dev, off, count, buf);
> > + pci_config_pm_runtime_put(vpd_dev);
> > +
> > + if (dev != vpd_dev)
> > + pci_dev_put(vpd_dev);
> >
> > - return pci_read_vpd(dev, off, count, buf);
> > + return ret;
> > }
> >
> > static ssize_t vpd_write(struct file *filp, struct kobject *kobj,
> > @@ -284,8 +299,23 @@ static ssize_t vpd_write(struct file *filp, struct kobject *kobj,
> > size_t count)
> > {
> > struct pci_dev *dev = to_pci_dev(kobj_to_dev(kobj));
> > + struct pci_dev *vpd_dev = dev;
> > + ssize_t ret;
> > +
> > + if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0) {
> > + vpd_dev = pci_get_func0_dev(dev);
> > + if (!vpd_dev)
> > + return -ENODEV;
> > + }
> > +
> > + pci_config_pm_runtime_get(vpd_dev);
> > + ret = pci_write_vpd(vpd_dev, off, count, buf);
> > + pci_config_pm_runtime_put(vpd_dev);
> > +
> > + if (dev != vpd_dev)
> > + pci_dev_put(vpd_dev);
> >
> > - return pci_write_vpd(dev, off, count, buf);
> > + return ret;
> > }
> > static BIN_ATTR(vpd, 0600, vpd_read, vpd_write, 0);
> >
> > --
> > 2.40.1
> >
>