Re: [PATCH v2 1/2] PCI/VPD: Add runtime power management to sysfs interface

From: Bjorn Helgaas
Date: Thu Aug 10 2023 - 11:59:31 EST


On Thu, Aug 03, 2023 at 11:12:32AM -0600, Alex Williamson wrote:
> Unlike default access to config space through sysfs, the vpd read and
> write function don't actively manage the runtime power management state
> of the device during access. Since commit 7ab5e10eda02 ("vfio/pci: Move
> the unused device into low power state with runtime PM"), the vfio-pci
> driver will use runtime power management and release unused devices to
> make use of low power states. Attempting to access VPD information in
> this low power state can result in incorrect information or kernel
> crashes depending on the system behavior.
>
> Wrap the vpd read/write bin attribute handlers in runtime PM and take
> into account the potential quirk to select the correct device to wake.
>
> Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
> ---
> drivers/pci/vpd.c | 34 ++++++++++++++++++++++++++++++++--
> 1 file changed, 32 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/vpd.c b/drivers/pci/vpd.c
> index a4fc4d0690fe..81217dd4789f 100644
> --- a/drivers/pci/vpd.c
> +++ b/drivers/pci/vpd.c
> @@ -275,8 +275,23 @@ static ssize_t vpd_read(struct file *filp, struct kobject *kobj,
> size_t count)
> {
> struct pci_dev *dev = to_pci_dev(kobj_to_dev(kobj));
> + struct pci_dev *vpd_dev = dev;
> + ssize_t ret;
> +
> + if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0) {
> + vpd_dev = pci_get_func0_dev(dev);
> + if (!vpd_dev)
> + return -ENODEV;
> + }
> +
> + pci_config_pm_runtime_get(vpd_dev);
> + ret = pci_read_vpd(vpd_dev, off, count, buf);
> + pci_config_pm_runtime_put(vpd_dev);
> +
> + if (dev != vpd_dev)
> + pci_dev_put(vpd_dev);

I first thought this would leak a reference if dev was func0 and had
PCI_DEV_FLAGS_VPD_REF_F0 set, because in that case vpd_dev would be
the same as dev.

But I think that case can't happen because quirk_f0_vpd_link() does
nothing for func0 devices, so PCI_DEV_FLAGS_VPD_REF_F0 should never be
set for func0. But it seems like this might be easier to analyze as:

if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0)
pci_dev_put(vpd_dev);

Or am I missing something?

> - return pci_read_vpd(dev, off, count, buf);
> + return ret;
> }
>
> static ssize_t vpd_write(struct file *filp, struct kobject *kobj,
> @@ -284,8 +299,23 @@ static ssize_t vpd_write(struct file *filp, struct kobject *kobj,
> size_t count)
> {
> struct pci_dev *dev = to_pci_dev(kobj_to_dev(kobj));
> + struct pci_dev *vpd_dev = dev;
> + ssize_t ret;
> +
> + if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0) {
> + vpd_dev = pci_get_func0_dev(dev);
> + if (!vpd_dev)
> + return -ENODEV;
> + }
> +
> + pci_config_pm_runtime_get(vpd_dev);
> + ret = pci_write_vpd(vpd_dev, off, count, buf);
> + pci_config_pm_runtime_put(vpd_dev);
> +
> + if (dev != vpd_dev)
> + pci_dev_put(vpd_dev);
>
> - return pci_write_vpd(dev, off, count, buf);
> + return ret;
> }
> static BIN_ATTR(vpd, 0600, vpd_read, vpd_write, 0);
>
> --
> 2.40.1
>