Re: [PATCH] PCI: Add sysfs attribute for PCI device power state

From: Bjorn Helgaas
Date: Sun Nov 15 2020 - 15:27:24 EST


On Sun, Nov 15, 2020 at 01:45:18PM +0100, Maximilian Luz wrote:
> On 11/15/20 7:08 AM, Krzysztof Wilczyński wrote:
> > On 20-11-02 15:15:20, Maximilian Luz wrote:
> > > While most PCI power-states can be queried from user-space via lspci,
> > > this has some limits. Specifically, lspci fails to provide an accurate
> > > value when the device is in D3cold as it has to resume the device before
> > > it can access its power state via the configuration space, leading to it
> > > reporting D0 or another on-state. Thus lspci can, for example, not be
> > > used to diagnose power-consumption issues for devices that can enter
> > > D3cold or to ensure that devices properly enter D3cold at all.
> > >
> > > To alleviate this issue, introduce a new sysfs device attribute for the
> > > PCI power state, showing the current power state as seen by the kernel.
> >
> > Very nice! Thank you for adding this.
> >
> > [...]
> > > +/* PCI power state */
> > > +static ssize_t power_state_show(struct device *dev,
> > > + struct device_attribute *attr, char *buf)
> > > +{
> > > + struct pci_dev *pci_dev = to_pci_dev(dev);
> > > + pci_power_t state = READ_ONCE(pci_dev->current_state);
> > > +
> > > + return sprintf(buf, "%s\n", pci_power_name(state));
> > > +}
> > > +static DEVICE_ATTR_RO(power_state);
> > [...]
> >
> > Curious, why did you decide to use the READ_ONCE() macro here? Some
> > other drivers exposing data through sysfs use, but certainly not all.
>
> As far as I can tell current_state is normally guarded by the device
> lock, but here we don't hold that lock. I generally prefer to be
> explicit about those kinds of access, if only to document that the value
> can change here.
>
> In this case it should work fine without it, but this also has the
> benefit that if someone were to add a change like
>
> if (state > x)
> state = y;
>
> later on (here or even in pci_power_name() due to inlining), things
> wouldn't break as we explicitly forbid the compiler to load
> current_state more than once. Without the READ_ONCE, the compiler would
> be theoretically allowed to do two separate reads then (although
> arguably unlikely it would end up doing that).
>
> Also there's no downside of marking it as READ_ONCE here as far as I can
> tell, as in that context the value will always have to be loaded from
> memory.

I think something read from sysfs is a snapshot with no guarantee
about how long it will remain valid, so I don't see a problem with the
value being stale by the time userspace consumes it.

If there's a downside to doing two separate reads, we could mention
that in the commit log or a comment.

If there's not a specific reason for using READ_ONCE(), I think we
should omit it because using it in one place but not others suggests
that there's something special about this place.

Bjorn