Re: [PATCH 1/2] PM-runtime: Take suppliers into account in __pm_runtime_set_status()

From: Rafael J. Wysocki
Date: Mon Feb 11 2019 - 17:41:58 EST


On Mon, Feb 11, 2019 at 2:28 PM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
>
> On Thu, 7 Feb 2019 at 19:46, Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
> >
> > From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> >
> > If the target device has any suppliers, as reflected by device links
> > to them, __pm_runtime_set_status() does not take them into account,
> > which is not consistent with the other parts of the PM-runtime
> > framework and may lead to programming mistakes.
> >
> > Modify __pm_runtime_set_status() to take suppliers into account by
> > activating them upfront if the new status is RPM_ACTIVE and
> > deactivating them on exit if the new status is RPM_SUSPENDED.
> >
> > If the activation of one of the suppliers fails, the new status
> > will be RPM_SUSPENDED and the (remaining) suppliers will be
> > deactivated on exit (the child count of the device's parent
> > will be dropped too then).
> >
> > Of course, adding device links locking to __pm_runtime_set_status()
> > means that it cannot be run fron interrupt context, so make it use
> > spin_lock_irq() and spin_unlock_irq() instead of spin_lock_irqsave()
> > and spin_unlock_irqrestore(), respectively.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>
> Rafael, thanks for working on this!
>
> I am running some tests at my side, but still not achieving the
> behavior I expect to. Will let you know when I have more details, but
> first some comments below.
>
> > ---
> > drivers/base/power/runtime.c | 45 ++++++++++++++++++++++++++++++++++++++-----
> > 1 file changed, 40 insertions(+), 5 deletions(-)
> >
> > Index: linux-pm/drivers/base/power/runtime.c
> > ===================================================================
> > --- linux-pm.orig/drivers/base/power/runtime.c
> > +++ linux-pm/drivers/base/power/runtime.c
> > @@ -1102,20 +1102,43 @@ EXPORT_SYMBOL_GPL(pm_runtime_get_if_in_u
> > * and the device parent's counter of unsuspended children is modified to
> > * reflect the new status. If the new status is RPM_SUSPENDED, an idle
> > * notification request for the parent is submitted.
> > + *
> > + * If @dev has any suppliers (as reflected by device links to them), and @status
> > + * is RPM_ACTIVE, they will be activated upfront and if the activation of one
> > + * of them fails, the status of @dev will be changed to RPM_SUSPENDED (instead
> > + * of the @status value) and the suppliers will be deacticated on exit. The
> > + * error returned by the failing supplier activation will be returned in that
> > + * case.
> > */
> > int __pm_runtime_set_status(struct device *dev, unsigned int status)
> > {
> > struct device *parent = dev->parent;
> > - unsigned long flags;
> > bool notify_parent = false;
> > int error = 0;
> >
> > if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> > return -EINVAL;
> >
> > - spin_lock_irqsave(&dev->power.lock, flags);
> > + /*
> > + * If the new status is RPM_ACTIVE, the suppliers can be activated
> > + * upfront regardless of the current status, because next time
> > + * rpm_put_suppliers() runs, the rpm_active refcounts of the links
> > + * involved will be dropped down to one anyway.
> > + */
> > + if (status == RPM_ACTIVE) {
> > + int idx = device_links_read_lock();
> > +
> > + error = rpm_get_suppliers(dev);
> > + if (error)
> > + status = RPM_SUSPENDED;
> > +
> > + device_links_read_unlock(idx);
> > + }
>
> This doesn't look right to me, and more importantly, this isn't
> consistent with how we treat a parent/child.

It cannot be entirely consistent with that, because you cannot walk
the suppliers under the device's power.lock.

The idea here is that activating suppliers upfront if the new status
is RPM_ACTIVE shouldn't hurt regardless.

> More precisely, I think you need to check "if
> (!dev->power.runtime_error && !dev->power.disable_depth)" and also
> whether "dev->power.runtime_status == status", before deciding to call
> rpm_get_suppliers() above. Otherwise you may end up resuming suppliers
> and/or increasing the link->rpm_active count, when you shouldn't.

Resuming suppliers unnecessarily is not particularly efficient, but it
is not incorrect. Incrementing their rpm_active temporarily also
isn't incorrect as long as the rpm_active values are correct on exit
(and note that incementing them if the consumer's status is RPM_ACTIVE
doesn't even matter).

> In other words, expecting __pm_runtime_set_status() to be called in
> "balanced" manner isn't correct.

There is no such expectation here.

There is a possible race between __pm_runtime_set_status() and runtime
suspend or resume of the device in case PM-runtime is enabled for it
when __pm_runtime_set_status() is called, but it shouldn't occur if
__pm_runtime_set_status() is used correctly (that is, when PM-runtime
is disabled for the device).

I think I know how to avoid that race, though, so I'm going to post an
incremental fix if that works out.