Re: [RFC][PATCH 0/3] PM / sleep: Avoid resuming runtime-suspended devices during system suspend

From: Rafael J. Wysocki
Date: Tue May 13 2014 - 11:08:45 EST


On Tuesday, May 13, 2014 10:45:38 AM Alan Stern wrote:
> On Tue, 13 May 2014, Rafael J. Wysocki wrote:
>
> > Hi All,
> >
> > We've discussed that at length here:
> >
> > http://marc.info/?t=139950469000003&r=1&w=4
> >
> > but I'm starting a new thread to refresh things a bit.
> >
> > This is about adding a mechanism allowing us to avoid resuming
> > runtime-suspended devices during system suspend. It has to touch the PM
> > core, because that needs to be coordinated across the device hierarchy.
> >
> > The idea is to add a new device PM flag and to modify the PM core as follows.
> >
> > - If ->prepare() returns a positive number for a device, that means "this
> > device is runtime-suspended and you can leave it like that if you do the
> > same for all of its descendants".
> >
> > - If that happens, the PM core sets the new flag for the device in
> > question *if* the device is indeed runtime-suspended *and* *if*
> > the transition is a suspend (and not hibernation, for example).
> > Otherwise, it clears the flag for the device. All of that happens in
> > device_prepare().
> >
> > - In __device_suspend() the PM core clears the new flag for the device's
> > parent if it is clear for the device to ensure that the flag will only
> > be set for a device if it is also set for all of its descendants.
>
> There's nothing to prevent a runtime-suspended device from being
> resumed in between the ->prepare() and ->suspend() callbacks.

I'm moving the barrier from __device_suspend() to device_prepare(), so there
shouldn't be surprise resumes in that time frame.

> (Ulf mentioned this too.)

Ulf was talking about pm_wakeup_pending(), which is tangentially related.

> Therefore it makes little sense to check the device's runtime status in
> device_prepare(). The check should be done in __device_suspend().

If we do the barrier in device_prepare(), then I'm not sure what mechanism
would cause the device to resume.

If there is one, the whole approach is in danger, because ->prepare() has to
check if devices are runtime-suspended and has to be sure that their status
won't change after it has returned 1.
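
To make the ordering I have in mind concrete, here is a minimal user-space
sketch of it. It is not kernel code: struct model_dev, its fields and
model_device_prepare() are all made up for illustration, and the
resume_pending field only stands in for a resume request that
pm_runtime_barrier() would carry out before the status is checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; none of these exist in the kernel. */
enum model_transition { MODEL_SUSPEND, MODEL_HIBERNATE };

struct model_dev {
	bool runtime_suspended;	/* stands in for the runtime PM status */
	bool resume_pending;	/* a queued resume request, if any */
	bool direct_complete;	/* the proposed new flag */
	int prepare_ret;	/* what the driver's ->prepare() would return */
};

/* Models the barrier: a pending resume is carried out synchronously. */
static void model_runtime_barrier(struct model_dev *dev)
{
	if (dev->resume_pending) {
		dev->resume_pending = false;
		dev->runtime_suspended = false;
	}
}

/* Models the flag decision described for device_prepare(). */
static int model_device_prepare(struct model_dev *dev,
				enum model_transition transition)
{
	assert(dev != NULL);

	/* Barrier first, so the status checked below is already settled. */
	model_runtime_barrier(dev);

	if (dev->prepare_ret < 0) {
		dev->direct_complete = false;
		return dev->prepare_ret;
	}

	/* Set the flag only if all three conditions hold, else clear it. */
	dev->direct_complete = dev->prepare_ret > 0 &&
			       dev->runtime_suspended &&
			       transition == MODEL_SUSPEND;
	return 0;
}
```

In this model the flag can only end up set when nothing was pending at
barrier time, which is the property the real code would have to guarantee
for the rest of the transition.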

> > - PM core skips ->suspend/late/noirq and ->resume/early/noirq for all devices
> > having the flag set - so the flag can be called "direct_complete" as it
> > causes the PM core to go directly to the ->complete() callback when set.
> >
> > - The ->complete() callback has to check direct_complete if ->prepare()
> > returned a positive number previously and is responsible for further
> > handling of the device.
> >
> > That is introduced by patch [2/3].
> >
> > To simplify things slightly it is helpful to move the invocation of
> > pm_runtime_barrier() from __device_suspend() to device_prepare(), but still
> > under pm_runtime_get_noresume() beforehand (patch [1/3]).
>
> If the check is moved to __device_suspend() then the barrier can remain
> where it is now.

The check also needs to be done in ->prepare().
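
For reference, the step in __device_suspend() that the ->prepare() result
feeds into can be sketched roughly like this. Again, this is a user-space
model under my own assumptions, not the patch itself: struct model_node,
suspend_calls and model_device_suspend() are hypothetical names, and the
function has to be called for children before their parents.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device in the hierarchy. */
struct model_node {
	struct model_node *parent;	/* NULL for a root device */
	bool direct_complete;		/* flag as left by the prepare phase */
	int suspend_calls;		/* times ->suspend() would have run */
};

/*
 * Models one __device_suspend() step. Must be called children-first,
 * so that a parent sees the final state of all of its descendants.
 */
static void model_device_suspend(struct model_node *dev)
{
	assert(dev != NULL);

	if (dev->direct_complete)
		return;	/* skip ->suspend(), go straight to ->complete() */

	/* Flag is clear here, so it must be clear for the parent too. */
	if (dev->parent)
		dev->parent->direct_complete = false;

	dev->suspend_calls++;	/* ->suspend() runs as usual */
}
```

With a parent and two children where only one child has the flag clear,
the parent loses the flag and is suspended normally, while the other child
is still skipped.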

> > Patch [3/3] shows how this can be used by adding support for it to the ACPI
> > PM domain.
> >
> > Thanks!
>
> Aside from this one matter, everything seems pretty good.

Well, that's quite a big issue.

Rafael
