Re: [PATCH 2/2] PM: s2idle: Fully prevent the system from entering s2idle when cpuidle isn't supported

From: Kazuki H
Date: Tue Jul 11 2023 - 14:38:54 EST


On Tue, Jul 11, 2023 at 07:55:46PM +0200, Rafael J. Wysocki wrote:
> On Tue, Jul 11, 2023 at 7:54 AM Kazuki Hashimoto <kazukih0205@xxxxxxxxx> wrote:
> >
> > In order for systems to properly enter s2idle, we need functions both in
> > the idle subsystem (such as call_cpuidle_s2idle()) and the suspend subsystem
> > to be executed.
> >
> > s2idle got blocked in the idle subsystem on platforms without cpuidle after
> > commit ef2b22ac540c ("cpuidle / sleep: Use broadcast timer for states that stop
> > local timer").
>
> What do you mean by "blocked in the idle subsystem"?
There is a check in kernel/sched/idle.c which determines whether cpuidle
is available. If it isn't, the functions necessary for s2idle (such as
call_cpuidle_s2idle()) never get executed. Here's a snippet of the code:

	if (cpuidle_not_available(drv, dev)) {
		tick_nohz_idle_stop_tick();

		default_idle_call();
		goto exit_idle;
	}

	/*
	 * Suspend-to-idle ("s2idle") is a system state in which all user space
	 * has been frozen, all I/O devices have been suspended and the only
	 * activity happens here and in interrupts (if any). In that case bypass
	 * the cpuidle governor and go straight for the deepest idle state
	 * available. Possibly also suspend the local tick and the entire
	 * timekeeping to prevent timer interrupts from kicking us out of idle
	 * until a proper wakeup interrupt happens.
	 */

	if (idle_should_enter_s2idle() || dev->forced_idle_latency_limit_ns) {
		u64 max_latency_ns;

		if (idle_should_enter_s2idle()) {

			entered_state = call_cpuidle_s2idle(drv, dev);
			if (entered_state > 0)
				goto exit_idle;

			max_latency_ns = U64_MAX;
		} else {
			max_latency_ns = dev->forced_idle_latency_limit_ns;
		}

		tick_nohz_idle_stop_tick();

		next_state = cpuidle_find_deepest_state(drv, dev, max_latency_ns);
		call_cpuidle(drv, dev, next_state);
	} else {
		bool stop_tick = true;

		/*
		 * Ask the cpuidle framework to choose a convenient idle state.
		 */
		next_state = cpuidle_select(drv, dev, &stop_tick);

		if (stop_tick || tick_nohz_tick_stopped())
			tick_nohz_idle_stop_tick();
		else
			tick_nohz_idle_retain_tick();

		entered_state = call_cpuidle(drv, dev, next_state);
		/*
		 * Give the governor an opportunity to reflect on the outcome
		 */
		cpuidle_reflect(dev, entered_state);
	}

exit_idle:
	__current_set_polling();

	/*
	 * It is up to the idle functions to reenable local interrupts
	 */
	if (WARN_ON_ONCE(irqs_disabled()))
		local_irq_enable();
>
> > However, the suspend subsystem doesn't have this, which can cause
> > the suspend subsystem to begin entering s2idle behind the idle subsystem's back,
>
> What do you mean by this?
The suspend subsystem doesn't have the check which determines whether
cpuidle is available. Therefore, the suspend subsystem can put the
system into s2idle even though the functions necessary for s2idle in the
idle subsystem haven't been executed.
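
To make the mismatch concrete, here is a toy model of the branch order in
the snippet above. It is Python, not kernel code: the function name and the
returned strings are made up for illustration, and it ignores the
forced_idle_latency_limit_ns case entirely.

```python
def idle_path(cpuidle_available: bool, s2idle_requested: bool) -> str:
    """Toy model (hypothetical helper): which branch of the quoted idle code runs.

    Mirrors only the order of the checks in the snippet; the strings name
    the kernel function that the corresponding branch would call.
    """
    if not cpuidle_available:
        # The early exit is taken before the s2idle check is ever reached.
        return "default_idle_call"
    if s2idle_requested:
        return "call_cpuidle_s2idle"
    return "cpuidle_select"
```

With cpuidle unavailable, idle_path(False, True) yields "default_idle_call":
the s2idle request made by the suspend side has no effect on which path the
idle side takes.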
>
> > which in turn can cause the system to enter s2idle even though all the functions
> > necessary for s2idle haven't been executed, breaking the system
> > (e.g. CLOCK_MONOTONIC keeps ticking during suspend even though it's not supposed
> > to).
>
> Why is this a problem?
Programs such as systemd depend on CLOCK_MONOTONIC being paused during
suspend, as outlined here:

> > It increases by the slept time (1min + some seconds required to suspend/wakeup).
> Well, it's really not supposed to. The monotonic clock (CLOCK_MONOTONIC) is supposed
> to pause while the system is suspended. If it continues running then what you are
> seeing is kinda expected, because nothing will be scheduled while the system is
> suspended.
>
> The python test I gave you is entirely independent from systemd, this means this is
> a bug within your kernel, and your kernel only. Please report this to your distro's
> kernel packaging team, there's nothing we can do about this. CLOCK_MONOTONIC is
> supposed to pause during suspend (and CLOCK_BOOTTIME is supposed to continue), and
> if this doesn't work then this is something that has to be fixed in the kernel.

> (Some pre-release kernels carried some patches that broke CLOCK_MONOTONIC and made
> it work like CLOCK_BOOTTIME. They got reverted later on, and shouldn't have reached
> anybody's systems. Otherwise what you are seeing does smell a lot like those patches.)
>
> Anyway, closing this here, as there's nothing we can do about this in systemd, and
> the bug is in your kernel.
https://github.com/systemd/systemd/issues/9538#issuecomment-405590102
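
For reference, a minimal user-space sketch of how the symptom can be
observed. This is not the test from the systemd issue; the helper names are
my own, and it assumes Linux with Python 3.7 or later (for
time.CLOCK_BOOTTIME). It samples both clocks before and after a suspend and
reports how much further CLOCK_BOOTTIME advanced than CLOCK_MONOTONIC.

```python
import time


def read_clocks():
    """Return (CLOCK_MONOTONIC, CLOCK_BOOTTIME) readings in seconds (Linux only)."""
    return (time.clock_gettime(time.CLOCK_MONOTONIC),
            time.clock_gettime(time.CLOCK_BOOTTIME))


def suspended_seconds(before, after):
    """How much further CLOCK_BOOTTIME advanced than CLOCK_MONOTONIC.

    Both arguments are (monotonic, boottime) tuples as returned by
    read_clocks(). If CLOCK_MONOTONIC pauses during suspend, the result is
    roughly the time spent suspended; if it keeps ticking, the result is
    close to zero.
    """
    mono_delta = after[0] - before[0]
    boot_delta = after[1] - before[1]
    return boot_delta - mono_delta


def main():
    """Interactive check: sample, let the user suspend and resume, sample again."""
    before = read_clocks()
    input("Suspend and resume the machine, then press Enter... ")
    after = read_clocks()
    print(f"CLOCK_MONOTONIC advanced {after[0] - before[0]:.1f} s")
    print(f"CLOCK_BOOTTIME advanced  {after[1] - before[1]:.1f} s")
    print(f"difference: {suspended_seconds(before, after):.1f} s")
```

Calling main() and suspending for about a minute should report a difference
of about a minute on a system where CLOCK_MONOTONIC pauses. A difference
near zero matches the behaviour described in this thread.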

>
> > Prevent the system from entering s2idle when cpuidle isn't supported in the
> > suspend subsystem as well.
>
> I'm sure that there's a real problem you're trying to address, but I
> cannot help you without understanding what the problem is.
>
> So please explain what exactly is going on, what is expected to happen
> and what happens instead and why this is problematic.
>
> Till then, the patches are not going anywhere.
>
> Thanks!
Sorry for the confusion, I hope this clears things up.

Thanks,
Kazuki