Re: [PATCH v1] cpuidle: teo: Update idle duration estimate when choosing shallower state

From: Rafael J. Wysocki
Date: Thu Jul 27 2023 - 16:13:13 EST


On Thu, Jul 27, 2023 at 10:05 PM Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
>
> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>
> The TEO governor takes CPU utilization into account by refining idle state
> selection when the utilization is above a certain threshold: it then
> chooses an idle state shallower than the previously selected one.
>
> However, when this is done, the idle duration estimate needs to be updated
> so as to prevent the scheduler tick from being stopped while the candidate
> idle state is shallow, which may lead to excessive energy usage if the CPU
> is not interrupted quickly enough going forward. Moreover, in case the
> scheduler tick has been stopped already and the new idle duration estimate
> is too small, the replacement candidate state cannot be used.
>
> Modify the relevant code to take the above observations into account.
>
> Fixes: 9ce0f7c4bc64 ("cpuidle: teo: Introduce util-awareness")
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> ---
>
> @Peter: This doesn't attempt to fix the tick stopping problem, it just makes
> the current behavior consistent.
>
> @Anna-Maria: This is likely to basically prevent the tick from being stopped
> at all if the CPU utilization is above a certain threshold. I'm wondering if
> your results will be affected by it and in what way.
>
> ---
> drivers/cpuidle/governors/teo.c | 33 ++++++++++++++++++++++++++-------
> 1 file changed, 26 insertions(+), 7 deletions(-)
>
> Index: linux-pm/drivers/cpuidle/governors/teo.c
> ===================================================================
> --- linux-pm.orig/drivers/cpuidle/governors/teo.c
> +++ linux-pm/drivers/cpuidle/governors/teo.c
> @@ -397,13 +397,22 @@ static int teo_select(struct cpuidle_dri
>  	 * the shallowest non-polling state and exit.
>  	 */
>  	if (drv->state_count < 3 && cpu_data->utilized) {
> -		for (i = 0; i < drv->state_count; ++i) {
> -			if (!dev->states_usage[i].disable &&
> -			    !(drv->states[i].flags & CPUIDLE_FLAG_POLLING)) {
> -				idx = i;
> +		/*
> +		 * If state 0 is enabled and it is not a polling one, select it
> +		 * right away and update the idle duration estimate accordingly,
> +		 * unless the scheduler tick has been stopped.
> +		 */
> +		if (!idx && !(drv->states[0].flags & CPUIDLE_FLAG_POLLING)) {
> +			s64 span_ns = teo_middle_of_bin(0, drv);
> +
> +			if (teo_time_ok(span_ns)) {
> +				duration_ns = span_ns;
>  				goto end;
>  			}
>  		}
> +		/* Assume that state 1 is not a polling one and select it. */

Well, I should also check that state 1 is not disabled. Will send a v2 tomorrow.

> +		idx = 1;
> +		goto end;
>  	}
>
>  	/*
> @@ -539,10 +548,20 @@ static int teo_select(struct cpuidle_dri
>
>  	/*
>  	 * If the CPU is being utilized over the threshold, choose a shallower
> -	 * non-polling state to improve latency
> +	 * non-polling state to improve latency, unless the scheduler tick has
> +	 * been stopped already and the shallower state's target residency is
> +	 * not sufficiently large.
>  	 */
> -	if (cpu_data->utilized)
> -		idx = teo_find_shallower_state(drv, dev, idx, duration_ns, true);
> +	if (cpu_data->utilized) {
> +		s64 span_ns;
> +
> +		i = teo_find_shallower_state(drv, dev, idx, duration_ns, true);
> +		span_ns = teo_middle_of_bin(i, drv);
> +		if (teo_time_ok(span_ns)) {
> +			idx = i;
> +			duration_ns = span_ns;
> +		}
> +	}
>
>  end:
>  	/*
>
>
>
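
For illustration only, the rule from the changelog (take the shallower state and shrink the idle duration estimate, unless the tick has already been stopped and the new estimate is too small) can be modeled in user space roughly as below. This is a sketch, not the kernel code: the model_* names, struct model_state and the MODEL_TICK_NS value are all hypothetical, and the "time OK" check is assumed to compare the estimate against one tick period when the tick is stopped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tick period for the model: 4 ms, i.e. HZ=250 (an assumption). */
#define MODEL_TICK_NS 4000000LL

/* Hypothetical, minimal stand-in for an idle state description. */
struct model_state {
	int64_t target_residency_ns;
};

/* Midpoint between the target residencies of states idx and idx + 1. */
static int64_t model_middle_of_bin(const struct model_state *states, int idx)
{
	return (states[idx].target_residency_ns +
		states[idx + 1].target_residency_ns) / 2;
}

/*
 * Assumed check: any estimate is acceptable while the tick is running, but
 * once the tick has been stopped the estimate must cover a full tick period.
 */
static bool model_time_ok(int64_t interval_ns, bool tick_stopped)
{
	return !tick_stopped || interval_ns >= MODEL_TICK_NS;
}

/*
 * Try to replace candidate state idx with the state index "shallower".
 * On success the duration estimate is replaced by the shallower state's bin
 * midpoint; otherwise both the candidate and the estimate are left alone.
 * The caller must guarantee that shallower + 1 is a valid index.
 */
static int model_refine(const struct model_state *states, int nstates,
			int idx, int shallower, bool tick_stopped,
			int64_t *duration_ns)
{
	int64_t span_ns;

	assert(shallower >= 0 && shallower + 1 < nstates);

	span_ns = model_middle_of_bin(states, shallower);
	if (model_time_ok(span_ns, tick_stopped)) {
		*duration_ns = span_ns;
		return shallower;
	}
	return idx;
}
```

With target residencies of 0, 2 ms and 10 ms, replacing state 2 with state 0 gives a 1 ms estimate. The model accepts it while the tick is running and rejects it once the tick has been stopped, because 1 ms is shorter than the assumed tick period.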