[PATCH v1] cpuidle: teo: Update idle duration estimate when choosing shallower state

From: Rafael J. Wysocki
Date: Thu Jul 27 2023 - 16:05:40 EST


From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>

The TEO governor takes CPU utilization into account: when the utilization
is above a certain threshold, it refines idle state selection by choosing
an idle state shallower than the previously selected one.

However, when this is done, the idle duration estimate needs to be updated
as well. Otherwise, the scheduler tick may be stopped while the candidate
idle state is shallow, which may lead to excessive energy usage if the CPU
is not interrupted quickly enough going forward. Moreover, if the scheduler
tick has been stopped already and the new idle duration estimate is too
small, the replacement candidate state cannot be used.

Modify the relevant code to take the above observations into account.
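
The rule described above can be sketched in isolation as follows. This is
a minimal userspace model, not the kernel code: sketch_time_ok() assumes
that the check behind teo_time_ok() compares the interval against one tick
period, and SKETCH_TICK_NSEC, struct sketch_choice and sketch_refine() are
hypothetical names introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for TICK_NSEC, assuming HZ=250 (4 ms tick period). */
#define SKETCH_TICK_NSEC 4000000LL

/* Hypothetical model of one governor decision; not a kernel structure. */
struct sketch_choice {
	int idx;		/* candidate idle state index */
	int64_t duration_ns;	/* idle duration estimate */
};

/*
 * Assumed semantics of the time check: any interval is acceptable while
 * the tick is still running, but once the tick has been stopped the
 * interval must cover at least one tick period.
 */
static bool sketch_time_ok(bool tick_stopped, int64_t interval_ns)
{
	return !tick_stopped || interval_ns >= SKETCH_TICK_NSEC;
}

/*
 * Switch to the shallower candidate only if its representative idle span
 * passes the time check, and update the duration estimate together with
 * the state index so that the two stay consistent.
 */
static struct sketch_choice sketch_refine(struct sketch_choice cur,
					  int shallower_idx, int64_t span_ns,
					  bool tick_stopped)
{
	if (sketch_time_ok(tick_stopped, span_ns)) {
		cur.idx = shallower_idx;
		cur.duration_ns = span_ns;
	}
	return cur;
}
```

In this model, a 50 us span replaces both the candidate and the estimate
while the tick is running, whereas with the tick already stopped the
original candidate and estimate are left in place.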

Fixes: 9ce0f7c4bc64 ("cpuidle: teo: Introduce util-awareness")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
---

@Peter: This doesn't attempt to fix the tick stopping problem, it just makes
the current behavior consistent.

@Anna-Maria: This is likely to prevent the tick from being stopped at all
if the CPU utilization is above a certain threshold. I'm wondering whether
your results will be affected by it and in what way.

---
drivers/cpuidle/governors/teo.c | 33 ++++++++++++++++++++++++++-------
1 file changed, 26 insertions(+), 7 deletions(-)

Index: linux-pm/drivers/cpuidle/governors/teo.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/governors/teo.c
+++ linux-pm/drivers/cpuidle/governors/teo.c
@@ -397,13 +397,22 @@ static int teo_select(struct cpuidle_dri
 	 * the shallowest non-polling state and exit.
 	 */
 	if (drv->state_count < 3 && cpu_data->utilized) {
-		for (i = 0; i < drv->state_count; ++i) {
-			if (!dev->states_usage[i].disable &&
-			    !(drv->states[i].flags & CPUIDLE_FLAG_POLLING)) {
-				idx = i;
+		/*
+		 * If state 0 is enabled and it is not a polling one, select it
+		 * right away and update the idle duration estimate accordingly,
+		 * unless the scheduler tick has been stopped.
+		 */
+		if (!idx && !(drv->states[0].flags & CPUIDLE_FLAG_POLLING)) {
+			s64 span_ns = teo_middle_of_bin(0, drv);
+
+			if (teo_time_ok(span_ns)) {
+				duration_ns = span_ns;
 				goto end;
 			}
 		}
+		/* Assume that state 1 is not a polling one and select it. */
+		idx = 1;
+		goto end;
 	}

 	/*
@@ -539,10 +548,20 @@ static int teo_select(struct cpuidle_dri

 	/*
 	 * If the CPU is being utilized over the threshold, choose a shallower
-	 * non-polling state to improve latency
+	 * non-polling state to improve latency, unless the scheduler tick has
+	 * been stopped already and the shallower state's target residency is
+	 * not sufficiently large.
 	 */
-	if (cpu_data->utilized)
-		idx = teo_find_shallower_state(drv, dev, idx, duration_ns, true);
+	if (cpu_data->utilized) {
+		s64 span_ns;
+
+		i = teo_find_shallower_state(drv, dev, idx, duration_ns, true);
+		span_ns = teo_middle_of_bin(i, drv);
+		if (teo_time_ok(span_ns)) {
+			idx = i;
+			duration_ns = span_ns;
+		}
+	}

 end:
 	/*