Re: Stopping the tick on a fully loaded system

From: Peter Zijlstra
Date: Wed Jul 26 2023 - 16:10:55 EST


On Wed, Jul 26, 2023 at 08:30:01PM +0200, Rafael J. Wysocki wrote:

> > - The governors teo and menu do the tick_nohz_next_event() check even if
> > the CPU is fully loaded, but the check is not free.
>
> Let me have a look at teo in that respect.
>
> The problem is when tick_nohz_get_sleep_length() should not be called.
> The easy case is when the governor would select the shallowest idle
> state without taking it into account, but what about the deeper ones?
> I guess this depends on the exit latency of the current candidate idle
> state, but what exit latency would be low enough? I guess 2 us would
> be fine, but what about 10 us, or even 20 us for that matter?

The patch I sent here:

https://lkml.kernel.org/r/20230726164958.GV38236@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

(which was stuck in a mailqueue :/) tries to address that.

Additionally, I think we can do something like this on top of all that:
stop going deeper once 66% of wakeups are at or below the current state.


--- a/drivers/cpuidle/governors/teo.c
+++ b/drivers/cpuidle/governors/teo.c
@@ -362,6 +362,7 @@ static int teo_select(struct cpuidle_dri
 	unsigned int idx_hit_sum = 0;
 	unsigned int hit_sum = 0;
 	unsigned int tick_sum = 0;
+	unsigned int thresh_sum = 0;
 	int constraint_idx = 0;
 	int idx0 = 0, idx = -1;
 	bool alt_intercepts, alt_recent;
@@ -396,6 +397,8 @@ static int teo_select(struct cpuidle_dri
 	duration_ns = tick_nohz_get_sleep_length(&delta_tick);
 	cpu_data->sleep_length_ns = duration_ns;

+	thresh_sum = 2 * cpu_data->total / 3; /* 66% */
+
 	/*
 	 * Find the deepest idle state whose target residency does not exceed
 	 * the current sleep length and the deepest idle state not deeper than
@@ -426,6 +429,9 @@ static int teo_select(struct cpuidle_dri
 		if (s->target_residency_ns > duration_ns)
 			break;

+		if (intercept_sum + hit_sum > thresh_sum)
+			break;
+
 		idx = i;

 		if (s->exit_latency_ns <= latency_req)
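
For anyone who wants to poke at the cutoff outside the kernel, here is a
rough standalone sketch of just the threshold walk. It is not teo itself:
struct demo_bin and pick_state() are made-up names, and the target
residency, latency constraint and disabled-state checks that the real
teo_select() performs are left out on purpose.

```c
#include <assert.h>

/* Hypothetical per-state bin, loosely modelled on teo's hits/intercepts. */
struct demo_bin {
	unsigned int hits;
	unsigned int intercepts;
};

/*
 * Walk the states from shallow to deep and stop once the wakeups
 * accumulated in the bins shallower than state i exceed 2/3 of 'total'.
 * Returns the index of the deepest state reached before the cutoff.
 */
static int pick_state(const struct demo_bin *bins, int nr, unsigned int total)
{
	unsigned int thresh_sum = 2 * total / 3; /* 66%, integer division */
	unsigned int sum = 0;
	int idx = 0;
	int i;

	assert(nr >= 1);

	for (i = 1; i < nr; i++) {
		/* Same accumulation order as the loop in the patch. */
		sum += bins[i - 1].hits + bins[i - 1].intercepts;

		if (sum > thresh_sum)
			break;

		idx = i;
	}

	return idx;
}
```

With bins of { 10, 10 }, { 20, 10 }, { 30, 0 }, { 20, 0 } and a total of
100, thresh_sum is 66 and the walk stops at index 2, because by then 80
wakeups sit in the shallower bins. If the first bin alone holds 70 of
the 100, the walk never leaves index 0.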