Re: [PATCH 3/4] sched/schedutil: Ignore update requests for short running tasks

From: Hongyan Xia
Date: Fri Dec 08 2023 - 05:42:54 EST


Hi Qais,

On 08/12/2023 01:52, Qais Yousef wrote:
> Ignore freq updates to honour uclamp requests if the task is short
> running. It won't run long enough to see the changes, so avoid the
> unnecessary work and noise.
>
> Make sure SCHED_CPUFREQ_PERF_HINTS flag is set in task_tick_fair() so
> that we can take corrective action if the task continued to run such that
> it is no longer considered a short task.
>
> Should address the problem of noisy short running tasks unnecessarily
> causing frequency spikes when waking up on a CPU that is running a busy
> task capped by UCLAMP_MAX.

Actually, an occasional spike is not a big problem to me.

What is a big concern is a normal task and a uclamp_max task running on the same rq. If the uclamp_max task has a util_avg of 1024 but is capped by uclamp_max at the lowest OPP, and the normal task has no uclamp but a duty cycle, then whenever the normal task wakes up on the rq, the rq goes to the highest OPP. When it sleeps, the uclamp_max cap applies again and the rq drops back to the lowest OPP. To me, this square-wave problem is a much bigger concern than an infrequent spike. If CONFIG_HZ is 1000, this square wave's frequency is 500 Hz, switching between the highest and lowest OPP, which is definitely unacceptable.
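To make the square wave concrete, here is a toy model in plain C. It is not kernel code: struct toy_task, toy_rq_freq_request() and toy_count_transitions() are made-up names, the "lowest OPP" cap is modelled as uclamp_max = 0, the normal task's util of 100 is an arbitrary choice, and the normal task is assumed to be runnable on every other tick. It only illustrates max-aggregation of uclamp_max across the runnable tasks on one rq.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_CAPACITY_SCALE 1024u

/* Hypothetical per-task state; not the kernel's struct task_struct. */
struct toy_task {
	unsigned int util;       /* util_avg-like value, 0..1024 */
	unsigned int uclamp_max; /* 1024 means "not capped" */
	bool runnable;
};

/*
 * Sum the utilization of the runnable tasks, then clamp it with the
 * largest uclamp_max among them (max-aggregation at rq level).
 */
static unsigned int toy_rq_freq_request(const struct toy_task *tasks, size_t n)
{
	unsigned int util_sum = 0, cap = 0;
	bool any = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!tasks[i].runnable)
			continue;
		any = true;
		util_sum += tasks[i].util;
		if (tasks[i].uclamp_max > cap)
			cap = tasks[i].uclamp_max;
	}
	if (!any)
		return 0;
	if (util_sum > TOY_CAPACITY_SCALE)
		util_sum = TOY_CAPACITY_SCALE;
	return util_sum < cap ? util_sum : cap;
}

/*
 * Count how many times the request changes while the normal task
 * alternates between runnable and sleeping on every tick.
 */
static unsigned int toy_count_transitions(unsigned int ticks)
{
	struct toy_task tasks[2] = {
		/* always-running task, capped to the lowest OPP */
		{ .util = 1024, .uclamp_max = 0, .runnable = true },
		/* normal task with a duty cycle and no uclamp */
		{ .util = 100, .uclamp_max = TOY_CAPACITY_SCALE, .runnable = false },
	};
	unsigned int prev = toy_rq_freq_request(tasks, 2);
	unsigned int transitions = 0, t;

	for (t = 0; t < ticks; t++) {
		unsigned int cur;

		tasks[1].runnable = (t % 2 == 0);
		cur = toy_rq_freq_request(tasks, 2);
		if (cur != prev)
			transitions++;
		prev = cur;
	}
	return transitions;
}
```

In this model the request flips between 0 and 1024 on every tick, so 1000 ticks (one second at CONFIG_HZ=1000) give 1000 transitions, i.e. 500 full low/high cycles.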

I think the problem with filtering is this: under this condition, should we filter out the lowest OPP or the highest? Neither sounds like a good answer, because neither task is short-running and the correct answer might be somewhere in between.

Sorry to ramble on about this again and again, but I think filtering is addressing the symptom, not the cause. The cause is that we have no idea under what conditions a util_avg was achieved. The 1024 task in the previous example would be much better off if we extended its util into

[1024, achieved at uclamp_min 0, achieved at uclamp_max 300]

If we know 1024 was achieved under a uclamp_max of 300, then we know we don't need to raise to the max OPP. So far, we carry around a lot of different new variables, but not these two, which are the ones we really need.
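As a rough illustration of the triple above, here is one possible way such a record could be consumed. This is only a sketch under my own assumptions, not an implementation from any posted patch: struct toy_util_sample, toy_task_demand() and toy_rq_demand() are hypothetical names, and clamping each task by the clamps it was measured under and then summing is just one interpretation of the idea.

```c
#include <stddef.h>

#define TOY_CAPACITY_SCALE 1024u

/* Hypothetical record: a util_avg plus the clamps it was achieved under. */
struct toy_util_sample {
	unsigned int util_avg;
	unsigned int at_uclamp_min;
	unsigned int at_uclamp_max;
};

/*
 * A util_avg of 1024 measured under a uclamp_max of 300 only shows that
 * the task kept a capped CPU busy, so do not ask for more than 300 on
 * its behalf. Likewise, never ask for less than its uclamp_min.
 */
static unsigned int toy_task_demand(const struct toy_util_sample *s)
{
	unsigned int u = s->util_avg;

	if (u > s->at_uclamp_max)
		u = s->at_uclamp_max;
	if (u < s->at_uclamp_min)
		u = s->at_uclamp_min;
	return u;
}

/* Sum the per-task demands on one rq, saturating at full capacity. */
static unsigned int toy_rq_demand(const struct toy_util_sample *s, size_t n)
{
	unsigned int sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += toy_task_demand(&s[i]);
	return sum > TOY_CAPACITY_SCALE ? TOY_CAPACITY_SCALE : sum;
}
```

With the capped task recorded as [1024, 0, 300] and a normal task of util 100 recorded as [100, 0, 1024], this sketch asks for 300 when only the capped task is runnable and 400 when both are, rather than swinging between the cap and the maximum.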


> Move helper functions to access task_util_est() and related attributes
> to sched.h to enable using it from cpufreq_schedutil.c
>
> Signed-off-by: Qais Yousef (Google) <qyousef@xxxxxxxxxxx>
> ---
>  kernel/sched/cpufreq_schedutil.c | 59 ++++++++++++++++++++++++++++++++
>  kernel/sched/fair.c              | 24 +------------
>  kernel/sched/sched.h             | 22 ++++++++++++
>  3 files changed, 82 insertions(+), 23 deletions(-)

[...]