[PATCH v2 0/6] cpufreq: schedutil: fixes for flags updates

From: Patrick Bellasi
Date: Tue Jul 04 2017 - 13:34:52 EST


Each time the scheduler issues a CPU utilisation update, it passes a flag which
mainly identifies the scheduling class asking for the update. The frequency
selection policy uses this flag to help select the most appropriate OPP.

In the current implementation, CPU flags are overridden each time the scheduler
calls schedutil for an update. Such a behavior seems to be sub-optimal,
especially on systems where frequency domains span across multiple CPUs.

Indeed, assuming CPU1 and CPU2 share the same frequency domain, there can be
the following issues:

A) Small FAIR task running at MAX OPP.
An RT task, which just executed on CPU1, can keep the domain at the
max frequency for a prolonged period of time after its completion,
even if there are no longer RT tasks running on CPUs of its domain.

B) FAIR wakeup reducing the OPP of the current RT task.
A FAIR task enqueued on a CPU where an RT task is running overrides the flag
configured by the RT task, thus potentially causing an unwanted frequency
drop.

C) RT wakeup not running at max OPP.
An RT task waking up on a CPU which has recently updated its OPP can
be forced to run at a lower frequency because of the throttling
enforced by schedutil, even if there are no OPP transitions
currently in progress.
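
Issues A and B can be reproduced with a small user-space model. The sketch
below is not the schedutil code: FLAG_RT, struct domain and all the function
names are hypothetical stand-ins, and the two "fix" helpers only illustrate
the general direction (keep the RT flag while an RT task is running, clear the
flags when a CPU enters idle) under those assumptions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the scheduler's update flags. */
#define FLAG_NONE 0u
#define FLAG_RT   (1u << 0)

#define NR_CPUS 2

/* Model-only state for the CPUs sharing one frequency domain. */
struct domain {
	unsigned int flags[NR_CPUS];	/* last flags reported by each CPU */
	bool rt_running[NR_CPUS];	/* is an RT task running on the CPU? */
};

/* Behavior described above: every update overwrites the CPU flags. */
void update_overwrite(struct domain *d, int cpu, unsigned int flags)
{
	d->flags[cpu] = flags;
}

/* Assumed direction for issue B: a FAIR update cannot drop the RT flag
 * while an RT task is still running on that CPU. */
void update_keep_rt(struct domain *d, int cpu, unsigned int flags)
{
	if (d->rt_running[cpu])
		flags |= FLAG_RT;
	d->flags[cpu] = flags;
}

/* Assumed direction for issue A: a CPU entering idle clears its flags,
 * so a completed RT task leaves nothing stale behind. */
void enter_idle(struct domain *d, int cpu)
{
	d->rt_running[cpu] = false;
	d->flags[cpu] = FLAG_NONE;
}

/* The domain is held at max OPP if any of its CPUs advertises FLAG_RT. */
bool domain_wants_max(const struct domain *d)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (d->flags[cpu] & FLAG_RT)
			return true;
	return false;
}
```

In this model, an RT update that is never followed by another update from the
same CPU keeps domain_wants_max() true until enter_idle() is called (issue A),
while update_overwrite() with FLAG_NONE on a CPU still running an RT task
makes it false too early (issue B).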

.:: Patches organization
========================

This series proposes a set of fixes for the aforementioned issues and it's an
update addressing all the main comments collected from the previous posting
[1].

Patches have been re-ordered to have the "less controversial" bits at the
beginning and also to better match the order of the three main issues described
above. These are the related patches:

A) Fix small FAIR task running at MAX OPP:
cpufreq: schedutil: ignore sugov kthreads
cpufreq: schedutil: reset sg_cpus's flags at IDLE enter

B) Fix FAIR wakeup reducing the OPP of the current RT task:
cpufreq: schedutil: ensure max frequency while running RT/DL tasks

C) Fix RT wakeup not running at max OPP:
sched/rt: fast switch to maximum frequency when RT tasks are scheduled
cpufreq: schedutil: relax rate-limiting while running RT/DL tasks
cpufreq: schedutil: update CFS util only if used
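
The rate-limiting relaxation targeted by group C can be pictured with the
following user-space sketch. It is a simplified model, not the code of the
patches: struct policy_state, FLAG_RT and should_update_freq() are
hypothetical names, and the rule "an RT request bypasses the rate limit unless
an OPP transition is still in progress" is one reading of issue C above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the RT update flag. */
#define FLAG_RT (1u << 0)

/* Model-only view of the per-policy throttling state. */
struct policy_state {
	uint64_t last_update_ns;	/* time of the last frequency change */
	uint64_t rate_limit_ns;		/* minimum interval between changes */
	bool work_in_progress;		/* an OPP transition is still pending */
};

/* Decide whether a new frequency request may be issued at now_ns. */
bool should_update_freq(const struct policy_state *p, uint64_t now_ns,
			unsigned int flags)
{
	/* Never stack a request on top of a transition still in flight. */
	if (p->work_in_progress)
		return false;

	/* Assumed relaxation: an RT request is not throttled. */
	if (flags & FLAG_RT)
		return true;

	/* Everything else waits for the rate limit to expire. */
	return now_ns - p->last_update_ns >= p->rate_limit_ns;
}
```

With last_update_ns = 1000 and rate_limit_ns = 500, a FAIR request at 1200 is
throttled in this model while an RT request at the same time is let through,
unless work_in_progress is set.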

.:: Experimental Results
========================

The misbehaviors have been verified using a set of simple rt-app based
synthetic workloads, running on an ARM Juno R2 board where the CPUs of the big
cluster (CPU1 and CPU2) have been reserved to run the workload tasks in
isolation from other system tasks.

A detailed description of the experiments executed, and the corresponding
collected results, is available online [2].

Short highlights for these experiments are:

- Patches in group A reduce energy consumption by ~50% by ensuring that
a small task is always running at the minimum OPP even when the
sugov's RT kthread is used to change frequencies in the same cluster.

- Patches in group B increase the chances for an RT task to complete
its activations while running at the max OPP from 4% to 98%.

- Patches in group C do not show measurable differences, mainly because of the
slow OPP switching support available on the Juno board used for testing.
However, trace inspection shows that the sequence of traced events is much
more deterministic and better matches the expected system behavior.
For example, as soon as an RT task wakes up, the scheduler asks for an OPP
switch to the max frequency.

Cheers, Patrick

.:: References
==============

[1] https://lkml.org/lkml/2017/3/2/385
[2] https://gist.github.com/derkling/0cd7210e4fa6f2ec3558073006e5ad70


Patrick Bellasi (6):
cpufreq: schedutil: ignore sugov kthreads
cpufreq: schedutil: reset sg_cpus's flags at IDLE enter
cpufreq: schedutil: ensure max frequency while running RT/DL tasks
cpufreq: schedutil: update CFS util only if used
sched/rt: fast switch to maximum frequency when RT tasks are scheduled
cpufreq: schedutil: relax rate-limiting while running RT/DL tasks

include/linux/sched/cpufreq.h | 1 +
kernel/sched/cpufreq_schedutil.c | 61 ++++++++++++++++++++++++++++++++--------
kernel/sched/idle_task.c | 4 +++
kernel/sched/rt.c | 15 ++++++++--
4 files changed, 67 insertions(+), 14 deletions(-)

--
2.7.4