Re: [RFC PATCH 3/3] sched/fair: Traverse cpufreq policies to detect capacity inversion

From: Qais Yousef
Date: Sat Dec 03 2022 - 09:33:34 EST


On 12/02/22 15:57, Vincent Guittot wrote:

> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 7c0dd57e562a..4bbbca85134b 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -8856,23 +8856,20 @@ static void update_cpu_capacity(struct sched_domain *sd, int cpu)
> > * * Thermal pressure will impact all cpus in this perf domain
> > * equally.
> > */
> > - if (sched_energy_enabled()) {
> > + if (static_branch_unlikely(&sched_asym_cpucapacity)) {
> > unsigned long inv_cap = capacity_orig - thermal_load_avg(rq);
> > - struct perf_domain *pd = rcu_dereference(rq->rd->pd);
> > + struct cpufreq_policy *policy, __maybe_unused *policy_n;
> >
> > rq->cpu_capacity_inverted = 0;
> >
> > - SCHED_WARN_ON(!rcu_read_lock_held());
> > -
> > - for (; pd; pd = pd->next) {
> > - struct cpumask *pd_span = perf_domain_span(pd);
> > + for_each_active_policy_safe(policy, policy_n) {
>
> So you are looping over all cpufreq policies (and, before this patch,
> the perf domains) in the periodic load balance. That's really not
> something we should or want to do.

Why is it not acceptable in the periodic load balance but acceptable in the hot
wakeup path in feec()? What's the difference?
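
To make the cost being discussed concrete, here is a rough userspace model of the
walk in update_cpu_capacity(). It is only a sketch: `struct domain_model` and
`capacity_inverted()` are hypothetical names, not kernel types or functions, and
the comparison inside the loop is not visible in the quoted hunk, so the
conditions below are an assumption. Only `inv_cap = capacity_orig - thermal
pressure` and the one-pass walk over every domain/policy come from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a perf domain or cpufreq policy (not a kernel type). */
struct domain_model {
	unsigned long capacity_orig;    /* nominal capacity of the CPUs in this domain */
	unsigned long thermal_pressure; /* capacity currently lost to thermal capping */
};

/*
 * Return the inverted capacity of a CPU, or 0 if it is not inverted.
 * Assumes thermal_pressure <= capacity_orig (no underflow handling).
 * Cost is one pass over all domains per call: O(nr_domains).
 */
static unsigned long capacity_inverted(unsigned long capacity_orig,
				       unsigned long thermal_pressure,
				       const struct domain_model *domains,
				       size_t nr_domains)
{
	unsigned long inv_cap = capacity_orig - thermal_pressure;
	size_t i;

	for (i = 0; i < nr_domains; i++) {
		unsigned long d_orig = domains[i].capacity_orig;

		/* Nominally bigger domains are irrelevant for inversion. */
		if (capacity_orig < d_orig)
			continue;

		if (capacity_orig == d_orig) {
			/* Same nominal capacity: compare after thermal pressure. */
			if (d_orig - domains[i].thermal_pressure > inv_cap)
				return inv_cap;
		} else if (d_orig > inv_cap) {
			/* A nominally smaller domain now offers more capacity. */
			return inv_cap;
		}
	}

	return 0;
}
```

With three domains of capacity 1024 (400 lost to thermal pressure), 768 and 512,
the big CPU ends up at 624, below the 768 of the mid domain, so it is reported as
inverted. The mid CPU is not. Either way the walk touches every domain, which is
the same shape of loop whether it runs in the load balance or in the wakeup path.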


Thanks!

--
Qais Yousef