Re: [PATCH] sched/fair: fix sgc->{min,max}_capacity miscalculate

From: Peng Liu
Date: Fri Jan 03 2020 - 09:21:59 EST


On Wed, Jan 01, 2020 at 06:55:48PM +0000, Valentin Schneider wrote:
> On 01/01/2020 14:13, Peng Liu wrote:
> >> ---
> > --------------------------------------------------------------
> >> + else
> >> + cpu_cap = rq->sd->groups->sgc->capacity;
> >
> > sgc->capacity is the *sum* of all CPU's capacity in that group, right?
>
> Right
>
> > {min,max}_capacity are per CPU variables(*part* of a group). So we can't
> > compare *part* to *sum*. Am I overlooking something? Thanks.
> >
>
> AIUI rq->sd->groups->sgc->capacity should be the capacity of the rq's CPU
> (IOW the groups here should be made of single CPUs).
>
> This should be true regardless of overlapping domains, since they sit on top
> of the regular domains. Let me paint an example with a simple 2-core SMT2
> system:
>
> MC [ ]
> SMT [ ][ ]
> 0 1 2 3
>
> cpu_rq(0)->sd will point to the sched_domain of CPU0 at SMT level (it is the
> "base domain", IOW the lowest domain in the topology hierarchy). Its groups
> will be:
>
> {0} ----> {1}
> ^ /
> `-----'
>
> and sd->groups will point at the group spanning the "local" CPU, in our case
> CPU0, and thus here will be a group containing only CPU0.
>
>
> I do not know why sched_group_capacity is used here however. As I understand
> things, we could use cpu_capacity() unconditionally.

Valentin, sorry for my tardy reply, but there are always so many things
to do.

Thanks for your patient explanation, and the picture is intuitive
and clear. Indeed, the group in lowest domain only contains one CPU, and
the only CPU in the first group should be the rq's CPU. So, I wonder if
we can do like this?

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2d170b5da0e3..c9d7613c74d2 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7793,9 +7793,6 @@ void update_group_capacity(struct sched_domain *sd, int cpu)
*/

for_each_cpu(cpu, sched_group_span(sdg)) {
- struct sched_group_capacity *sgc;
- struct rq *rq = cpu_rq(cpu);
-
/*
* build_sched_domains() -> init_sched_groups_capacity()
* gets here before we've attached the domains to the
@@ -7807,15 +7804,11 @@ void update_group_capacity(struct sched_domain *sd, int cpu)
* This avoids capacity from being 0 and
* causing divide-by-zero issues on boot.
*/
- if (unlikely(!rq->sd)) {
- capacity += capacity_of(cpu);
- } else {
- sgc = rq->sd->groups->sgc;
- capacity += sgc->capacity;
- }
+ unsigned long cpu_cap = capacity_of(cpu);

- min_capacity = min(capacity, min_capacity);
- max_capacity = max(capacity, max_capacity);
+ min_capacity = min(cpu_cap, min_capacity);
+ max_capacity = max(cpu_cap, max_capacity);
+ capacity += cpu_cap;
}
} else {
/*