Re: [PATCHv4 11/12] sched/core: Disable SD_ASYM_CPUCAPACITY for root_domains without asymmetry

From: Morten Rasmussen
Date: Thu Jul 05 2018 - 10:13:58 EST


On Thu, Jul 05, 2018 at 02:31:43PM +0100, Quentin Perret wrote:
> Hi Morten,
>
> On Wednesday 04 Jul 2018 at 11:17:49 (+0100), Morten Rasmussen wrote:
> > diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> > index 71330e0e41db..29c186961345 100644
> > --- a/kernel/sched/topology.c
> > +++ b/kernel/sched/topology.c
> > @@ -1160,6 +1160,26 @@ sd_init(struct sched_domain_topology_level *tl,
> > sd_id = cpumask_first(sched_domain_span(sd));
> >
> > /*
> > + * Check if cpu_map eclipses cpu capacity asymmetry.
> > + */
> > +
> > + if (sd->flags & SD_ASYM_CPUCAPACITY) {
> > + int i;
> > + bool disable = true;
> > + long capacity = arch_scale_cpu_capacity(NULL, sd_id);
> > +
> > + for_each_cpu(i, sched_domain_span(sd)) {
> > + if (capacity != arch_scale_cpu_capacity(NULL, i)) {
> > + disable = false;
> > + break;
> > + }
> > + }
> > +
> > + if (disable)
> > + sd->flags &= ~SD_ASYM_CPUCAPACITY;
> > + }
> > +
> > + /*
> > * Convert topological properties into behaviour.
> > */
>
> If SD_ASYM_CPUCAPACITY means that some CPUs have different
> arch_scale_cpu_capacity() values, we could also automatically _set_
> the flag in sd_init() no ? Why should we let the arch set it and just
> correct it later ?
>
> I understand the moment at which we know the capacities of CPUs varies
> from arch to arch, but the arch code could just call
> rebuild_sched_domain when the capacities of CPUs change and let the
> scheduler detect things automatically. I mean, even if the arch code
> sets the flag in its topology level table, it will have to rebuild
> the sched domains anyway ...
>
> What do you think ?

We could as well set the flag here so the architecture doesn't have to
do it. It is a bit more complicated though for few reasons:

1. Detecting when to disable the flag is a lot simpler than checking
which level is should be set on. You basically have to work you way up
from the lowest topology level until you get to a level spanning all the
capacities available in the system to figure out where the flag should
be set. I don't think this fits easily with how we build the
sched_domain hierarchy. It can of course be done.

2. As you say, we still need the arch code (or cpufreq?) to rebuild the
whole thing once we know that the capacities have been determined. That
currently implies implementing arch_update_cpu_topology() which is
arch-specific. So we would need some arch code to make rebuild happen at
the right point in time. If the rebuild should be triggering the rebuild
we need another way to force a full rebuild. This can also be done.

3. Detecting the flag in generic kernel/sched/* code means that all
architectures will pay the for the overhead when building/rebuilding the
sched_domain hierarchy, and all architectures that sets the cpu
capacities to asymmetric will set the flag whether they like it or not.
I'm not sure if this is a problem.

In the end it is really about how much of this we want in generic code
and how much we hide in arch/, and if we dare to touch the sched_domain
build code ;-)

Morten