Re: [PATCH] sched/fair: Adjust the allowed NUMA imbalance when SD_NUMA spans multiple LLCs

From: Peter Zijlstra
Date: Mon Nov 08 2021 - 09:21:34 EST


On Mon, Nov 08, 2021 at 11:59:48AM +0000, Mel Gorman wrote:
> On Mon, Nov 08, 2021 at 12:14:27PM +0100, Peter Zijlstra wrote:
> > On Thu, Oct 28, 2021 at 02:03:05PM +0100, Mel Gorman wrote:
> >
> > > @@ -1926,8 +1926,8 @@ static void task_numa_find_cpu(struct task_numa_env *env,
> > > src_running = env->src_stats.nr_running - 1;
> > > dst_running = env->dst_stats.nr_running + 1;
> > > imbalance = max(0, dst_running - src_running);
> > > - imbalance = adjust_numa_imbalance(imbalance, dst_running,
> > > - env->dst_stats.weight);
> > > + imbalance = adjust_numa_imbalance(imbalance, env->dst_cpu,
> > > + dst_running, env->dst_stats.weight);
> >
> > Can we please align at (0 ?
> >
>
> i.e.
> imbalance = adjust_numa_imbalance(imbalance, env->dst_cpu,
> dst_running,
> env->dst_stats.weight);
>
> ?

Yep. For those using vim: :set cino=(0:0

Might as well clean that up while we touch the thing anyway.

> > >
> > > /* Use idle CPU if there is no imbalance */
> > > if (!imbalance) {
> >
> > > diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> > > index 4e8698e62f07..08fb02510967 100644
> > > --- a/kernel/sched/topology.c
> > > +++ b/kernel/sched/topology.c
> > > @@ -644,6 +644,7 @@ static void destroy_sched_domains(struct sched_domain *sd)
> > > DEFINE_PER_CPU(struct sched_domain __rcu *, sd_llc);
> > > DEFINE_PER_CPU(int, sd_llc_size);
> > > DEFINE_PER_CPU(int, sd_llc_id);
> > > +DEFINE_PER_CPU(int, sd_numaimb_shift);
> >
> > Why does it make sense for this to be a per-cpu variable? Yes, I suppose
> > people can get creative with cpusets, but what you're trying to capture
> > seems like a global system propery, no?
> >
>
> I thought things might get weird around CPU hotplug and as llc_size was
> tracked per-cpu, I thought it made sense to also do it for
> sd_numaimb_shift.

Ah, there were performance arguments for llc_id (saves a bunch of
indirections trying to look up the LLC domain) and llc_size IIRC. While
in this case, the user actually has a struct sched_domain handy.


> > I'm thinking you can perhaps use something like:
> >
> > if (!(sd->flags & SD_SHARE_PKG_RESROUCES) &&
> > (child->flags & SD_SHARE_PKG_RESOURCES)) {
> >
> > /* this is the first domain not sharing LLC */
> > sd->new_magic_imb = /* magic incantation goes here */
> > }
>
> Thanks, I'll give it a shot and see what I come up with, it'll probably
> take me a few days to clear my table of other crud to focus on it.

Sure thing.