Re: [tip: sched/core] sched/fair: Multi-LLC select_idle_sibling()

From: Chen Yu
Date: Thu Jun 01 2023 - 12:44:41 EST


On Thu, Jun 1, 2023 at 8:11 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Jun 01, 2023 at 03:03:39PM +0530, K Prateek Nayak wrote:
[...]
> > I wonder if extending SIS_UTIL for SIS_NODE would help some of these
> > cases but I've not tried tinkering with it yet. I'll continue
> > testing on other NPS modes which would decrease the search scope.
> > I'll also try running the same bunch of workloads on an even larger
> > 4th Generation EPYC server to see if the behavior there is similar.
>
> > > /*
> > > + * For the multiple-LLC per node case, make sure to try the other LLC's if the
> > > + * local LLC comes up empty.
> > > + */
> > > +static int
> > > +select_idle_node(struct task_struct *p, struct sched_domain *sd, int target)
> > > +{
> > > + struct sched_domain *parent = sd->parent;
> > > + struct sched_group *sg;
> > > +
> > > + /* Make sure to not cross nodes. */
> > > + if (!parent || parent->flags & SD_NUMA)
> > > + return -1;
> > > +
> > > + sg = parent->groups;
> > > + do {
> > > + int cpu = cpumask_first(sched_group_span(sg));
> > > + struct sched_domain *sd_child;
> > > +
> > > + sd_child = per_cpu(sd_llc, cpu);
> > > + if (sd_child != sd) {
> > > + int i = select_idle_cpu(p, sd_child, test_idle_cores(cpu), cpu);
>
> Given how SIS_UTIL is inside select_idle_cpu() it should already be
> effective here, no?
>
I'm thinking of this scenario: when the system is overloaded and SIS_NODE
is disabled, SIS_UTIL could scan, for example, 4 CPUs and then terminate,
and the wakeup lands on the local LLC. When SIS_NODE is enabled, it could
scan up to 4 * number_of_llc_domain CPUs. The more CPUs it scans, the more
likely it is to find an idle CPU.

This seems to be a question of which is preferred for a given type of
wakee: a non-idle CPU in the local LLC, or an idle CPU in a remote LLC.
It seems to depend on the working set and task duration.
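
To make the arithmetic concrete, here is a rough userspace sketch, not
kernel code. The helper name max_scanned_cpus() and its parameters are made
up for illustration, and it assumes every LLC in the node ends up with the
same SIS_UTIL scan budget, which need not hold in practice:

```c
/*
 * Hypothetical helper (not a kernel function): worst-case number of CPUs
 * scanned for one wakeup when no idle CPU is found.
 *
 * nr_scan_per_llc: assumed SIS_UTIL scan budget inside one LLC
 * nr_llc:          number of LLC domains in the node
 * sis_node:        non-zero if SIS_NODE is enabled
 */
static unsigned int max_scanned_cpus(unsigned int nr_scan_per_llc,
				     unsigned int nr_llc, int sis_node)
{
	/* SIS_NODE off: only the local LLC is scanned. */
	if (!sis_node)
		return nr_scan_per_llc;

	/*
	 * SIS_NODE on: the local LLC is scanned first, then each remote
	 * LLC in the node gets its own select_idle_cpu() pass.
	 */
	return nr_scan_per_llc * nr_llc;
}
```

For example, with a budget of 4 CPUs per LLC and 8 LLCs in the node, this
gives 4 CPUs without SIS_NODE and up to 32 with it.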


thanks,
Chenyu