Re: [PATCH v8 2/2] sched/fair: Scan cluster before scanning LLC in wake-up path

From: Chen Yu
Date: Mon Jun 12 2023 - 01:22:45 EST


On 2023-06-12 at 10:31:39 +0530, Gautham R. Shenoy wrote:
> Hello Yicong,
>
>
> On Tue, May 30, 2023 at 03:02:53PM +0800, Yicong Yang wrote:
> > From: Barry Song <song.bao.hua@xxxxxxxxxxxxx>
> [..snip..]
>
> > @@ -7103,7 +7127,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> > bool has_idle_core = false;
> > struct sched_domain *sd;
> > unsigned long task_util, util_min, util_max;
> > - int i, recent_used_cpu;
> > + int i, recent_used_cpu, prev_aff = -1;
> >
> > /*
> > * On asymmetric system, update task utilization because we will check
> > @@ -7130,8 +7154,11 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> > */
> > if (prev != target && cpus_share_cache(prev, target) &&
> > (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
> > - asym_fits_cpu(task_util, util_min, util_max, prev))
> > - return prev;
> > + asym_fits_cpu(task_util, util_min, util_max, prev)) {
> > + if (cpus_share_lowest_cache(prev, target))
>
> For platforms without the cluster domain, the cpus_share_lowest_cache
> check is a repetition of the cpus_share_cache(prev, target) check. Can
> we avoid this using a static branch check for cluster ?
>
>
Sounds good.
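
A rough userspace sketch of what such a guard could look like, only to
illustrate the idea. This is not kernel code: the plain bool
`cluster_active` is a hypothetical stand-in for a static key, the
topology tables are made up, and `can_return_prev_early()` is an
illustrative helper that assumes prev has already passed the idle and
asym_fits_cpu() checks.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8

/* Made-up topology: one LLC, two clusters of four CPUs each. */
static const int llc_id[NR_CPUS]     = { 0, 0, 0, 0, 0, 0, 0, 0 };
static const int cluster_id[NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/*
 * Hypothetical stand-in for a static key. In the kernel this would be
 * a static branch enabled only when a cluster domain exists.
 */
static bool cluster_active;

static bool cpus_share_cache(int a, int b)
{
	assert(a >= 0 && a < NR_CPUS && b >= 0 && b < NR_CPUS);
	return llc_id[a] == llc_id[b];
}

static bool cpus_share_lowest_cache(int a, int b)
{
	assert(a >= 0 && a < NR_CPUS && b >= 0 && b < NR_CPUS);
	return cluster_id[a] == cluster_id[b];
}

/*
 * Models the early-return decision for prev. The caller is assumed to
 * have checked that prev is idle and fits the task.
 */
static bool can_return_prev_early(int prev, int target)
{
	if (prev == target || !cpus_share_cache(prev, target))
		return false;

	/* No cluster domain: the LLC check above is already sufficient. */
	if (!cluster_active)
		return true;

	return cpus_share_lowest_cache(prev, target);
}
```

With the flag off, the second lookup is skipped entirely; with it on,
a prev in a different cluster is kept as a fallback candidate rather
than returned immediately.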
> > + return prev;
> > + prev_aff = prev;
> > + }
> >
> > /*
> > * Allow a per-cpu kthread to stack with the wakee if the
> > @@ -7158,7 +7185,10 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> > (available_idle_cpu(recent_used_cpu) || sched_idle_cpu(recent_used_cpu)) &&
> > cpumask_test_cpu(p->recent_used_cpu, p->cpus_ptr) &&
> > asym_fits_cpu(task_util, util_min, util_max, recent_used_cpu)) {
> > - return recent_used_cpu;
> > + if (cpus_share_lowest_cache(recent_used_cpu, target))
>
> Same here.
>
> > + return recent_used_cpu;
> > + } else {
> > + recent_used_cpu = -1;
> > }
> >
> > /*
> > @@ -7199,6 +7229,17 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> > if ((unsigned)i < nr_cpumask_bits)
> > return i;
> >
> > + /*
> > + * For cluster machines which have lower sharing cache like L2 or
> > + * LLC Tag, we tend to find an idle CPU in the target's cluster
> > + * first. But prev_cpu or recent_used_cpu may also be a good candidate,
> > + * use them if possible when no idle CPU found in select_idle_cpu().
> > + */
> > + if ((unsigned int)prev_aff < nr_cpumask_bits)
> > + return prev_aff;
>
> Shouldn't we check if prev_aff (and the recent_used_cpu below) is
> still idle ?
>
>
When we reach here, target is non-idle, and prev_aff was idle when it
was checked earlier in select_idle_sibling(). There is a race in which
prev_aff becomes non-idle and target becomes idle after
select_idle_cpu(), but this window should be small IMO.
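
For comparison, the re-check being asked about could be modelled
roughly as below. This is a userspace sketch, not what the patch does:
`cpu_idle[]` is a hypothetical snapshot of idle state and
`pick_fallback()` is an illustrative helper, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8

/* Hypothetical idle-state snapshot, indexed by CPU number. */
static bool cpu_idle[NR_CPUS];

/*
 * Fallback selection after the idle-CPU scan found nothing.
 * A candidate of -1 means "not set"; the unsigned cast turns it into a
 * large value so the bounds check rejects it, as in the patch.
 */
static int pick_fallback(int target, int prev_aff, int recent_used_cpu)
{
	assert(target >= 0 && target < NR_CPUS);

	/* Re-check idleness to close the window described above. */
	if ((unsigned int)prev_aff < NR_CPUS && cpu_idle[prev_aff])
		return prev_aff;

	if ((unsigned int)recent_used_cpu < NR_CPUS &&
	    cpu_idle[recent_used_cpu])
		return recent_used_cpu;

	return target;
}
```

The cost is an extra idle lookup per candidate on a path that is only
reached when the scan failed, so whether it is worth it depends on how
often the race is actually hit.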

thanks,
Chenyu
> > + if ((unsigned int)recent_used_cpu < nr_cpumask_bits)
> > + return recent_used_cpu;
> > +
> > return target;
> > }
> >
>
> --
> Thanks and Regards
> gautham.