Re: [PATCH] sched/fair: Add SMT4 group_smt_balance handling

From: Tim Chen
Date: Thu Jul 27 2023 - 09:58:37 EST


On Wed, 2023-07-26 at 20:11 -0700, Tim Chen wrote:
> On Mon, 2023-07-17 at 20:28 +0530, Shrikanth Hegde wrote:
> > From: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
> >
> > For SMT4, any group with more than 2 tasks will be marked as
> > group_smt_balance. Retain the behaviour of group_has_spare by marking
> > the busiest group as the group which has the least number of idle_cpus.
> >
> > Also, handle rounding effect of adding (ncores_local + ncores_busy)
> > when the local is fully idle and busy group has more than 2 tasks.
> > Local group should try to pull at least 1 task in this case.
> >
> > Originally-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
> > Signed-off-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
> > Signed-off-by: Shrikanth Hegde <sshegde@xxxxxxxxxxxxxxxxxx>
> > ---
> > kernel/sched/fair.c | 13 ++++++++++++-
> > 1 file changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 932e7b78894a..9502013abe33 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -9532,7 +9532,7 @@ static inline long sibling_imbalance(struct lb_env *env,
> > imbalance /= ncores_local + ncores_busiest;
> >
> > /* Take advantage of resource in an empty sched group */
> > - if (imbalance == 0 && local->sum_nr_running == 0 &&
> > + if (imbalance <= 1 && local->sum_nr_running == 0 &&
> > busiest->sum_nr_running > 1)
> > imbalance = 2;
> >
> > @@ -9720,6 +9720,17 @@ static bool update_sd_pick_busiest(struct lb_env *env,
> > break;
> >
> > case group_smt_balance:
> > + /* no idle cpus on both groups handled by group_fully_busy below */
> > + if (sgs->idle_cpus != 0 || busiest->idle_cpus != 0) {
> > + if (sgs->idle_cpus > busiest->idle_cpus)
> > + return false;
> > + if (sgs->idle_cpus < busiest->idle_cpus)
> > + return true;
> > + if (sgs->sum_nr_running <= busiest->sum_nr_running)
> > + return false;
> > + }
> > + break;

Shrikanth and Peter,

Sorry, I acked Shrikanth's fixup patch too quickly without seeing that Shrikanth added
a "break" in the patch above. My original code did not have that break statement as
I did intend the code to fall through to the "group_fully_busy" code path when
there are no idle cpus in both groups. To make the compiler happy and putting
in the correct logic, I refresh the patch as below.

Thanks.

Tim

From: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
Date: Fri, 14 Jul 2023 16:09:30 -0700
Subject: [PATCH] sched/fair: Add SMT4 group_smt_balance handling

For SMT4, any group with more than 2 tasks will be marked as
group_smt_balance. Retain the behaviour of group_has_spare by marking
the busiest group as the group which has the least number of idle_cpus.

Also, handle rounding effect of adding (ncores_local + ncores_busy)
when the local is fully idle and busy group has more than 2 tasks.
Local group should try to pull at least 1 task in this case.

Signed-off-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
---
kernel/sched/fair.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a87988327f88..566686c5f2bd 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9563,7 +9563,7 @@ static inline long sibling_imbalance(struct lb_env *env,
imbalance /= ncores_local + ncores_busiest;

/* Take advantage of resource in an empty sched group */
- if (imbalance == 0 && local->sum_nr_running == 0 &&
+ if (imbalance <= 1 && local->sum_nr_running == 0 &&
busiest->sum_nr_running > 1)
imbalance = 2;

@@ -9751,6 +9751,20 @@ static bool update_sd_pick_busiest(struct lb_env *env,
break;

case group_smt_balance:
+ /* no idle cpus on both groups handled by group_fully_busy below */
+ if (sgs->idle_cpus != 0 || busiest->idle_cpus != 0) {
+ if (sgs->idle_cpus > busiest->idle_cpus)
+ return false;
+ if (sgs->idle_cpus < busiest->idle_cpus)
+ return true;
+ if (sgs->sum_nr_running <= busiest->sum_nr_running)
+ return false;
+ else
+ return true;
+ }
+ goto fully_busy;
+ break;
+
case group_fully_busy:
/*
* Select the fully busy group with highest avg_load. In
@@ -9763,7 +9777,7 @@ static bool update_sd_pick_busiest(struct lb_env *env,
* select the 1st one, except if @sg is composed of SMT
* siblings.
*/
-
+fully_busy:
if (sgs->avg_load < busiest->avg_load)
return false;

--
2.32.0