Re: [tip:sched/eevdf] [sched/fair] e0c2ff903c: phoronix-test-suite.blogbench.Write.final_score -34.8% regression

From: Mike Galbraith
Date: Wed Aug 16 2023 - 11:39:28 EST


On Wed, 2023-08-16 at 15:40 +0200, Peter Zijlstra wrote:
> On Wed, Aug 16, 2023 at 02:37:16PM +0200, Peter Zijlstra wrote:
> > On Mon, Aug 14, 2023 at 08:32:55PM +0200, Mike Galbraith wrote:
> >
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -875,6 +875,12 @@ static struct sched_entity *pick_eevdf(s
> > >         if (curr && (!curr->on_rq || !entity_eligible(cfs_rq, curr)))
> > >                 curr = NULL;
> > >  
> > > +       /*
> > > +        * Once selected, run the task to parity to avoid overscheduling.
> > > +        */
> > > +       if (sched_feat(RUN_TO_PARITY) && curr)
> > > +               return curr;
> > > +
> > >         while (node) {
> > >                 struct sched_entity *se = __node_2_se(node);
> > >  
> >
> > So I read it wrong last night... but I rather like this idea. But
> > there's something missing. When curr starts a new slice it should
> > probably do a full repick and not stick with it.
> >
> > Let me poke at this a bit.. nice
>
> Something like so.. it shouldn't matter much now, but might make a
> difference once we start mixing different slice lengths.

Hm, that stash-the-deadline trick _seems_ to have cured the reason I
was inspired to add that XXX hunk.. no 'ew, that's a tad harsh'
latency penalty in sight <knocks wood>.

Here's hoping test bots don't have a cow.

>
> ---
>  kernel/sched/fair.c     | 12 ++++++++++++
>  kernel/sched/features.h |  1 +
>  2 files changed, 13 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index fe5be91c71c7..128a78f3f264 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -873,6 +873,13 @@ static struct sched_entity *pick_eevdf(struct cfs_rq *cfs_rq)
>         if (curr && (!curr->on_rq || !entity_eligible(cfs_rq, curr)))
>                 curr = NULL;
>  
> +       /*
> +        * Once selected, run a task until it either becomes non-eligible or
> +        * until it gets a new slice. See the HACK in set_next_entity().
> +        */
> +       if (sched_feat(RUN_TO_PARITY) && curr && curr->vlag == curr->deadline)
> +               return curr;
> +
>         while (node) {
>                 struct sched_entity *se = __node_2_se(node);
>  
> @@ -5168,6 +5175,11 @@ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
>                 update_stats_wait_end_fair(cfs_rq, se);
>                 __dequeue_entity(cfs_rq, se);
>                 update_load_avg(cfs_rq, se, UPDATE_TG);
> +               /*
> +                * HACK, stash a copy of deadline at the point of pick in vlag,
> +                * which isn't used until dequeue.
> +                */
> +               se->vlag = se->deadline;
>         }
>  
>         update_stats_curr_start(cfs_rq, se);
> diff --git a/kernel/sched/features.h b/kernel/sched/features.h
> index 61bcbf5e46a4..f770168230ae 100644
> --- a/kernel/sched/features.h
> +++ b/kernel/sched/features.h
> @@ -6,6 +6,7 @@
>   */
>  SCHED_FEAT(PLACE_LAG, true)
>  SCHED_FEAT(PLACE_DEADLINE_INITIAL, true)
> +SCHED_FEAT(RUN_TO_PARITY, true)
>  
>  /*
>   * Prefer to schedule the task we woke last (assuming it failed
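
For anyone who wants to poke at the stash trick outside the kernel, here
is a rough userspace model of the two fair.c hunks quoted above. It is a
sketch, not kernel code: struct toy_se, toy_set_next(), toy_keep_curr()
and toy_demo() are invented names, the 'eligible' flag stands in for
entity_eligible(), and the field types are simplified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sched_entity: only what the patch touches. */
struct toy_se {
	uint64_t vlag;     /* reused as a stash for the deadline at pick time */
	uint64_t deadline; /* virtual deadline of the current slice */
	bool on_rq;
	bool eligible;     /* stand-in for entity_eligible(cfs_rq, se) */
};

/* Models the set_next_entity() hunk: remember the deadline we were picked with. */
void toy_set_next(struct toy_se *se)
{
	se->vlag = se->deadline;
}

/*
 * Models the top of pick_eevdf(): returns true when curr should simply keep
 * running, false when a full repick (the tree walk) is needed.
 */
bool toy_keep_curr(const struct toy_se *curr, bool run_to_parity)
{
	if (curr && (!curr->on_rq || !curr->eligible))
		curr = NULL;

	return run_to_parity && curr && curr->vlag == curr->deadline;
}

/* Walks one entity through pick, new slice, and loss of eligibility. */
void toy_demo(void)
{
	struct toy_se se = { .vlag = 0, .deadline = 100, .on_rq = true, .eligible = true };

	toy_set_next(&se);
	assert(toy_keep_curr(&se, true));   /* same slice: stick with curr */
	assert(!toy_keep_curr(&se, false)); /* feature off: always repick */

	se.deadline = 200;                  /* curr was handed a new slice */
	assert(!toy_keep_curr(&se, true));  /* stash no longer matches: repick */

	toy_set_next(&se);                  /* picked again with the new deadline */
	se.eligible = false;
	assert(!toy_keep_curr(&se, true));  /* non-eligible curr is dropped first */
}
```

Calling toy_demo() from a main() exercises the three cases the patch
comment describes: keep running within a slice, repick once the deadline
moves, and repick when curr is no longer eligible.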