[PATCH] sched/fair: Do not wakeup-preempt same-prio SCHED_OTHER tasks

From: Ingo Molnar
Date: Tue Sep 19 2023 - 05:02:36 EST



* Mike Galbraith <efault@xxxxxx> wrote:

> On Tue, 2023-08-22 at 08:33 +0530, K Prateek Nayak wrote:
> > Hello Mike,
>
> Greetings!
>
> > > FWIW, there are more tbench shards lying behind EEVDF than in front.
> > >
> > > tbench 8 on old i7-4790 box
> > > 4.4.302      4024
> > > 6.4.11       3668
> > > 6.4.11-eevdf 3522
> > >
> >
> > I agree, but on servers, tbench has been useful to identify a variety of
> > issues [1][2][3] and I believe it is better to pick some shards up than
> > leave them lying around for others to step on :)
>
> Absolutely, but in this case it isn't due to various overheads wiggling
> about and/or bitrot, everything's identical except the scheduler, and
> its overhead essentially is too.
>
> taskset -c 3 pipe-test
> 6.4.11 1.420033 usecs/loop -- avg 1.420033 1408.4 KHz
> 6.4.11-eevdf 1.413024 usecs/loop -- avg 1.413024 1415.4 KHz
>
> Methinks these shards are due to tbench simply being one of those
> things that happens to like the CFS notion of short term fairness a bit
> better than the EEVDF notion, i.e. are inevitable fallout tied to the
> very thing that makes EEVDF service less spiky than CFS, and thus will
> be difficult to sweep up.
>
> Too bad I didn't save Peter's test hack to make EEVDF use the same
> notion of fair (not a keeper) as I think that would likely prove it.

BTW, if overscheduling is still an issue, I'm wondering whether we
could go so far as to turn off wakeup preemption for same-prio
SCHED_OTHER tasks altogether, as per the attached patch?

What does this do to your various tests? Test booted only.

Thanks,

Ingo

=============>
From: Ingo Molnar <mingo@xxxxxxxxxx>
Date: Tue, 19 Sep 2023 10:49:51 +0200
Subject: [PATCH] sched/fair: Do not wakeup-preempt same-prio SCHED_OTHER tasks

Reduce overscheduling some more: do not wakeup-preempt same-priority
SCHED_OTHER tasks.

Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a502e3255392..98efe01c8e4e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8042,7 +8042,7 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
 	 * Batch and idle tasks do not preempt non-idle tasks (their preemption
 	 * is driven by the tick):
 	 */
-	if (unlikely(p->policy != SCHED_NORMAL) || !sched_feat(WAKEUP_PREEMPTION))
+	if (unlikely(p->policy != SCHED_NORMAL) || likely(p->prio == curr->prio) || !sched_feat(WAKEUP_PREEMPTION))
 		return;
 
 	find_matching_se(&se, &pse);
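
For readers following the hunk, here is a minimal userspace sketch of the early-return condition before and after the change. It is not kernel code: the `model_*` names and both helper functions are hypothetical stand-ins, and the sketch models only this one `if`, not the checks earlier in check_preempt_wakeup_fair() or the find_matching_se() logic that follows it.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for illustration only. The numeric values mirror
 * the kernel's SCHED_NORMAL/SCHED_BATCH/SCHED_IDLE policy numbers.
 */
enum model_policy {
	MODEL_SCHED_NORMAL = 0,
	MODEL_SCHED_BATCH  = 3,
	MODEL_SCHED_IDLE   = 5,
};

struct model_task {
	enum model_policy policy;
	int prio;	/* lower value means higher priority; nice 0 is 120 */
};

/* Old condition: true when the waking task p never gets to preempt. */
bool model_skip_before(const struct model_task *p, bool wakeup_preemption)
{
	return p->policy != MODEL_SCHED_NORMAL || !wakeup_preemption;
}

/* Patched condition: same-priority wakeups now also bail out early. */
bool model_skip_after(const struct model_task *p,
		      const struct model_task *curr,
		      bool wakeup_preemption)
{
	return p->policy != MODEL_SCHED_NORMAL ||
	       p->prio == curr->prio ||
	       !wakeup_preemption;
}
```

With both tasks at nice 0 (prio 120), model_skip_before() returns false, so the preemption decision is still evaluated, while model_skip_after() returns true and the wakee waits for the tick. A wakee with a different prio still falls through to the rest of the function in both versions.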