RE: Re: [PATCH] sched/fair: fix pick_eevdf to always find the correct se

From: Biju Das
Date: Thu Oct 05 2023 - 10:15:45 EST


Hi all,

> -----Original Message-----
> From: Biju Das
> Sent: Thursday, October 5, 2023 8:32 AM
> Subject: Re: [PATCH] sched/fair: fix pick_eevdf to always find the correct
> se
>
> Date: Wed, 4 Oct 2023 22:39:39 +0200
> Message-ID: <c92bc8a6-225d-4fd2-88b5-8994090fb2de@xxxxxxxxxxx>
> In-Reply-To: <xm261qego72d.fsf_-_@xxxxxxxxxx>
>
> Hi,
>
> On 30.09.2023 02:09, Benjamin Segall wrote:
> > The old pick_eevdf could fail to find the actual earliest eligible
> > deadline when it descended to the right looking for min_deadline, but
> > it turned out that that min_deadline wasn't actually eligible. In that
> > case we need to go back and search through any left branches we
> > skipped looking for the actual best _eligible_ min_deadline.
> >
> > This is more expensive, but still O(log n), and at worst should only
> > involve descending two branches of the rbtree.
> >
> > I've run this through a userspace stress test (thank you
> > tools/lib/rbtree.c), so hopefully this implementation doesn't miss any
> > corner cases.
> >
> > Fixes: 147f3efaa241 ("sched/fair: Implement an EEVDF-like scheduling
> > policy")
> > Signed-off-by: Ben Segall <bsegall@xxxxxxxxxx>
>
> This patch is causing issues [1] on the Renesas RZ/G2L SMARC EVK platform.
> Reverting the patch makes the warning messages go away.
>
> [1]
> [ 25.550898] EEVDF scheduling fail, picking leftmost
>
> [ 15.109634] ======================================================
> [ 15.109636] WARNING: possible circular locking dependency detected
> [ 15.109641] 6.6.0-rc4-next-20231005-arm64-renesas-ga03f9ebbbb4c #1165
> Not tainted
> [ 15.109648] ------------------------------------------------------
> [ 15.109649] migration/0/16 is trying to acquire lock:
> [ 15.109654] ffff800081713460 (console_owner){..-.}-{0:0}, at:
> console_flush_all.constprop.0+0x1a0/0x438
> [ 15.109694]
> [ 15.109694] but task is already holding lock:
> [ 15.109697] ffff00007fbd2298 (&rq->__lock){-.-.}-{2:2}, at:
> __schedule+0xd0/0xbe0
> [ 15.109718]
> [ 15.109718] which lock already depends on the new lock.
> [ 15.109718]
> [ 15.109720]
> [ 15.109720] the existing dependency chain (in reverse order) is:
>
> [ 25.551560] __down_trylock_console_sem+0x34/0xb8
> [ 25.551567] console_trylock+0x24/0x74
> [ 25.551574] vprintk_emit+0x114/0x388
> [ 25.551581] vprintk_default+0x34/0x3c
> [ 25.551588] vprintk+0x9c/0xb4
> [ 25.551594] _printk+0x58/0x7c
> [ 25.551600] pick_next_task_fair+0x274/0x480
> [ 25.551608] __schedule+0x154/0xbe0
> [ 25.551616] schedule+0x48/0x110
> [ 25.551623] worker_thread+0x1b8/0x3f8
> [ 25.551630] kthread+0x114/0x118
> [ 25.551635] ret_from_fork+0x10/0x20
> [ OK ] Started System Logging Service.
> [ 26.099203] EEVDF scheduling fail, picking leftmost

The complete log can be found here:

https://pastebin.com/zZkRFiSf
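
As a reference point for what "the correct se" means in the quoted patch
description, here is a minimal userspace sketch. It is not the kernel code:
struct toy_se, toy_eligible() and toy_pick_eevdf() are hypothetical names, all
entities are assumed to have equal weight, and eligibility is assumed to mean
"vruntime is not ahead of the average vruntime". It simply scans every entity
in O(n), which is the answer an rbtree-based pick should agree with.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified entity: equal weights, no hierarchy. */
struct toy_se {
	int64_t vruntime;	/* virtual runtime consumed so far */
	int64_t deadline;	/* virtual deadline */
};

/*
 * Assumed eligibility rule: vruntime <= average vruntime.
 * Written as vruntime * n <= sum to avoid a division.
 */
static int toy_eligible(const struct toy_se *se, int64_t sum, size_t n)
{
	return se->vruntime * (int64_t)n <= sum;
}

/* Linear scan: return the eligible entity with the earliest deadline. */
static const struct toy_se *toy_pick_eevdf(const struct toy_se *v, size_t n)
{
	const struct toy_se *best = NULL;
	int64_t sum = 0;
	size_t i;

	assert(n > 0);

	for (i = 0; i < n; i++)
		sum += v[i].vruntime;

	for (i = 0; i < n; i++) {
		if (!toy_eligible(&v[i], sum, n))
			continue;	/* skip entities that are not eligible */
		if (!best || v[i].deadline < best->deadline)
			best = &v[i];
	}

	/* The entity with the smallest vruntime is always eligible. */
	return best;
}
```

With entities (vruntime, deadline) = (10, 50), (20, 30), (40, 5), the entity
with the smallest deadline is not eligible under this rule, so the scan returns
the (20, 30) entity. That is the situation the patch description refers to,
where the overall min_deadline turns out not to be eligible.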

Cheers,
Biju