Re: [PATCH v2] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

From: Phil Auld
Date: Thu Mar 21 2019 - 14:32:29 EST


On Thu, Mar 21, 2019 at 07:01:37PM +0100 Peter Zijlstra wrote:
> On Tue, Mar 19, 2019 at 09:00:05AM -0400, Phil Auld wrote:
> > sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup
> >
> > With an extremely short cfs_period_us setting on a parent task group with a
> > large number of children, the for loop in sched_cfs_period_timer() can run
> > until the watchdog fires. There is no guarantee that the call to
> > hrtimer_forward_now() will ever return 0. The large number of children can
> > make do_sched_cfs_period_timer() take longer than the period.
> >
> > To prevent this, we add protection to the loop that detects when the loop has
> > run too many times and scales the period and quota up, proportionally, so that
> > the timer can complete before the next period expires. This preserves the
> > relative runtime quota while preventing the hard lockup.
> >
> > A warning is issued reporting this state and the new values.
> >
> > v2: Math reworked/simplified by Peter Zijlstra.
> >
> > Signed-off-by: Phil Auld <pauld@xxxxxxxxxx>
> > Cc: Ben Segall <bsegall@xxxxxxxxxx>
> > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > Cc: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> > Cc: Anton Blanchard <anton@xxxxxxxxxx>
>
> Thanks!

Thank you for your time and help.

What do you think about Cc: stable?
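
For anyone following the thread, the loop protection described in the quoted
changelog can be sketched in userspace roughly as below. This is only an
illustration of the idea, not the patch itself: struct bandwidth, scale_up(),
run_timer(), LOOP_LIMIT, MAX_PERIOD_NS and the 147/128 growth factor are all
assumptions made for the example, and "the timer overran" is modelled simply
as one pass of work (work_ns) taking at least a full period.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a task group's bandwidth settings. */
struct bandwidth {
	uint64_t period_ns;
	uint64_t quota_ns;
};

#define MAX_PERIOD_NS 1000000000ULL	/* assumed cap on the period: 1s */
#define LOOP_LIMIT    3			/* assumed number of tolerated overruns */

/*
 * Grow the period by roughly 15% (illustrative factor), capped at
 * MAX_PERIOD_NS, and scale the quota by the same ratio so that
 * quota/period stays (almost) unchanged. Assumes quota_ns is small
 * enough that quota_ns * MAX_PERIOD_NS fits in 64 bits.
 */
static void scale_up(struct bandwidth *b)
{
	uint64_t old = b->period_ns;
	uint64_t new;

	assert(old > 0);
	new = old * 147 / 128;
	if (new > MAX_PERIOD_NS)
		new = MAX_PERIOD_NS;

	b->period_ns = new;
	b->quota_ns = b->quota_ns * new / old;
}

/*
 * Model of the timer loop: each pass costs work_ns, and the pass
 * "overruns" while work_ns >= period_ns. After more than LOOP_LIMIT
 * consecutive overruns the period and quota are scaled up.
 * Returns how many times scaling was applied.
 */
static int run_timer(struct bandwidth *b, uint64_t work_ns)
{
	int count = 0, scalings = 0;

	while (work_ns >= b->period_ns) {
		if (++count > LOOP_LIMIT) {
			uint64_t before = b->period_ns;

			scale_up(b);
			if (b->period_ns == before)
				break;	/* period cannot grow any further */
			scalings++;
			count = 0;	/* do not rescale on the very next pass */
		}
	}
	return scalings;
}
```

Calling run_timer() on a group whose period is shorter than the cost of one
pass shows the period growing until a pass fits inside it, while the quota
stays at the same fraction of the period (up to integer rounding).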


Cheers,
Phil

--