Re: [PATCH 4.4 173/268] sched/rt: Fix rq->clock_update_flags < RQCF_ACT_SKIP warning

From: Steven Rostedt
Date: Thu Jun 14 2018 - 18:32:19 EST


On Thu, 14 Jun 2018 22:55:56 +0100
Ben Hutchings <ben.hutchings@xxxxxxxxxxxxxxx> wrote:

> On Mon, 2018-05-28 at 12:02 +0200, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch.  If anyone has any objections, please let me know.
> >
> > ------------------
> >
> > From: Davidlohr Bueso <dave@xxxxxxxxxxxx>
> >
> > [ Upstream commit d29a20645d5e929aa7e8616f28e5d8e1c49263ec ]
> >
> > While running rt-tests' pi_stress program I got the following splat:
> >
> >   rq->clock_update_flags < RQCF_ACT_SKIP
> >   WARNING: CPU: 27 PID: 0 at kernel/sched/sched.h:960 assert_clock_updated.isra.38.part.39+0x13/0x20
> >
> >   [...]
> >
> >   <IRQ>
> >   enqueue_top_rt_rq+0xf4/0x150
> >   ? cpufreq_dbs_governor_start+0x170/0x170
> >   sched_rt_rq_enqueue+0x65/0x80
> >   sched_rt_period_timer+0x156/0x360
> >   ? sched_rt_rq_enqueue+0x80/0x80
> >   __hrtimer_run_queues+0xfa/0x260
> >   hrtimer_interrupt+0xcb/0x220
> >   smp_apic_timer_interrupt+0x62/0x120
> >   apic_timer_interrupt+0xf/0x20
> >   </IRQ>
> >
> >   [...]
> >
> >   do_idle+0x183/0x1e0
> >   cpu_startup_entry+0x5f/0x70
> >   start_secondary+0x192/0x1d0
> >   secondary_startup_64+0xa5/0xb0
> >
> > We can get rid of it by the "traditional" means of adding an
> > update_rq_clock() call after acquiring the rq->lock in
> > do_sched_rt_period_timer().
> >
> > The case for the RT task throttling (which this workload also hits)
> > can be ignored in that the skip_update call is actually bogus and
> > quite the contrary (the request bits are removed/reverted).
> >
> > By setting RQCF_UPDATED we really don't care if the skip is happening
> > or not and will therefore make the assert_clock_updated() check happy.
>
> There is no such flag or assertion in 4.4 or 4.9, so does this change
> still make sense there?

I believe the assert was added to catch bugs like this.

Although the change log is a bit ambiguous about whether it is fixing
an actual missed update or just quieting a false positive.
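
As a rough illustration of what the assertion checks, here is a
userspace sketch of the flag logic. It assumes the RQCF_* values from
mainline kernel/sched/sched.h (none of this exists in 4.4 or 4.9);
struct model_rq and the model_* helpers are hypothetical names made up
for this sketch, and model_begin_lock_section() is a simplification of
what happens when the rq lock is taken and pinned:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values assumed to match mainline kernel/sched/sched.h. */
#define RQCF_REQ_SKIP 0x01
#define RQCF_ACT_SKIP 0x02
#define RQCF_UPDATED  0x04

/* Hypothetical stand-in for struct rq: only the debug flags are modelled. */
struct model_rq {
	unsigned int clock_update_flags;
};

/* Simplified start of a locked section: keep skip bits, forget UPDATED. */
static void model_begin_lock_section(struct model_rq *rq)
{
	rq->clock_update_flags &= (RQCF_REQ_SKIP | RQCF_ACT_SKIP);
}

/* Models update_rq_clock(): a no-op while skipping, else marks UPDATED. */
static void model_update_rq_clock(struct model_rq *rq)
{
	assert(rq != NULL);
	if (rq->clock_update_flags & RQCF_ACT_SKIP)
		return;
	rq->clock_update_flags |= RQCF_UPDATED;
}

/* Models the assert_clock_updated() condition from the splat. */
static bool model_clock_warning_fires(const struct model_rq *rq)
{
	return rq->clock_update_flags < RQCF_ACT_SKIP;
}

/* Models the timer path with or without the added update_rq_clock(). */
static bool model_timer_path_warns(bool with_fix)
{
	struct model_rq rq = { 0 };

	model_begin_lock_section(&rq);
	if (with_fix)
		model_update_rq_clock(&rq);
	return model_clock_warning_fires(&rq);
}
```

In this model the path warns without the added call and stays quiet
with it. That only shows why the warning goes away; it says nothing
about whether a stale clock was actually being consumed.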

Davidlohr?

-- Steve


>
> Ben.
>
> > Signed-off-by: Davidlohr Bueso <dbueso@xxxxxxx>
> > Reviewed-by: Matt Fleming <matt@xxxxxxxxxxxxxxxxxxx>
> > Acked-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> > Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> > Cc: Mike Galbraith <efault@xxxxxx>
> > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > Cc: dave@xxxxxxxxxxxx
> > Cc: linux-kernel@xxxxxxxxxxxxxxx
> > Cc: rostedt@xxxxxxxxxxx
> > Link: http://lkml.kernel.org/r/20180402164954.16255-1-dave@xxxxxxxxxxxx
> > Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
> > Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxxxx>
> > Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > ---
> >  kernel/sched/rt.c |    2 ++
> >  1 file changed, 2 insertions(+)
> >
> > --- a/kernel/sched/rt.c
> > +++ b/kernel/sched/rt.c
> > @@ -822,6 +822,8 @@ static int do_sched_rt_period_timer(stru
> >  		struct rq *rq = rq_of_rt_rq(rt_rq);
> >  
> >  		raw_spin_lock(&rq->lock);
> > +		update_rq_clock(rq);
> > +
> >  		if (rt_rq->rt_time) {
> >  			u64 runtime;
> >  
> >
> >
> >