RE: [EXT] Re: [PATCH] softirq: add irq off checking for __raise_softirq_irqoff

From: Jiafei Pan
Date: Fri Aug 14 2020 - 00:17:31 EST


> From: Steven Rostedt <rostedt@xxxxxxxxxxx>
> Sent: Thursday, August 13, 2020 10:57 PM
>
> On Thu, 13 Aug 2020 03:03:46 +0000
> Jiafei Pan <jiafei.pan@xxxxxxx> wrote:
>
> > Any comments? Thanks.
> >
> > @Steven Rostedt, I thinks irq off checking is necessary especially
>
> This is probably more for Thomas Gleixner.
>
> > for Preempt-RT kernel, because some context may be changed from irq
> > off to irq on when enable Preempt RT, I once met a issue that hrtimer
> > soft irq is lost when enabled Preempt RT, finally I found
> > napi_schedule_irqoff is called in hardware interrupt handler, there
> > maybe no issue for non RT kernel, but for Preempt RT, interrupt is
> > threaded, so irq is on in interrupt handler, the result is
> > __raise_softirq_irqoff is called in irq on context, so that per-CPU
> > softirq masking is corrupted because of the process of updating of
> > soft irq masking is interrupted and not a atomic operation , and then
> > caused hrtimer soft irq is lost. So I think adding irq status checking
> > in __raise_softirq_irqoff can report such issue directly and help us
> > to find the root cause of such issue.
> >
> > I know that there may be performance impaction to add extra checking
> > here, if it is the case, how about to include it in some debug
> > configuration items? Such as CONFIG_DEBUG_PREEMPT or other debug
> > items?
> >
>
>
> > Best Regards,
> > Jiafei.
> >
> > -----Original Message-----
> > From: Jiafei Pan <Jiafei.Pan@xxxxxxx>
> > Sent: Thursday, August 6, 2020 12:07 PM
> > To: peterz@xxxxxxxxxxxxx; mingo@xxxxxxxxxx; tglx@xxxxxxxxxxxxx;
> > rostedt@xxxxxxxxxxx; romain.perier@xxxxxxxxx; will@xxxxxxxxxx
> > Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-rt-users@xxxxxxxxxxxxxxx;
> > Jiafei Pan <jiafei.pan@xxxxxxx>; Leo Li <leoyang.li@xxxxxxx>; Vladimir
> > Oltean <vladimir.oltean@xxxxxxx>; Jiafei Pan <jiafei.pan@xxxxxxx>
> > Subject: [PATCH] softirq: add irq off checking for
> > __raise_softirq_irqoff
> >
> > __raise_softirq_irqoff will update per-CPU mask of pending softirqs, it need
> to be called in irq disabled context in order to keep it atomic operation,
> otherwise it will be interrupted by hardware interrupt, and per-CPU softirqs
> pending mask will be corrupted, the result is there will be unexpected issue,
> for example hrtimer soft irq will be losed and soft hrtimer will never be expire
> and handled.
>
> Please wrap your change logs.
[Jiafei Pan] Thanks, will update it.
>
> >
> > Adding irqs disabled checking here to provide warning in irqs enabled
> context.
> >
> > Signed-off-by: Jiafei Pan <Jiafei.Pan@xxxxxxx>
> > ---
> > kernel/softirq.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/kernel/softirq.c b/kernel/softirq.c index
> > bf88d7f62433..11f61e54a3ae 100644
> > --- a/kernel/softirq.c
> > +++ b/kernel/softirq.c
> > @@ -481,6 +481,11 @@ void raise_softirq(unsigned int nr)
> >
> > void __raise_softirq_irqoff(unsigned int nr) {
> > + /* This function can only be called in irq disabled context,
> > + * otherwise or_softirq_pending will be interrupted by hardware
> > + * interrupt, so that there will be unexpected issue.
> > + */
> > + WARN_ON_ONCE(!irqs_disabled());
>
> Perhaps: lockdep_assert_irqs_disabled() is more appropriate, and doesn't add
> extra overhead on production systems.
>
> -- Steve
[Jiafei Pan] Thanks, will update it.
>
>
> > trace_softirq_raise(nr);
> > or_softirq_pending(1UL << nr);
> > }
> > --
> > 2.17.1