Re: 4.19.106-rt44 -- boot problems with irqwork: push most work into softirq context

From: Steven Rostedt
Date: Thu Mar 19 2020 - 20:49:03 EST


On Fri, 20 Mar 2020 00:22:25 +0100
Pavel Machek <pavel@xxxxxxx> wrote:

> On Thu 2020-03-19 22:48:35, Pavel Machek wrote:
> > Hi!

Hi Pavel!

> >
> > > I'm pleased to announce the 4.19.106-rt44 stable release.
> > >
> > >
> > > This release is just an update to the new stable 4.19.106 version
> > > and no RT specific changes have been made.
> > >
> > >
> > > You can get this release via the git tree at:
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git
> > >
> > > branch: v4.19-rt
> > > Head SHA1: 0f2960c75dd68d339f0aff2935f51652b5625fbf
> >
> > This caused some problems for me. The de0-nano board now fails to boot in
> > about 50% of cases if I apply these patches on top of the -cip tree.
> >
> > This is an example of a failed job:
> >
> > https://lava.ciplatform.org/scheduler/job/13037
> >
> > de0-nano is 32-bit ARM and should be based on Altera SoCFPGA, if I
> > understand things correctly.
> >
> > "fc9f4631a290 irqwork: push most work into softirq context" touches
> > area of the panic above. I tried to revert it on top of the full
> > series, and tests passed twice so far...
>
> The test has passed 7 times now. So yes, reverting this fixes de0-nano
> boot. Any ideas what might be wrong?
>
> I'll be running it a few more times.
>
> https://gitlab.com/cip-project/cip-kernel/linux-cip/pipelines/127953471
>

Looks like you are running this without PREEMPT_RT enabled.

Does this patch help?

-- Steve


diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index 2940622da5b3..0ca75c77536b 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -146,8 +146,9 @@ bool irq_work_needs_cpu(void)
 	raised = this_cpu_ptr(&raised_list);
 	lazy = this_cpu_ptr(&lazy_list);
 
-	if (llist_empty(raised) && llist_empty(lazy))
-		return false;
+	if (llist_empty(raised) || arch_irq_work_has_interrupt())
+		if (llist_empty(lazy))
+			return false;
 
 	/* All work should have been flushed before going offline */
 	WARN_ON_ONCE(cpu_is_offline(smp_processor_id()));
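
For readers following the logic change, here is a minimal userspace sketch of the two conditions in the hunk above. It is not kernel code: the function names needs_cpu_old(), needs_cpu_new() and check_only_difference() are hypothetical, and the three booleans are assumed stand-ins for llist_empty(raised), llist_empty(lazy) and arch_irq_work_has_interrupt(). It only models the early-return test, not the rest of irq_work_needs_cpu().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the test before the patch:
 * return false only when both lists are empty. */
bool needs_cpu_old(bool raised_empty, bool lazy_empty, bool has_irq)
{
	(void)has_irq;	/* the old test ignores the arch capability */
	if (raised_empty && lazy_empty)
		return false;
	return true;
}

/* Hypothetical model of the test after the patch:
 * pending raised work is ignored when has_irq is true. */
bool needs_cpu_new(bool raised_empty, bool lazy_empty, bool has_irq)
{
	if (raised_empty || has_irq)
		if (lazy_empty)
			return false;
	return true;
}

/* Walk all eight input combinations and assert that the two
 * versions differ in exactly one case: raised work pending,
 * lazy list empty, and has_irq true. */
void check_only_difference(void)
{
	for (int i = 0; i < 8; i++) {
		bool raised_empty = i & 1;
		bool lazy_empty = i & 2;
		bool has_irq = i & 4;
		bool differs = needs_cpu_old(raised_empty, lazy_empty, has_irq) !=
			       needs_cpu_new(raised_empty, lazy_empty, has_irq);

		assert(differs == (!raised_empty && lazy_empty && has_irq));
	}
}
```

Under these assumptions, the only behavioural difference is that the patched test returns false when raised work is pending, the lazy list is empty, and the architecture reports an irq_work interrupt. In every other combination the two tests agree.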