Re: [PATCH] irqchip: omap-intc: fix spurious irq handling

From: Tony Lindgren
Date: Thu Dec 03 2015 - 10:39:40 EST


* Sekhar Nori <nsekhar@xxxxxx> [151203 07:25]:
> On Thursday 03 December 2015 08:32 PM, Tony Lindgren wrote:
> >
> > Yes we should naturally fix up the kernel locking.
>
> Alright. Thanks!
>
> >
> > Please also add something like "enable debug for more information"
> > to the warning. And then print out the current and previous interrupt
>
> So I am unconvinced (based on the debug above) that the previous
> interrupt information actually gives any more useful information
> than what can be gleaned from observing /proc/interrupts. It seems the
> previous interrupt noted can be any interrupt you would expect to occur
> during the test case anyway.

OK, but the fact that I've fixed up 4-5 of these, and all of them were
really caused by a missing flush of a posted write, still makes me
suspicious :)

> > if DEBUG is enabled. And in the comments mention that the spurious
> > interrupts have often been fixed by adding a flush of the posted write to the
> > previous interrupt handler in the device driver.
>
> I can add the comment, no problem.

OK thanks. We can add more debug once you figure out what the root
cause is.

> > Also, do you have a reproducible test case with mainline kernel I
> > could add to my collection of shell scripts?
>
> The way I reproduce this is to run the serial port at 3Mbaud in internal
> loopback mode with DMA enabled. The test program I use[1] compares the
> data sent and received byte-for-byte. With current mainline, that can
> mismatch pretty soon. The test will likely end before you see any
> spurious irq. There are some patches John Ogness is working on
> (currently included in TI's v4.1 kernel) which help sustain the test
> for longer and then actually expose the spurious irq issue.

OK. One thing you have to consider here though is that the EDMA driver
may still wrongly consider several interconnect targets as a single entity.
This can lead to issues where flushing a posted write really only flushes
one of the interconnect targets and that may not be the right one.
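
A rough user-space sketch in C of that pitfall, assuming each interconnect
target keeps its own queue of posted writes. The names (struct fake_target,
tgt_writel(), tgt_readl(), write_landed_after_readback()) and the two-target
layout are hypothetical and only model the idea; they do not describe the
actual EDMA hardware or driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_TARGETS 2

/* Hypothetical interconnect target with its own posted-write slot. */
struct fake_target {
	uint32_t reg;         /* value the target has actually seen */
	uint32_t posted_val;  /* value still in flight to this target */
	bool     posted;      /* true while the write has not landed */
};

/* Posted write: queued for this target only. */
static void tgt_writel(struct fake_target *t, uint32_t val)
{
	t->posted_val = val;
	t->posted = true;
}

/* Read-back: drains the queue of the target being read, no other. */
static uint32_t tgt_readl(struct fake_target *t)
{
	if (t->posted) {
		t->reg = t->posted_val;
		t->posted = false;
	}
	return t->reg;
}

/*
 * Write to target 'written', then flush by reading target 'read_from'.
 * Returns true if the write has landed afterwards.
 */
static bool write_landed_after_readback(int written, int read_from)
{
	struct fake_target t[NUM_TARGETS] = { { 0 } };

	tgt_writel(&t[written], 0xabcd);
	(void)tgt_readl(&t[read_from]);
	return !t[written].posted;
}
```

Under these assumptions, write_landed_after_readback(0, 0) is true but
write_landed_after_readback(0, 1) is false: a read-back from the wrong
target leaves the original write still in flight.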

Peter has been patching the EDMA driver to solve this problem, but I don't
know if all of his patches are merged yet. I've added him to Cc.

My bets are on a lack of flush of posted write in the EDMA driver somewhere
and I suggest you investigate that a bit more considering the multiple
interconnect targets :)

Regards,

Tony

> [1] https://git.breakpoint.cc/cgit/bigeasy/serialcheck.git