Re: [PATCH 2/6] g_NCR5380: Test the IRQ before accepting it

From: Finn Thain
Date: Wed Nov 02 2016 - 22:17:00 EST



On Wed, 2 Nov 2016, Ondrej Zary wrote:

> On Wednesday 02 November 2016 08:45:26 Finn Thain wrote:
> > On Mon, 31 Oct 2016, Ondrej Zary wrote:
> > > Trigger an IRQ first with a test IRQ handler to find out if it
> > > really works. Disable the IRQ if not.
> > >
> > > This prevents hang when incorrect IRQ was specified by user.
> >
> > Once again, how does it cause a hang?
>
> Kernel scans the bus, finds a HDD, then attempts to read MBR. modprobe
> process is stuck but the system is still running. Then the transfer
> probably times out and everything locks up hard, even fbcon cursor stops
> blinking. I guess that kernel is trying to abort or reset.

I don't think this issue relates to the patch, because the chip IRQ is not
needed for exception handling.

A backtrace from the soft lockup detector should help explain this.

> BTW. rescan-scsi-bus also causes hang, anytime, even without IRQ.

I would try "scsi_logging_level -s -a 7" to find out what is going on
during the bus scan (for modprobe or rescan-scsi-bus).

The polling loops in generic_NCR5380_pread/pwrite can cause a lockup
because they lack timeouts. Better to call NCR5380_poll_politely, as in
macscsi_pread/pwrite.
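
For illustration, a minimal userspace sketch of a polling loop with a
bound follows. It is not the driver code and not the implementation of
NCR5380_poll_politely; the names poll_status_bounded, read_status_fn,
struct fake_dev and fake_read_status are hypothetical, and the bound is
a simple read count rather than a time-based one.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical register accessor: returns the current status byte. */
typedef uint8_t (*read_status_fn)(void *ctx);

/*
 * Poll until (status & mask) == val, giving up after max_polls reads.
 * Returns 0 on success, -ETIMEDOUT if the condition never became true.
 */
static int poll_status_bounded(read_status_fn read_status, void *ctx,
                               uint8_t mask, uint8_t val,
                               unsigned long max_polls)
{
	unsigned long i;

	for (i = 0; i < max_polls; i++) {
		if ((read_status(ctx) & mask) == val)
			return 0;
	}
	return -ETIMEDOUT;
}

/* Fake device for illustration: reports ready (bit 0) after N reads. */
struct fake_dev {
	unsigned long reads_until_ready;
	unsigned long reads;
};

static uint8_t fake_read_status(void *ctx)
{
	struct fake_dev *dev = ctx;

	dev->reads++;
	return dev->reads > dev->reads_until_ready ? 0x01 : 0x00;
}
```

The point is only that the caller gets an error back when the device
never signals readiness, instead of spinning forever. A real driver
would presumably bound the wait by elapsed time and handle the error
in the transfer path.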

--