Re: SA_INTERRUPT

From: Andrea Arcangeli (andrea@suse.de)
Date: Sun Oct 01 2000 - 09:59:16 EST


On Sat, Sep 30, 2000 at 09:50:21PM -0400, Sandy Harris wrote:
> Don Becker has some text at:
>
> http://www.scyld.com/expert/irq-conflict.html
>
> which includes a section:
>
> > Why SA_INTERRUPT in the SCSI drivers is a Bad Thing
>
> > ... it could potentially have a very negative impact on all other interrupt-driven
> > kernel service. That includes just about everything ...
> >
> > I believe that very few complex devices can be correctly run by a device driver
> > that uses SA_INTERRUPT.
>
> So I grepped drivers/*/*.c in the nearest handy kernel source, which happened to be
> 2.2.16, and found 113 uses of SA_INTEERUPT, 64 in drivers/scsi/*.c and the rest spread
> around.

The reason SA_INTERRUPT was a bad thing is because of a plain bug in the kernel
that is been fixed in the recent 2.2.x kernels so it's not a bug anymore to use
SA_INTERRUPT and SA_SHIRQ at the same time in 2.2.x (at least this is true in
_most_ cases, see below).

If that would been true for all cases in 2.4.x we would have removed the
SA_INTERRUPT completly by replacing it with a __sti() at the top of the irq
handlers that wasn't registered with the SA_INTERRUPT flag set (that would have
saved some slow ugliness in a very fast path in the irq handling code).

But then I noticed a subtle case that consists of a device that generates a
flood of irqs and we can only stop it from the irq handler. If the irqs would
never be reentrant that wouldn't be a problem either (because the irq flood
would have no way to hurt us in first place), but on IA32 with edge triggered
io-apic the irqs _are_ reentrant (we don't mask the irq to not miss events: to
allow the nested irq to set the pending bitflag).

So in theory without a kind of SA_INTERRUPT set we could live lock before
reaching the irq handler in such a case. So if the irq is reentrant and
SA_INTERRUPT is set we should drop the SA_SHIRQ flag and fail the request_irq
if somebody else just registered for the irq (because the other irq handler
could be invoked before us and it could reenable irqs causing the live lock
situation). The information about irq reentrant or not should be provided by
the desc->status field.

So I'd probably prefer to replace the SA_INTERRUPT flag with a SA_IRQFLOOD and
to still drop the SA_INTERRUPT bitflag from all the request_irq by putting an
__sti() at the startup of all the irq handlersi that wasn't using SA_INTERRUPT.

This way the SA_IRQFLOOD checks will be done only at irq handler registration
time and we'll still avoid the SA_INTERRUPT ugly check in the irq handler very
fast path.

Then the device drivers that have that kind of problems with flood if irqs (if
they exists in first place) should register their irq with the SA_IRQFLOOD bit
set, and then with a desc->status IRQ_REENTRANT information (that is trivial to
collect out of the lowlevel irq drivers) we'll make sure not to share an irq
with SA_IRQFLOOD set when it sits on a reentrant irq line.

If somebody knows that such kind of irq-flood device doesn't exists that's fine
too of course, and in such case we don't need any SA_IRQFLOOD and IRQ_REENTRANT
information and we can just drop the SA_INTERRUPT flag and replace it with the
__sti where necessary as originally planned.

Comments?

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Oct 07 2000 - 21:00:07 EST