Re: A question concerning time outs and possible lost interrupts

Gerard Roudier (groudier@club-internet.fr)
Mon, 14 Sep 1998 00:06:16 +0200 (MET DST)


On Sun, 13 Sep 1998, Linus Torvalds wrote:

> The correct thing for a driver to do is to do one of two things:
>
> - always require the interrupt to be level-triggered. This is the "good"
> solution, but sadly there are setups out there that still use
> edge-triggered interrupts. Don't ask me why.
>
> One option might be to have a way of overriding the edge/level decision
> that the BIOS made for us. We could just decide that even though the
> BIOS told us to use an edge interrupt, the BIOS is just wrong.
>
> However, sometimes the BIOS might just be right. Certainly some really
> stupid devices require an edge interrupt simply because they don't
> de-assert their own interrupt line in any sane manner (this is true of
> the timer interrupt, for example, which is just a square-wave thing and
> thus would generate an endless stream of interrupts for 50% of the time
> if it was level-triggered - and there may be other broken hardware out
> there with the same bad behaviour)
>
> - if there are multiple status mailboxes like the above, the driver has
> to go through each and every one endlessly until it has gone one
> complete round without seeing a single event (this assumes that events
> don't go away once they are posted - and thus the "not seeing a single
> event" guarantees that at some point in time all events were quiescent
> and thus that we will get a new edge if some new event ever happens)
>
> I don't know whether the ncr driver does this already or not, but if it
> doesn't, then that may be the cause for the occasional timeouts..

The 3.0x ncr drivers guarantee not to lose interrupts only for
level-sensitive (level-triggered) interrupts. (In my opinion, with
edge-triggered interrupts the controller would need to stop after raising
an interrupt and wait for the C code to restart it in order not to lose
events.) The interrupt handler proceeds as follows:

1 - Read the Interrupt Status Register (ISTAT).
    If a completion interrupt (INTFLY) is pending:
2 - Write the ISTAT to clear the interrupt condition.
3 - Reread the ISTAT. This read ensures that any PCI posted writes
    that occurred between (1) and (2) are flushed and that the
    interrupt condition is actually cleared.
    (This may look excessive, but hopefully it is not.)
4 - Scan the completion queue.

Between (1) and (2) the controller may have written some completion
data to memory, and these transactions may be posted.
The write to the ISTAT in (2) may also be posted.
The read in (3) ensures that all of this is actually visible to the
corresponding parties by the time the C code scans the completion
queue.

So, if the C code misses a completed command because of a posted write,
the controller will raise the interrupt condition again and
the C code will catch it the next time.
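
The sequence (1)-(4) can be sketched as a small user-space simulation.
This is not the ncr driver code: struct fake_chip, its fields and all
function names are hypothetical, posted writes are modelled simply as
data that becomes visible on the next register read, and the INTF bit
value (0x04) is assumed for illustration. The 'late' argument is only a
hook to inject completions between (1) and (2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ISTAT_INTF 0x04u    /* INTFLY flag; bit value assumed here */

/* Hypothetical model of the chip plus the bridge buffers in between. */
struct fake_chip {
    uint8_t istat;          /* chip-side interrupt status */
    bool    clear_posted;   /* CPU write to ISTAT still buffered */
    int     posted_done;    /* completions written by the chip, not yet in memory */
    int     visible_done;   /* completions visible in host memory */
    int     scanned;        /* completions already consumed by the C code */
};

/* Chip side: post completion data, then raise INTFLY. */
void chip_complete(struct fake_chip *c)
{
    c->posted_done++;
    c->istat |= ISTAT_INTF;
}

/* A register read pushes all posted writes, in both directions, ahead of it. */
static uint8_t read_istat(struct fake_chip *c)
{
    if (c->clear_posted) {              /* the buffered clear reaches the chip */
        c->istat &= (uint8_t)~ISTAT_INTF;
        c->clear_posted = false;
    }
    c->visible_done += c->posted_done;  /* chip's data reaches memory */
    c->posted_done = 0;
    return c->istat;
}

/* The write that clears INTFLY may itself be posted. */
static void write_istat_clear(struct fake_chip *c)
{
    c->clear_posted = true;
}

/* Steps (1)-(4). Returns the number of completions handled. */
int isr(struct fake_chip *c, int late)
{
    int n;

    if (!(read_istat(c) & ISTAT_INTF))  /* (1) */
        return 0;
    while (late-- > 0)                  /* test hook: completions racing in */
        chip_complete(c);
    write_istat_clear(c);               /* (2) */
    (void)read_istat(c);                /* (3) flush before scanning */
    n = c->visible_done - c->scanned;   /* (4) scan the completion queue */
    c->scanned = c->visible_done;
    return n;
}
```

In this model, a completion that arrives between (1) and (2) has its
interrupt flag wiped by the clear, but its data is flushed by the read
in (3) and therefore picked up by the scan in (4). Without step (3) the
scan could run before that data is visible.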

Sometimes I think the following, without being quite sure:
chipsets generally flush their buffers before delivering interrupts.
This allows PCI drivers that do not take care of posted writes to
work most of the time. But when such devices share interrupts,
or when the C code is late handling interrupts so that several
completions have been signaled by the time the ISR is entered, such
drivers may lose completion events.

Regards,
Gerard.
