Re: 2.2.14 SMP 3com905: transmit timed out: Odd lost irq and ip-stack lockup

From: Maciej W. Rozycki (macro@ds2.pg.gda.pl)
Date: Thu Oct 12 2000 - 10:00:28 EST


On Fri, 13 Oct 2000, Andrew Morton wrote:

> > Oct 9 17:29:02 fwintern kernel: eth0: Interrupt posted but not
> > delivered -- IRQ blocked by another device?
>
> This is the infamous APIC bug. I have about ten reports of this over a
> four-month period. Mark Hemment mentioned it just yesterday.
>
> This is not a 3c59x problem. It is due to the APIC forgetting how to
> generate interrupts for a particular IRQ. It happens mostly for NICs
> because they generate a lot of interrupts. I've had it happen just
> once. In that case, _nothing_ would make the interrupt come back
> (including a driver unload/reload).
>
> This gets reported a lot by 3c59x users because this driver specifically
> detects and reports on the problem.

 Hmm, that's interesting. It would be worthwhile to see a dump of APICs'
state when this happens -- maybe an EOI message gets lost for some reason
or an erratum is biting us. There are functions for such kind of
diagnostics already available; they are print_IO_APIC() and
print_all_local_APICs() and may be called on demand by a tiny module, for
example.

  Maciej

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Oct 15 2000 - 21:00:23 EST