Re: 2.6.{30,31} x86_64 ahci problem - irq 23: nobody cared

From: Jean Delvare
Date: Wed Oct 21 2009 - 04:38:57 EST


Hi Tejun, Alexander,

On Tuesday 13 October 2009, Tejun Heo wrote:
> Alexander Huemer wrote:
> > i compiled gcc in a loop over night, 14 times. no error.
> > it really seems i2c_i801 was the cause...
> > unfortunately i still don't know how i can extract the part of the gcc
> > compilation process that causes the error on an affected kernel.
> > that would enable me to create a simple test program.
>
> Given that i2c is used for temperature monitoring, I think it is not
> triggered by any single step of the compiling but rather by the
> accumulated heat load during compilation. Let's wait for Jean to
> chime in. :-)

OK, here I am, sorry for the delay. I've read the discussion thread.
Here are the few data points I can offer, in the hope it will help:

* While the i2c-i801 driver received some changes in kernel 2.6.30,
none of these are related to PCI or interrupts. So as the problem
is new in kernel 2.6.30, the i2c-i801 driver alone is unlikely to
cause it. This may, however, be a combination of something i2c-i801
does and something the PCI subsystem does since kernel 2.6.30. For
this reason, I would still recommend a bisection if the problem can
be reliably reproduced. I know it takes time, but it is always
easier to fix a bug when we know which commit introduced it.

* The i2c-i801 driver does _not_ make use of interrupts. It is
poll-based (I am not exactly proud of that, but that's the way it
is.)

#define ENABLE_INT9 0 /* set to 0x01 to enable - untested */

So I am very surprised to read that this driver would cause an IRQ
storm.

* One thing the i2c-i801 driver does on the PCI device is:

err = pci_enable_device(dev);

I presume this is what causes the following message in dmesg:

i801_smbus 0000:00:1f.3: PCI INT B -> GSI 23 (level, low) -> IRQ 23

Basically, even though the driver doesn't make use of interrupts,
the IRQ is still registered because this is how the hardware is
set up.

In conclusion, I suspect one of two things is happening. Either the
SMBus controller is triggering interrupts even though it was told not
to; the ICH6 is a bit different from all the other supported chips, so
I'll double-check whether we missed something. Or something else is
triggering SMBus transactions; SMI and ACPI come to mind. If that is
the case, then you do not want to use i2c-i801 on this motherboard.
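
Whichever of the two it is, it may help to watch whether the counter
for IRQ 23 in /proc/interrupts keeps growing while i2c-i801 is loaded
and stops growing when it is not. The following is a rough sketch of a
helper that sums the per-CPU counters on one such line. The function
name is hypothetical, and the parsing assumes the usual layout of an
IRQ number, a colon, one counter per CPU and then the controller and
handler names.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Sum the per-CPU counters of one /proc/interrupts-style line.
 * Returns 0 and stores the total in *total if the line describes `irq`;
 * returns -1 otherwise (other IRQ, named line such as "NMI:", header).
 */
int irq_line_total(const char *line, long irq, unsigned long long *total)
{
    char *end;
    long n;
    unsigned long long sum = 0;

    assert(line != NULL && total != NULL);

    n = strtol(line, &end, 10);            /* skips leading whitespace */
    if (end == line || *end != ':' || n != irq)
        return -1;

    const char *p = end + 1;
    for (;;) {
        while (*p == ' ' || *p == '\t')
            p++;
        if (!isdigit((unsigned char)*p))
            break;                          /* reached the name columns */

        unsigned long long v = strtoull(p, &end, 10);
        if (*end != '\0' && !isspace((unsigned char)*end))
            break;                          /* e.g. "23-fasteoi", not a counter */
        sum += v;
        p = end;
    }

    *total = sum;
    return 0;
}
```

Reading /proc/interrupts twice a few seconds apart and comparing the
totals for line 23 gives a crude interrupt rate. Note that once the
kernel has printed "nobody cared" and disabled the line, the counter is
expected to stop moving, so the comparison is only meaningful before
that point.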

Questions for Alexander:

* Can I please see the output of "sensors" on your system?
* What are the brand and model of your motherboard?
* Can we get an acpidump for your system?

--
Jean Delvare
Suse L3