[3.1-rc4] Bus Fatal Error caused by "PCI: Set PCI-E Max Payload Size on fabric"

From: Simon Kirby
Date: Tue Sep 06 2011 - 13:36:16 EST


Hello!

Since trying 3.1-rc4 on a few Dell servers, all of them have booted up
with the amber error LED lit. "ipmitool sel list" shows:

1 | 09/06/2011 | 17:21:56 | Event Logging Disabled #0x72 | Log area reset/cleared | Asserted
2 | 09/06/2011 | 17:25:38 | Critical Interrupt #0x18 | Bus Fatal Error | Asserted
3 | 09/06/2011 | 17:25:38 | Unknown #0x1a |
4 | 09/06/2011 | 17:25:38 | Unknown #0x1a |
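
A minimal sketch for picking the Bus Fatal Error entries out of output like the above when checking several machines. It assumes the pipe-separated layout shown here (id, date, time, sensor, description, state); the function name find_bus_fatal_errors is hypothetical and not part of ipmitool. Truncated rows such as entries 3 and 4 are simply skipped.

```python
def find_bus_fatal_errors(sel_text):
    """Return SEL entries whose description field is 'Bus Fatal Error'."""
    hits = []
    for line in sel_text.splitlines():
        # Assumed layout: id | date | time | sensor | description | state
        fields = [f.strip() for f in line.split("|")]
        if len(fields) >= 5 and fields[4] == "Bus Fatal Error":
            hits.append({
                "id": fields[0],
                "date": fields[1],
                "time": fields[2],
                "sensor": fields[3],
            })
    return hits


# Sample input copied from the "ipmitool sel list" output in this message.
SAMPLE = """\
1 | 09/06/2011 | 17:21:56 | Event Logging Disabled #0x72 | Log area reset/cleared | Asserted
2 | 09/06/2011 | 17:25:38 | Critical Interrupt #0x18 | Bus Fatal Error | Asserted
3 | 09/06/2011 | 17:25:38 | Unknown #0x1a |
4 | 09/06/2011 | 17:25:38 | Unknown #0x1a |
"""

if __name__ == "__main__":
    for hit in find_bus_fatal_errors(SAMPLE):
        print(hit)
```

The text passed in could equally be the captured output of "ipmitool sel list" from each box.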

I bisected this to:

b03e7495a862b028294f59fc87286d6d78ee7fa1 is the first bad commit
commit b03e7495a862b028294f59fc87286d6d78ee7fa1
Author: Jon Mason <mason@xxxxxxxx>
Date: Wed Jul 20 15:20:54 2011 -0500

PCI: Set PCI-E Max Payload Size on fabric

It sounds like this has caused other problems as well: http://www.spinics.net/lists/linux-scsi/msg54464.html

In this case, the 6 or so boxes I've seen the issue on are all PowerEdge 2950 servers.

Simon-