RE: 2.6.15.1: persistent nasty hang in sync_page killing NFS(ne2k-pci / DP83815-related?), i686/PIII

From: Roger Heflin
Date: Mon Jan 30 2006 - 14:24:25 EST




> -----Original Message-----
> From: linux-kernel-owner@xxxxxxxxxxxxxxx
> [mailto:linux-kernel-owner@xxxxxxxxxxxxxxx] On Behalf Of
> Trond Myklebust
> Sent: Saturday, January 28, 2006 7:59 PM
> To: Nix
> Cc: linux-kernel@xxxxxxxxxxxxxxx; thockin@xxxxxxxxxx
> Subject: Re: 2.6.15.1: persistent nasty hang in sync_page
> killing NFS(ne2k-pci / DP83815-related?), i686/PIII
>
> On Sat, 2006-01-28 at 22:52 +0000, Nix wrote:
>
> > tcpdumps and the kernel's packet counters on both sides show NFS
> > packets flowing, and being retried, over and over again:
>
> Are you using an Intel motherboard? If so, it could be the IPMI bug
>
> http://blogs.sun.com/roller/page/shepler?entry=port_623_or_the_mount
>
> Cheers,
> Trond

Trond, Nix,

There is a *WORSE* bug with the Broadcom network chip based boards
around IPMI if IPMI and linux are using the same ip and mac address.

The broadcom firmware will collect all packets destined for the IPMI
port (which should be fine-except that the broadcom firmware does not
understand IP fragments and collects any fragments whose value where
the port would normally be matches the IPMI port-even though it is an
ip fragment and does not have an associated port number).

I have only so far seen it affect UDP and not TCP, but I have a
nice simple program (supplied by a customer) that will nicely duplicate
the problem (every time), I lose the same single fragment out of a
32k NFS packet each and every time.

I have reported the problem to Broadcom, the only real response out
of anyone is to use separate ip/mac address for IPMI.

Roger

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/