Re: machine oops under heavy I/O or network access ???

From: Eric Kamm
Date: Fri Oct 06 2006 - 16:35:40 EST


Eric Kamm wrote:
Hello,

I apologize for the long post. I tried to include as much
relevant information as possible.

Thank you,
Eric


[1.] One line summary of the problem:
machine oops under heavy I/O or network access ???

[2.] Full description of the problem/report:

I have a Supermicro X6DHE-XG2 (Intel E7520 chipset) with dual Xeon with 4GB RAM.
I am using one 3ware 8005 controller, three 3ware 9550sx controllers, and
reiserfs. This machine is acting primarily to serve NFS.

The oops below are from my original machine configuration, but I have tried other some variations and have other oops:

- I have tried Suse 10.1 kernels 2.6.16.13-4-smp and 2.6.16.13-4-default.
- I have rebuilt the system onto an identical spare hardware configuration.
- I have used 64-bit and 32-bit kernel versions.
- I have replaced all the shared reiserfs filesytems with ext3.

In all cases under load, disk IO alone (NFS server off) or disk IO+NFS operations, the system has oops and "general protection fault" errors.

The process is usually nfds or rpc.mountd, but (with the NFS server turned off) I
also have errors from, for example, cp and rsync. The system has been up for as much as 16 hours with no problems, and sometimes as little as five minutes.

(~1000 lines snipped)

Any help/pointers will be greatly appreciated.

- Did I post this to an appropriate place? If not, could you
recommend a place to report this problem?

- Is there something else you can suggest I do to further understand
the problem?

Thank you,
Eric





-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/