2.6.7 SMP trouble?

From: Jason Gauthier
Date: Mon Jul 19 2004 - 14:17:52 EST

I found an IBM Netfinity 5600 box that was shelved a few years ago. I
spent $80 and got two processors for it (P3-667).

I put them in the box, installed Linux (Slackware), and upgraded the kernel
to 2.6.7. I then started installing my software on it: Nagios, MRTG,
Samba, and some other tools we use for network monitoring. This is going to
be an upgrade to a monitoring server we have. Well, I went home, came in
the next day, and the box was locked hard. No messages, no console output.
Just dead.

Thinking it was a fluke, I fired it up again. After several hours of
running, it died completely once more. So I figured there were two
possibilities: either software or hardware is making it die. I removed each
processor in turn and ran the box on the remaining one for over 24 hours
under HIGH stress (5+ load average). The system was running the
above-mentioned software, but just to make sure the processor got a
workout, I was also compiling code over and over. Both processors were rock
solid for the duration of the test.
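
For anyone who wants to reproduce this kind of test, the repeated-compile
loop could be scripted roughly as below. This is only a sketch, not the
exact procedure used here: the function name run_stress_loop, the log file,
and the build command are all hypothetical. Each run appends a timestamped
heartbeat line and forces it to disk, so that after a hard lock with no
console output the log at least shows when the box was last alive.

```python
import os
import subprocess
import time


def run_stress_loop(cmd, iterations, log_path):
    """Run cmd repeatedly, appending one heartbeat line per run to log_path.

    Hypothetical helper for illustration; cmd is any build command given
    as a list of arguments. Returns the number of completed runs.
    """
    completed = 0
    with open(log_path, "a") as log:
        for i in range(iterations):
            # Discard build output; only the exit status is recorded.
            result = subprocess.run(
                cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
            )
            load1 = os.getloadavg()[0]  # 1-minute load average (Unix only)
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write(
                "%s run=%d rc=%d load1=%.2f\n"
                % (stamp, i + 1, result.returncode, load1)
            )
            # Flush and fsync so the last line survives a hard lockup.
            log.flush()
            os.fsync(log.fileno())
            completed += 1
    return completed
```

A call such as run_stress_loop(["make", "-s"], 1000, "heartbeat.log") from
inside a source tree would be one way to use it; running several copies in
parallel is one way to push the load average up on an SMP box.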

I then placed both processors in the box and started the same test. It was
dead within 8 hours. I am now very suspicious of the kernel.

So, I installed 2.4.22 and ran the same tests. It went over 48 hours with
no issues. Now I'm certain it's the kernel. Can anyone confirm any known
SMP issues in 2.6.7 that might cause this?


