Re: [OT] Linux stability despite unstable hardware

From: Bryan Andersen
Date: Sat May 22 2004 - 12:30:03 EST


Timothy Miller wrote:
I have had some issues recently with memory errors when using aggressive memory timings. Although memory tests pass fine, gcc would tend to crash and would generate incorrect code when compiling other things. Gcc couldn't even build itself properly under those conditions.

The really interesting thing is that the Linux kernel was totally unaffected. Compiling the Linux kernel is often thought of as a stressful thing for a system, yet compiling a kernel with a broken gcc on a system with intermittent memory errors goes through error free, and the kernel is 100% stable when running.

But until the memory errors were fixed, things like KDE wouldn't build without gcc crashing.

So, what is it about Linux that makes it build properly with a broken GCC and run perfectly despite memory errors?

It could just be heat buildup in a critical area under sustained heavy load. It may take a while for enough heat to build up to cause problems. I just recently found one of these: it took 4-6 hours of intensive processing before an error would happen. I placed a fan pointing at the motherboard chipset and memory to keep them cooler, and the problem seems to have gone away.

For testing I wrote a script that kept compiling the kernel again and again in a while(true) loop, effectively a repeat-until-crash loop. For each compile it saved the stdout/stderr output and diffed it against the first run. Any differences were flagged for checking later.
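
The original script isn't included here, so what follows is only a rough sketch of the same idea in Python, not the script itself. The names repeat_until_mismatch, run_once, log_dir and max_runs are made up for illustration. Unlike the loop described above, this version stops at the first run whose output differs rather than carrying on until a crash.

```python
import difflib
import itertools
import subprocess
from pathlib import Path


def run_once(cmd, cwd=None):
    """Run cmd and return its combined stdout/stderr as text."""
    result = subprocess.run(
        cmd,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into stdout
        text=True,
        errors="replace",
    )
    return result.stdout


def repeat_until_mismatch(cmd, log_dir, cwd=None, max_runs=None):
    """Rerun cmd, comparing each run's output against the first run.

    Returns the number of the first run whose output differs, or None if
    max_runs is reached without a difference. max_runs=None loops forever.
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    baseline = None
    runs = itertools.count(1) if max_runs is None else range(1, max_runs + 1)
    for n in runs:
        output = run_once(cmd, cwd=cwd)
        # Keep every run's output so it can be inspected afterwards.
        (log_dir / f"run-{n:05d}.log").write_text(output, encoding="utf-8")
        if baseline is None:
            baseline = output  # the first run is the reference
            continue
        if output != baseline:
            diff = difflib.unified_diff(
                baseline.splitlines(keepends=True),
                output.splitlines(keepends=True),
                fromfile="run-00001.log",
                tofile=f"run-{n:05d}.log",
            )
            # Flag the run by writing the diff next to its log.
            (log_dir / f"run-{n:05d}.diff").write_text(
                "".join(diff), encoding="utf-8"
            )
            return n
    return None
```

Calling something like repeat_until_mismatch(["sh", "-c", "make clean && make"], "build-logs") from the top of a source tree would approximate the setup described; that command line is an assumption, not the one actually used. A serial build (no -j) is probably the safer choice here, since parallel make can reorder output lines between otherwise identical runs and produce false differences.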

- Bryan