Re: Linux & ECC memory

Rob Hagopian (
Sat, 16 Nov 1996 20:34:53 -0500

>From asm/i386/kernel/traps.c (w/ the indents shrunk to fit in 80 cols):
>asmlinkage void do_nmi(struct pt_regs * regs, long error_code)
> smp_flush_tlb_rcv();
> printk("Uhhuh. NMI received. Dazed and confused, but trying to continue\n");
> printk("You probably have a hardware problem with your RAM chips or a\n");
> printk("power saving mode enabled.\n");

1) Couldn't this code determine the active process and remove the active
memory pages from use? (although you'd definitely want this as an option...
I see that some powersaving modes also generate an NMI!)

2) This, of course, wouldn't be good if it happened in the kernel space, at
which point it should panic.
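
To make (1) and (2) concrete, here is a rough sketch of the decision I have in mind. Every name in it (nmi_action, nmi_policy, recovery_enabled) is made up for illustration and is not anything in traps.c; actually finding and retiring the affected pages is the hard part and is not shown.

```c
#include <stdbool.h>

/* Hypothetical names throughout; none of this exists in traps.c. */
enum nmi_action {
    NMI_LOG_ONLY,      /* option off: print the message and continue, as now */
    NMI_RETIRE_PAGES,  /* hit user space: stop the task, stop using its pages */
    NMI_PANIC          /* hit kernel space: nothing can be trusted any more */
};

/*
 * Pick a reaction to an NMI.
 *   recovery_enabled - the opt-in switch; leave it off on machines whose
 *                      power saving modes also raise NMIs
 *   in_user_mode     - true if the interrupted context was user space
 */
enum nmi_action nmi_policy(bool recovery_enabled, bool in_user_mode)
{
    if (!recovery_enabled)
        return NMI_LOG_ONLY;
    return in_user_mode ? NMI_RETIRE_PAGES : NMI_PANIC;
}
```

With the option off the behaviour stays exactly what the quoted do_nmi() does today, so nobody with a power-saving BIOS gets surprised.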

3) On that note, we all know and love the kernel processes (nfsiod,
bdflush), what about one that periodically did a checksum of the kernel code
region (and itself)? This wouldn't work if there's any self-modifying code
in the kernel though... :-( A damaged kernel could conceivably be worse
than a panic.
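
For (3), something along these lines is what I'm picturing, written as ordinary user-space C so it can be tried outside the kernel. The names (region_guard, guard_init, guard_check) are hypothetical, and the rotate-and-XOR checksum is chosen only because it is short; it is an assumption of this sketch, not a recommendation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate-and-XOR checksum over a byte region (illustrative choice only). */
uint32_t region_checksum(const unsigned char *start, size_t len)
{
    uint32_t sum = 0;

    assert(start != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        sum = (sum << 1) | (sum >> 31);  /* rotate left by one bit */
        sum ^= start[i];
    }
    return sum;
}

/* Hypothetical bookkeeping for one watched region. */
struct region_guard {
    const unsigned char *start;
    size_t len;
    uint32_t baseline;  /* checksum taken when the region was known good */
};

/* Record the baseline checksum for the region. */
void guard_init(struct region_guard *g, const unsigned char *start, size_t len)
{
    g->start = start;
    g->len = len;
    g->baseline = region_checksum(start, len);
}

/* Returns 1 if the region still matches its baseline, 0 if it changed. */
int guard_check(const struct region_guard *g)
{
    return region_checksum(g->start, g->len) == g->baseline;
}
```

The periodic process would call guard_check() on the code region and panic when it returns 0. Note that any legitimate self-modification of the region would also make guard_check() return 0, which is exactly the problem mentioned above.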

Of course, given how rarely this occurs, I don't know how much
effort and kernel bloat should be dedicated to it.
-Rob H.