Re: [PATCH 2/2] Hardware error record persistent support

From: Andrew Morton
Date: Fri Nov 19 2010 - 15:02:20 EST


On Fri, 19 Nov 2010 07:52:08 -0800
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Fri, Nov 19, 2010 at 12:10 AM, Huang Ying <ying.huang@xxxxxxxxx> wrote:
> > Normally, corrected hardware error records will go through the kernel
> > processing and be logged to disk or network finally. But for
> > uncorrected errors, system may go panic directly for better error
> > containment, disk or network is not usable in this half-working
> > system. To avoid losing these valuable hardware error records, the
> > error records are saved into some kind of simple persistent storage
> > such as flash before panic, so that they can be read out after system
> > reboot successfully.
>
> I think this is totally the wrong thing to do. TOTALLY.
>
> The fact is, concentrating about "hardware errors" makes this
> something that I refuse to merge. It's such an idiotic approach that
> it's disgusting.
>
> Now, if this was designed to be a "hardware-backed persistent 'printk'
> buffer", and was explicitly meant to save not just some special
> hardware error, but catch all printk's (which may be due to hardware
> errors or oopses or warnings or whatever), that would be useful.
>
> But limiting it to just some special source of errors makes this
> pointless and not ever worth merging.
>

yep. We already have bits and pieces in place for this: kmsg_dump,
ramoops, mtdoops, etc. If your hardware has a non-volatile memory then
just hook it up as a backend driver for kmsg_dump.
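
As a rough illustration of that shape, here is a userspace model in C,
not kernel code. Every name in it (struct dumper, dumper_register,
do_dump, nvram_dump) is hypothetical and chosen only for this sketch;
the in-kernel interface is struct kmsg_dumper together with
kmsg_dump_register() in include/linux/kmsg_dump.h, and its callback
signature differs from the one below. The point is only the split: a
generic core hands the log text to whatever backends have registered,
and the non-volatile storage is just one more backend.

```c
#include <stddef.h>
#include <string.h>

/* Userspace model only: these names are hypothetical, not kernel API. */
enum dump_reason { DUMP_OOPS, DUMP_PANIC };

struct dumper {
	/* Called with the log text when the system oopses or panics. */
	void (*dump)(struct dumper *d, enum dump_reason reason,
		     const char *log, size_t len);
	struct dumper *next;
};

static struct dumper *dumpers;	/* list of registered backends */

static void dumper_register(struct dumper *d)
{
	d->next = dumpers;
	dumpers = d;
}

/* Generic core: hand the log to every registered backend. */
static void do_dump(enum dump_reason reason, const char *log, size_t len)
{
	struct dumper *d;

	for (d = dumpers; d; d = d->next)
		d->dump(d, reason, log, len);
}

/* Backend: pretend this array is the non-volatile memory. */
static char nvram[64];
static size_t nvram_used;

static void nvram_dump(struct dumper *d, enum dump_reason reason,
		       const char *log, size_t len)
{
	(void)d;
	(void)reason;

	/* If the log does not fit, keep only its newest tail. */
	if (len > sizeof(nvram)) {
		log += len - sizeof(nvram);
		len = sizeof(nvram);
	}
	memcpy(nvram, log, len);
	nvram_used = len;
}

static struct dumper nvram_dumper = { .dump = nvram_dump };
```

In this model, calling dumper_register(&nvram_dumper) once at start-up
and then do_dump(DUMP_PANIC, log, len) on the failure path copies the
log text into the backing store, whatever produced that text.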
