Suggestion: how to deal with unloggable kernel death.

D.A.B. Niggemann (dabn100@hermes.cam.ac.uk)
Sun, 7 Apr 1996 17:16:06 +0100 (BST)


Hi,
Here's a suggestion I cooked up in my spare time that might help debug
some of the more hideous lockups that happen in Linux (but I don't
have the time to implement this myself).
Basically, the problems that persist longest are
not-easily-reproducible lockups or Oopses occurring on heavily-loaded
or long-uptime machines, of the sort used for server applications on the
Net.
Now one of the major problems with these errors is: if the Oops or
whatever is _big_, i.e. more than one screenful, it scrolls off the console.
So what- it goes into the logs.... Nope- if it's a big enough problem,
the disk subsystem will be hosed, or maybe sys(k)logd will never see it
because the kernel has crashed enough to no longer schedule (panic or worse).
In this case, redirecting the errors to another machine is useless too,
especially if the crash has wrecked the networking subsystem.
The fact that X + disk might crash as well doesn't help: we can't get
back to the console then anyway.
So we need something that will persistently display console messages even
if most of the kernel has undergone meltdown...
The existence of the serial console patches has given me an idea in this
direction. This driver would be fine for logging to a terminal (but how
many people have them? And that won't help if we have a message longer
than the terminal's screen. Also, since it uses serial ports, if it operates
using interrupts we are again stuffed if the interrupt handler has been
nuked. I think it uses polled access though...)
So here's the suggestion:
A polled-mode _parallel_ console driver that does not attempt to schedule
in the polling loop (and does minimal error checking, i.e. 'throw away the
character if we can't write it')!
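
To make the idea concrete, here is a rough user-space sketch of the
polling loop, not a real driver. It assumes the usual PC parallel port
convention that status bit 7 reads high when the printer can take a
byte. The struct pport callbacks, the PP_MAX_SPINS retry budget and the
fake_printer used to exercise it are all made up for illustration; a
real driver would do port I/O where the callbacks are.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical port abstraction so the sketch runs without hardware. */
struct pport {
    uint8_t (*read_status)(void *ctx);           /* read status register */
    void    (*write_data)(void *ctx, uint8_t b); /* latch a data byte    */
    void    (*pulse_strobe)(void *ctx);          /* tell printer to read */
    void    *ctx;
};

#define PP_STATUS_NOT_BUSY 0x80 /* status bit 7: high when printer is ready */
#define PP_MAX_SPINS       1000 /* assumed retry budget, tune to taste      */

/* Busy-wait for the printer; never sleeps, never schedules.
 * Returns 1 if the byte was sent, 0 if it was thrown away. */
int pp_putc(const struct pport *port, uint8_t c)
{
    assert(port != NULL);
    for (unsigned spins = 0; spins < PP_MAX_SPINS; spins++) {
        if (port->read_status(port->ctx) & PP_STATUS_NOT_BUSY) {
            port->write_data(port->ctx, c);
            port->pulse_strobe(port->ctx);
            return 1;
        }
    }
    return 0; /* printer stayed busy: drop the character and move on */
}

/* Write a whole message; returns how many bytes actually went out. */
size_t pp_write(const struct pport *port, const char *msg, size_t len)
{
    size_t written = 0;
    for (size_t i = 0; i < len; i++)
        written += (size_t)pp_putc(port, (uint8_t)msg[i]);
    return written;
}

/* Simulated printer: reports busy for busy_polls status reads, then ready. */
struct fake_printer {
    unsigned busy_polls;
    uint8_t  latch;
    char     out[128];
    size_t   out_len;
};

uint8_t fake_status(void *ctx)
{
    struct fake_printer *p = ctx;
    if (p->busy_polls > 0) {
        p->busy_polls--;
        return 0x00;
    }
    return PP_STATUS_NOT_BUSY;
}

void fake_data(void *ctx, uint8_t b)
{
    struct fake_printer *p = ctx;
    p->latch = b;
}

void fake_strobe(void *ctx)
{
    struct fake_printer *p = ctx;
    if (p->out_len < sizeof p->out)
        p->out[p->out_len++] = (char)p->latch;
}
```

The point of the bounded loop is that a dead or offline printer costs
each character at most PP_MAX_SPINS status reads and nothing more, so
a wedged printer can't wedge the (already dying) kernel any further.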
Result: just hook an el-cheapo dot-matrix printer to your box, load it
with a few 1000 sheets of fanfold, and wait for that next devious crash to
roll in. Obviously won't catch those crashes which won't generate any
messages at all, but otherwise, fairly foolproof.
Only disadvantages: can't coexist with the normal lp driver on the same port.
Shaves a bit of performance off printks...
Needs a 'dumb' enough printer to work (i.e. one you can cat plain ASCII to
and expect it to print that- apart from needing an LF at the end and LFs
inserted after 80 chars, HP-style page printers would be fine too).
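
The 'LF after 80 chars' part could be a tiny filter in front of the
output routine. A minimal sketch, assuming an 80-column printer that
does not wrap by itself; the name wrap_columns and its interface are
invented for the example, and whether a given printer also wants a CR
is left open here.

```c
#include <assert.h>
#include <stddef.h>

#define PRINTER_COLUMNS 80 /* assumed line width of the printer */

/* Copy in[0..in_len) to out, inserting an LF before any character that
 * would land past column 80. *col carries the current column between
 * calls. Returns the number of bytes stored; stops early if out is full. */
size_t wrap_columns(const char *in, size_t in_len,
                    char *out, size_t out_cap, unsigned *col)
{
    size_t n = 0;

    assert(in != NULL && out != NULL && col != NULL);
    for (size_t i = 0; i < in_len; i++) {
        char c = in[i];

        /* Line is full and the next character is not itself an LF. */
        if (c != '\n' && *col == PRINTER_COLUMNS) {
            if (n == out_cap)
                break;
            out[n++] = '\n';
            *col = 0;
        }
        if (n == out_cap)
            break;
        out[n++] = c;
        *col = (c == '\n') ? 0 : *col + 1;
    }
    return n;
}
```

Keeping the column counter outside the function means successive
printk-style fragments can be fed through one after another without
losing track of where the print head is.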
Anyone buy it?
Dirk
__________________________________________________________

| / _ \ _ _| __ \ Dirk Niggemann
' / | | | | | Jesus College
. \ __ < | | | Cambridge, CB58BL
_|\_\_| \_\___|____/ dabn100@cam.ac.uk
__________________________________________________________