Re: A true story of a crash.

Rik van Riel (H.H.vanRiel@phys.uu.nl)
Sun, 16 Aug 1998 10:19:33 +0200 (CEST)


On Sun, 16 Aug 1998, Albert D. Cahalan wrote:

> How about the following steps, in order?
>
> 1. optional SIGWARN, or whatever it is AIX sends

Sounds good. Number-crunching folks will want to code in
a handler for this.
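
A minimal sketch of such a handler, for illustration only. Linux has no
SIGWARN today, so SIGUSR1 stands in for it here; `crunch` and `checkpoint`
are hypothetical names, not anything from an existing program. The handler
only sets a flag, and the main loop does the actual saving:

```python
import os
import signal

# Assumption: SIGWARN does not exist on Linux, so SIGUSR1 stands in for it.
SIGWARN = signal.SIGUSR1

low_memory = False  # set by the handler, polled by the main loop


def on_sigwarn(signum, frame):
    """Only record the warning; the real work happens outside the handler."""
    global low_memory
    low_memory = True


def crunch(steps, checkpoint):
    """Hypothetical number-crunching loop that checkpoints when warned."""
    global low_memory
    total = 0
    for i in range(steps):
        total += i * i
        if low_memory:
            checkpoint(i, total)  # save state so a later kill loses little
            low_memory = False
    return total


signal.signal(SIGWARN, on_sigwarn)
```

Keeping the handler down to one assignment avoids doing I/O or allocation
at the very moment the system is short of memory.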

> 2. optional removal from scheduler

What good is this? Removing it from the run queue doesn't do
us _any_ good; besides, it makes killing the process a bit
harder.

> 3. optional (highly recommended) process killer
> 4. mandatory oops, kill everything, and reboot

Make that some sort of panic, but slightly better:
- SIGTERM everything
- loop 3 times to be sure
- try syslog :)
- SIGKILL everything
- try to sync disks
- umount what can be umounted
- reboot
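
The ordering above could be sketched as a dry run like this. Every argument
to `emergency_shutdown` is a hypothetical hook supplied by the caller;
nothing here is real kernel code, and the log message is made up. Each step
except the reboot is best-effort, so a failing syslog or sync does not stop
the sequence:

```python
import signal


def attempt(step, *args):
    """Run one best-effort step; report failure instead of raising."""
    try:
        step(*args)
        return True
    except Exception:
        return False


def emergency_shutdown(signal_all, log, sync, umount_all, reboot,
                       term_rounds=3):
    """Dry-run sketch of the sequence; all arguments are hypothetical hooks."""
    for _ in range(term_rounds):               # SIGTERM everything, 3 times
        attempt(signal_all, signal.SIGTERM)
    attempt(log, "out of memory, going down")  # try syslog
    attempt(signal_all, signal.SIGKILL)        # SIGKILL everything
    attempt(sync)                              # try to sync disks
    attempt(umount_all)                        # umount what can be umounted
    reboot()                                   # the one step that must happen
```

Passing the actions in as callables lets the ordering be exercised with
harmless stand-ins before anything that really signals or reboots is wired in.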

> For the SIGWARN, glibc could return freed memory to the OS.
> (would need to be disabled for init and other critical things)
>
> The process killer alone is simple though, and already written.

The code for determining when we're out of memory is almost
ready too. Claus Fischer made up some heuristics and I am
trying to find out the gory details of how to find the number
of [insert page type here] the system has.
(swap cache?, buffermem, free pages, cached pages that can be
freed, pagetable cache?; the entries marked with ? seem to be
architecture-dependent and maybe not even useful)
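
To make the accounting concrete, here is a rough userspace sketch. It is
not Claus Fischer's heuristic: the field names are the ones a present-day
/proc/meminfo reports, the choice of pools and the `min_kb` threshold are
assumptions, and the entries marked with ? above are left out. It also
treats all of "Cached" as freeable, which overestimates:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style lines into a dict of integer values."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


# Assumption: free pages, buffermem and cached pages count as available.
POOLS = ("MemFree", "Buffers", "Cached")


def available_kb(info):
    """Sum the pools assumed to be free or freeable, in kB."""
    return sum(info.get(name, 0) for name in POOLS)


def out_of_memory(info, min_kb=1024):
    """True when the assumed-available total drops below min_kb."""
    return available_kb(info) < min_kb
```

On a Linux box this could be fed with
`parse_meminfo(open("/proc/meminfo").read())`.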

Rik.
+-------------------------------------------------------------------+
| Linux memory management tour guide. H.H.vanRiel@phys.uu.nl |
| Scouting Vries cubscout leader. http://www.phys.uu.nl/~riel/ |
+-------------------------------------------------------------------+
