Re: OOM Killer killing whole system

From: Andrew Morton
Date: Fri Jan 20 2006 - 07:10:02 EST


Anton Titov <a.titov@xxxxxxx> wrote:
>
> Yesterday I accidentally noticed a few OOM killer messages in the system log
> and left a console tailing the log overnight. At 6 in the morning the
> OOM killer went mad, generating 500 lines in the log, and 5 minutes later the
> system closed the ssh connection and became unresponsive. The guy in the
> datacenter told me that when he attached a keyboard even caps lock was not
> working. In spite of this the system still responded to ping (and only to ping).
>
> The strange thing is that this machine is relatively lightly loaded - now,
> after 6 hours of uptime, free shows:
>              total       used       free     shared    buffers     cached
> Mem:       2075468    1148564     926904          0     123472     314516
> -/+ buffers/cache:     710576    1364892
> Swap:      1004020          0    1004020
>
> Load average stays under 0.5 most of the time. At 6 in the morning there
> should be almost no load (there are no cron jobs scheduled at that time).
>
> I'm attaching messages from the log and my .config.

What kernel version? <looks in config.gz>. 2.6.15.


> Jan 15 06:05:09 vip Normal free:3700kB min:3756kB low:4692kB high:5632kB active:9964kB inactive:8532kB present:901120kB pages_scanned:19628

Pretty much all of the ZONE_NORMAL memory is AWOL.

> Jan 15 06:05:09 vip 216477 pages slab

It's all in slab. 800MB.
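
For anyone checking the numbers, here is a rough sketch of the arithmetic
behind that figure. It assumes 4 kB pages (the usual i386 page size, not
stated in the log) and uses only values quoted from the log above; the
variable names are made up for illustration.

```python
PAGE_KB = 4  # assumption: 4 kB pages, not stated in the log

slab_pages = 216477                   # "216477 pages slab"
normal_present_kb = 901120            # "present:901120kB" for the Normal zone
lru_and_free_kb = 3700 + 9964 + 8532  # free + active + inactive in that zone

slab_kb = slab_pages * PAGE_KB
slab_mib = slab_kb / 1024
slab_share = slab_kb / normal_present_kb

print(f"slab: {slab_kb} kB (about {slab_mib:.0f} MB)")
print(f"free + active + inactive in Normal: {lru_and_free_kb} kB")
print(f"slab as a fraction of Normal present: {slab_share:.0%}")
```

So free, active and inactive pages together cover only about 22 MB of the
880 MB zone, while slab alone is about the size of the whole zone. Comparing
slab against ZONE_NORMAL assumes the slab pages live in lowmem.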

I'd suspect a slab memory leak. If it happens again, please take a
copy of /proc/slabinfo and send it.
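
If you want a first look at the copy yourself, something along these lines
might help spot the cache that is growing. It is only a sketch: it assumes
the slabinfo 2.x line layout (name, five counters, then "tunables" and
"slabdata" groups) and 4 kB pages, and rank_slab_caches is a hypothetical
helper name, not an existing tool.

```python
PAGE_KB = 4  # assumption: 4 kB pages


def rank_slab_caches(text, top=10):
    """Return (cache name, kB) pairs for the largest caches in slabinfo text."""
    usage = []
    for line in text.splitlines():
        # Skip blanks, the "slabinfo - version" banner and the "# name" header.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        head, sep, tail = line.partition("slabdata")
        fields = head.split()
        slabdata = tail.split()
        if not sep or len(fields) < 6 or len(slabdata) < 2:
            continue  # not a line in the assumed layout
        pages_per_slab = int(fields[5])  # <pagesperslab>
        num_slabs = int(slabdata[1])     # <num_slabs>
        usage.append((fields[0], num_slabs * pages_per_slab * PAGE_KB))
    # Largest consumers first.
    usage.sort(key=lambda item: item[1], reverse=True)
    return usage[:top]
```

Pass it the text of a saved copy, e.g. rank_slab_caches(open("slabinfo.txt").read())
where slabinfo.txt is whatever name you saved the copy under. Two copies taken
some time apart make a growing cache much easier to see than a single snapshot.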
