Re: Panic on OOM

From: David Rientjes
Date: Tue Jun 14 2011 - 20:13:16 EST


On Tue, 14 Jun 2011, Chris Fowler wrote:

> I'm running into a problem in 2.6.38 where the kernel is not doing what
> I'm expecting it to do. I'm guessing that some things have changed and
> that is what is going on.
>
> First, the tuning at boot:
>
> f.open("/proc/sys/vm/panic_on_oom", std::ios::out);
> f << "1";
> f.close();
>
> f.open("/proc/sys/kernel/panic", std::ios::out);
> f << "10";
> f.close();
>

Hmm, you don't check that the writes to the sysctls actually succeed?

Using /proc/sys/vm/panic_on_oom also won't panic the machine if you happen
to use a cpuset or mempolicy. You'll want to write '2' instead if you
want to panic in all possible oom conditions.
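
Something along these lines would at least tell you whether each write
took effect. It is only a rough sketch: write_sysctl is a helper name I
made up for illustration, not anything from your code or from the kernel,
and it assumes the process has permission to write under /proc/sys
(normally root).

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: write `value` to the file at `path` and report
// whether the write actually reached the file.
// Returns false if the file could not be opened or the write/flush failed.
bool write_sysctl(const std::string& path, const std::string& value) {
    std::ofstream f(path);
    if (!f.is_open()) {
        return false;   // e.g. missing file or insufficient permissions
    }
    f << value;
    f.close();          // close() flushes the buffer; a failure sets failbit
    return !f.fail();
}
```

At boot you would call write_sysctl("/proc/sys/vm/panic_on_oom", "2") and
write_sysctl("/proc/sys/kernel/panic", "10"), and log something if either
returns false.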

> I want the kernel to panic on out of memory. I then want it to wait 10s
> before doing a reboot.
>
> This program will consume all memory and make the box unresponsive
>
> #!/usr/bin/perl
>
> my @mem = ();
> while(1) {
> push @mem, "########################";
> }
>
> It does not take long to fill up 1G of space. There is NO swap on this
> device and never will be. I did notice that after a long period of time
> (I've not timed it) I finally do see a panic and I do see "rebooting in
> 10 seconds..." . It does not reboot.
>

Ok, it seems like the oom killer is being called correctly and respecting
your panic_on_oom setting because it is a system-wide oom condition (your
perl script wasn't bound to any cpuset or mempolicy).

So that leaves the panic() not rebooting properly when the timeout is set.
You would only see the "Rebooting in 10 seconds..." message if the write
to /proc/sys/kernel/panic succeeded, and there's this little comment in
kernel/panic.c:

/*
* This will not be a clean reboot, with everything
* shutting down. But if there is a chance of
* rebooting the system it will be rebooted.
*/

with a call to emergency_restart(). You didn't specify your architecture,
but assuming you're using x86 without a hypervisor and didn't specify a
reboot= parameter on the command line, it should succeed although there are
some hardware dependencies. Does using

reboot=force

on the command line help?

Either way, could you send your /proc/cpuinfo and .config?