Re: Misleading OOM messages

From: Dave Hansen
Date: Thu May 14 2009 - 16:39:08 EST


On Thu, 2009-05-14 at 15:46 -0400, Christoph Lameter wrote:
> On Thu, 14 May 2009, Pavel Machek wrote:
> > It can be 'low on memory' if you play with mlock() a bit.
>
> But that is a reclaim failure because of mlocking pages.
>
> > It is out of memory if you run out of swap (or have no swap to begin with).
>
> That is a swap config issue.

The other thing that I find confusing myself is that we're almost never
at '0 pages free' (which is what I intuitively expect) when we OOM.
We're just under the watermarks and apparently not making any progress
reclaiming. But I don't think we want to say "under the watermarks" in
our error message.
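
To make that distinction concrete, here's a toy userspace sketch. The
names (ZoneState, would_oom) and the numbers are made up for
illustration, and this is not the allocator's real logic. It only shows
that "below the watermark and reclaim is stalled" can be true while
plenty of pages are still free.

```python
from dataclasses import dataclass


@dataclass
class ZoneState:
    """Hypothetical snapshot of one memory zone (not a kernel structure)."""
    free_pages: int           # pages currently free in the zone
    min_watermark: int        # assumed threshold below which allocations stall
    reclaimed_last_pass: int  # pages the last reclaim pass managed to free


def would_oom(zone: ZoneState) -> bool:
    """Toy rule: OOM when under the watermark AND reclaim made no progress."""
    below_watermark = zone.free_pages < zone.min_watermark
    reclaim_stalled = zone.reclaimed_last_pass == 0
    return below_watermark and reclaim_stalled


# 120 pages are still free, yet this toy rule would declare OOM.
stuck = ZoneState(free_pages=120, min_watermark=256, reclaimed_last_pass=0)
print(would_oom(stuck))
```

Note that free_pages never has to reach zero for the rule to fire, which
is exactly why "out of memory" reads oddly next to the free-page counts.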

> > I believe message is often correct. What message would you suggest?
>
> "Failure to reclaim memory"

The problem I have with that is that it also doesn't tell the whole
story. It's the end symptom *just* before we OOM, but it doesn't
characterize the whole thing very well. It's like saying the Titanic
sank because "too much water onboard." :) It's true, but it
concentrates a bit too much on the end state.

To me, it's a question of how much information we can get out in a line
or two on the console. Is something like this better?

"Unable to satisfy memory allocation request and not making
progress reclaiming from other sources."

We can't exactly go spitting out an entire tutorial in dmesg, but could
we stick a short URL in there? Like http://linux-mm.org/OOM perhaps?
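
As a rough check on whether that fits in a line or two, here's a small
userspace sketch that wraps the proposed text to a console-ish width and
tacks the URL on the end. The function name, the 72-column width, and
the "See ..." wording are my assumptions, and the URL is only the one
floated above; nothing here is an existing kernel interface.

```python
import textwrap

# URL suggested in this mail; whether such a page exists is an open question.
OOM_HELP_URL = "http://linux-mm.org/OOM"

PROPOSED_TEXT = (
    "Unable to satisfy memory allocation request and not making "
    "progress reclaiming from other sources."
)


def format_oom_message(width=72):
    """Wrap the proposed message to `width` columns and append a pointer line."""
    lines = textwrap.wrap(PROPOSED_TEXT, width=width)
    lines.append("See " + OOM_HELP_URL)
    return lines


for line in format_oom_message():
    print(line)
```

At 72 columns the text takes two lines plus one for the URL, so it stays
short enough for the console.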

-- Dave
