Re: [patch 4/7 -mm] oom: badness heuristic rewrite

From: David Rientjes
Date: Thu Feb 11 2010 - 16:51:50 EST


On Thu, 11 Feb 2010, Andrew Morton wrote:

> > Changing any value that may have a tendency to be hardcoded elsewhere is
> > always controversial, but I think the nature of /proc/pid/oom_adj allows
> > us to do so for two specific reasons:
> >
> > - hardcoded values tend not to fall within a range, they tend to either
> > always prefer a certain task for oom kill first or disable oom killing
> > entirely. The current implementation uses this as a bitshift on a
> > seemingly unpredictable and unscientific heuristic that is very
> > difficult to predict at runtime. This means that fewer and fewer
> > applications would hardcode a value of '8', for example, because its
> > semantics depends entirely on RAM capacity of the system to begin with
> > since badness() scores are only useful when used in comparison with
> > other tasks.
>
> You'd be amazed what dumb things applications do. Get thee to
> http://google.com/codesearch?hl=en&lr=&q=[^a-z]oom_adj[^a-z]&sbtn=Search
> and start reading. All 641 matches ;)
>
> Here's one which writes -16:
> http://google.com/codesearch/p?hl=en#eN5TNOm7KtI/trunk/wlan/vendor/asus/eeepc/init.rc&q=[^a-z]oom_adj[^a-z]&sa=N&cd=70&ct=rc
>
> Let's not change the ABI please.
>

Sigh, this is going to require the amount of system memory to be
partitioned into OOM_ADJUST_MAX (15) chunks, and that's going to be the
granularity at which we'll be able to either bias or discount the memory
usage of individual tasks. Instead of being able to do this with 0.1%
granularity, we'll now be limited to 100 / 15, or ~7%. That's ~9GB on my
128GB system just because this was originally a bitshift. The upside is
that it's now linear and not exponential.
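
To make the arithmetic above concrete, here is a rough sketch in Python
(not kernel code). The function names are made up for illustration; only
OOM_ADJUST_MAX = 15 and the 128GB figure come from this message, and the
shift-based helper is just an assumption about what "used as a bitshift"
looks like when applied to a score.

```python
# Illustrative only: hypothetical helpers, not the kernel's implementation.

OOM_ADJUST_MAX = 15  # maximum oom_adj value, as stated in the message


def linear_step_percent(steps=OOM_ADJUST_MAX):
    """Percent of system memory represented by one oom_adj step."""
    return 100.0 / steps


def linear_step_bytes(total_ram_bytes, steps=OOM_ADJUST_MAX):
    """Bytes of memory represented by one oom_adj step (rounded down)."""
    return total_ram_bytes // steps


def shifted_score(score, oom_adj):
    """Assumed shape of a bitshift adjustment: each step doubles or halves."""
    if oom_adj >= 0:
        return score << oom_adj
    return score >> -oom_adj


if __name__ == "__main__":
    ram = 128 * 2**30  # 128GB system, taken as 128 * 2^30 bytes
    print(f"one step = {linear_step_percent():.2f}% of RAM")
    print(f"one step = {linear_step_bytes(ram) / 1e9:.2f} GB on this system")
    # Exponential growth of the shift, for comparison with the linear steps.
    print("shift by 1..4:", [shifted_score(1000, n) for n in range(1, 5)])
```

The linear helpers show why each of the 15 steps is worth roughly 7% of
RAM, while the shift helper shows each step doubling the previous one.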