Re: Linux 2.6.38

From: David Rientjes
Date: Mon Mar 14 2011 - 23:13:52 EST


This kernel includes a broken commit that was merged a couple of hours
before release:

commit dc1b83ab08f1954335692cdcd499f78c94f4c42a
Author: Oleg Nesterov <oleg@xxxxxxxxxx>
Date: Mon Mar 14 20:05:30 2011 +0100

oom: oom_kill_process: fix the child_points logic

oom_kill_process() starts with victim_points == 0. This means that
(most likely) any child has more points and can be killed erroneously.

Also, "children has a different mm" doesn't match the reality, we should
check child->mm != t->mm. This check is not exactly correct if t->mm ==
NULL but this doesn't really matter, oom_kill_task() will kill them
anyway.

Note: "Kill all processes sharing p->mm" in oom_kill_task() is wrong
too.

Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>

As a result of this change, the oom killer will no longer attempt to
sacrifice a child of the selected process in favor of the parent unless
its memory usage exceeds the parent (and this will be an unreachable state
once oom-prevent-unnecessary-oom-kills-or-kernel-panics.patch is merged
from -mm).

This means systems running a webserver, for example, will kill the
webserver itself in oom conditions and not one of its threads serving a
connection; simply forking too many client connections in this scenario
would lead to an oom condition that would kill the server instead of one
of its threads.

Admins who find this behavior to cause disruptions in service should apply
the following revert.

Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>
---
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -458,10 +458,10 @@ static int oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
struct mem_cgroup *mem, nodemask_t *nodemask,
const char *message)
{
- struct task_struct *victim;
+ struct task_struct *victim = p;
struct task_struct *child;
- struct task_struct *t;
- unsigned int victim_points;
+ struct task_struct *t = p;
+ unsigned int victim_points = 0;

if (printk_ratelimit())
dump_header(p, gfp_mask, order, mem, nodemask);
@@ -487,15 +487,10 @@ static int oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
* parent. This attempts to lose the minimal amount of work done while
* still freeing memory.
*/
- victim_points = oom_badness(p, mem, nodemask, totalpages);
- victim = p;
- t = p;
do {
list_for_each_entry(child, &t->children, sibling) {
unsigned int child_points;

- if (child->mm == t->mm)
- continue;
/*
* oom_badness() returns 0 if the thread is unkillable
*/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/