[PATCH] Fix race between oom kill and task exit

From: Ma, Xindong
Date: Thu Nov 28 2013 - 00:09:49 EST


From: Leon Ma <xindong.ma@xxxxxxxxx>
Date: Thu, 28 Nov 2013 12:46:09 +0800
Subject: [PATCH] Fix race between oom kill and task exit

There is a race between the OOM killer and task exit. The scenario is:
TASK A                                  TASK B
TASK B is selected to oom kill
in oom_kill_process()
check PF_EXITING of TASK B
                                        task call do_exit()
                                        task set PF_EXITING flag
                                        write_lock_irq(&tasklist_lock);
                                        remove TASK B from thread group in __unhash_process()
                                        write_unlock_irq(&tasklist_lock);
read_lock(&tasklist_lock);
traverse threads of TASK B
read_unlock(&tasklist_lock);

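After that, the following traversal of the threads of TASK B never terminates, because TASK B is no longer in the thread group and the loop cursor can never get back to it:
	do {
		....
	} while_each_thread(p, t);

Fix this by moving the PF_EXITING check inside the read_lock(&tasklist_lock) section, so TASK B cannot be removed from the thread group between the check and the traversal.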
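
The non-terminating traversal described above can be reproduced in user space with a toy circular list. This is only a sketch, not kernel code: struct toy_task, toy_link_ring(), toy_unhash() and toy_walk() are hypothetical names invented for illustration, and the step limit exists only so the demonstration returns instead of spinning. It assumes the removal leaves the removed node's next pointer aimed into the ring, as an RCU-style list delete does.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for task_struct: only the circular thread-list links. */
struct toy_task {
	struct toy_task *next;
	struct toy_task *prev;
};

/* Link n nodes into one ring, like the threads of a single process. */
static void toy_link_ring(struct toy_task *tasks, size_t n)
{
	assert(n > 0);
	for (size_t i = 0; i < n; i++) {
		tasks[i].next = &tasks[(i + 1) % n];
		tasks[i].prev = &tasks[(i + n - 1) % n];
	}
}

/*
 * Unlink t from the ring. The neighbours now skip t, but t->next is
 * left pointing into the ring (assumption: RCU-style removal).
 */
static void toy_unhash(struct toy_task *t)
{
	t->prev->next = t->next;
	t->next->prev = t->prev;
}

/*
 * Walk from p until the cursor comes back to p, in the shape of a
 * do { } while_each_thread(p, t) loop. Returns the number of steps
 * taken, or -1 if max_steps is exceeded (the loop would never end).
 */
static long toy_walk(struct toy_task *p, long max_steps)
{
	struct toy_task *t = p;
	long steps = 0;

	do {
		if (++steps > max_steps)
			return -1;
		t = t->next;
	} while (t != p);

	return steps;
}
```

With a ring of three tasks, toy_walk() from the first task finishes after three steps. Once toy_unhash() has been called on that same task, a walk starting from it only ever cycles through the two remaining tasks and hits the step limit.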

Signed-off-by: Leon Ma <xindong.ma@xxxxxxxxx>
Signed-off-by: xiaobing tu <xiaobing.tu@xxxxxxxxx>
---
mm/oom_kill.c | 20 ++++++++++----------
1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 1e4a600..32ec88d 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -412,16 +412,6 @@ void oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
 	static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
 					      DEFAULT_RATELIMIT_BURST);
 
-	/*
-	 * If the task is already exiting, don't alarm the sysadmin or kill
-	 * its children or threads, just set TIF_MEMDIE so it can die quickly
-	 */
-	if (p->flags & PF_EXITING) {
-		set_tsk_thread_flag(p, TIF_MEMDIE);
-		put_task_struct(p);
-		return;
-	}
-
 	if (__ratelimit(&oom_rs))
 		dump_header(p, gfp_mask, order, memcg, nodemask);
 
@@ -437,6 +427,16 @@ void oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
 	 * still freeing memory.
 	 */
 	read_lock(&tasklist_lock);
+	/*
+	 * If the task is already exiting, don't alarm the sysadmin or kill
+	 * its children or threads, just set TIF_MEMDIE so it can die quickly
+	 */
+	if (p->flags & PF_EXITING) {
+		set_tsk_thread_flag(p, TIF_MEMDIE);
+		put_task_struct(p);
+		read_unlock(&tasklist_lock);
+		return;
+	}
 	do {
 		list_for_each_entry(child, &t->children, sibling) {
 			unsigned int child_points;
--
1.7.4.1
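
The effect of the reordering in the patch above can be illustrated with a small single-threaded model. This is a sketch under stated assumptions, not the kernel's locking: struct toy_state, toy_exit_step() and the two toy_oom_*() functions are hypothetical names, the reader count stands in for read_lock(&tasklist_lock), and a writer that would block in the kernel is modelled as simply not unhashing while a reader is present. The race flag means "the exit path runs right after the PF_EXITING-style check".

```c
#include <assert.h>
#include <stdbool.h>

struct toy_state {
	bool exiting;	/* stands in for PF_EXITING */
	bool hashed;	/* task is still linked in its thread group */
	int readers;	/* read-side holders of the toy tasklist lock */
};

enum toy_result { TOY_SKIPPED, TOY_WALK_OK, TOY_WALK_STUCK };

/*
 * Exit path: set the exiting flag, then unhash. The unhash needs the
 * write side of the lock, so it cannot happen while a reader holds it.
 */
static void toy_exit_step(struct toy_state *s)
{
	s->exiting = true;
	if (s->readers == 0)
		s->hashed = false;
}

/* Old order: check the flag first, take the read lock afterwards. */
static enum toy_result toy_oom_check_then_lock(struct toy_state *s, bool race)
{
	enum toy_result r;

	if (s->exiting)
		return TOY_SKIPPED;
	if (race)
		toy_exit_step(s);	/* exit slips in before the lock */

	s->readers++;
	r = s->hashed ? TOY_WALK_OK : TOY_WALK_STUCK;
	assert(s->readers > 0);
	s->readers--;
	return r;
}

/* New order: take the read lock first, check the flag under it. */
static enum toy_result toy_oom_lock_then_check(struct toy_state *s, bool race)
{
	enum toy_result r;

	s->readers++;
	if (s->exiting) {
		r = TOY_SKIPPED;
	} else {
		if (race)
			toy_exit_step(s);	/* flag is set, unhash must wait */
		r = s->hashed ? TOY_WALK_OK : TOY_WALK_STUCK;
	}
	assert(s->readers > 0);
	s->readers--;
	return r;
}
```

In this model the old ordering reaches the traversal with an unhashed task when the exit path interleaves, while the new ordering either skips a task that is already exiting or walks a task that is guaranteed to stay linked until the read lock is dropped.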
