Re: [PATCH] oom: consider multi-threaded tasks in task_will_free_mem

From: Michal Hocko
Date: Wed Apr 13 2016 - 09:09:08 EST


On Wed 13-04-16 20:04:54, Tetsuo Handa wrote:
> On 2016/04/12 18:19, Michal Hocko wrote:
[...]
> > Hi,
> > I hope I got it right but I would really appreciate if Oleg found some
> > time and double checked after me. The fix is more cosmetic than anything
> > else but I guess it is worth it.
>
> I don't know what
>
> fatal_signal_pending() can be true because of SIGNAL_GROUP_COREDUMP so
> out_of_memory() and mem_cgroup_out_of_memory() shouldn't blindly trust it.
>
> in commit d003f371b270 is saying (how SIGNAL_GROUP_COREDUMP can make
> fatal_signal_pending() true when fatal_signal_pending() is defined as

I guess this is about zap_process() but Oleg would be the better person
to clarify. Anyway, I fail to see how this is related to this particular
patch.

[...]

> > diff --git a/include/linux/oom.h b/include/linux/oom.h
> > index 628a43242a34..b09c7dc523ff 100644
> > --- a/include/linux/oom.h
> > +++ b/include/linux/oom.h
> > @@ -102,13 +102,24 @@ extern struct task_struct *find_lock_task_mm(struct task_struct *p);
> >
> >  static inline bool task_will_free_mem(struct task_struct *task)
> >  {
> > +	struct signal_struct *sig = task->signal;
> > +
> >  	/*
> >  	 * A coredumping process may sleep for an extended period in exit_mm(),
> >  	 * so the oom killer cannot assume that the process will promptly exit
> >  	 * and release memory.
> >  	 */
> > -	return (task->flags & PF_EXITING) &&
> > -		!(task->signal->flags & SIGNAL_GROUP_COREDUMP);
> > +	if (sig->flags & SIGNAL_GROUP_COREDUMP)
> > +		return false;
> > +
> > +	if (!(task->flags & PF_EXITING))
> > +		return false;
> > +
> > +	/* Make sure that the whole thread group is going down */
> > +	if (!thread_group_empty(task) && !(sig->flags & SIGNAL_GROUP_EXIT))
> > +		return false;
>
> The whole thread group is going down does not mean we make sure that
> we will send SIGKILL to other thread groups sharing the same memory which
> is possibly holding mmap_sem for write, does it?

And the patch description doesn't say anything about processes sharing
mm. This is supposed to be a minor fix of an obviously suboptimal
behavior of task_will_free_mem. Can we stick to the proposed patch,
please?

If we really do care about processes sharing mm _that_much_ then it
should be handled in a separate patch.
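
To make the scope concrete, the check order in the quoted hunk can be
modelled in userspace roughly as below. This is only a sketch under stated
assumptions: the struct layouts, the flag values, the nr_threads field and
the thread_group_empty() stand-in are hypothetical and are not the kernel
definitions, and the final "return true" is assumed because the quoted hunk
is cut before the end of the function.

```c
#include <stdbool.h>

/* Illustrative flag values for this userspace model only. */
#define PF_EXITING            0x4u
#define SIGNAL_GROUP_EXIT     0x4u
#define SIGNAL_GROUP_COREDUMP 0x8u

/* Hypothetical stand-ins for the kernel structures. */
struct signal_struct {
	unsigned int flags;
	int nr_threads;		/* threads in the group (model only) */
};

struct task_struct {
	unsigned int flags;
	struct signal_struct *signal;
};

/* Model of thread_group_empty(): true when the task has no sibling threads. */
static bool thread_group_empty(const struct task_struct *task)
{
	return task->signal->nr_threads <= 1;
}

/* Same check order as the quoted hunk. */
static bool task_will_free_mem(const struct task_struct *task)
{
	const struct signal_struct *sig = task->signal;

	/* A coredumping process may sleep for a long time in exit_mm(). */
	if (sig->flags & SIGNAL_GROUP_COREDUMP)
		return false;

	if (!(task->flags & PF_EXITING))
		return false;

	/* Make sure that the whole thread group is going down. */
	if (!thread_group_empty(task) && !(sig->flags & SIGNAL_GROUP_EXIT))
		return false;

	return true;	/* assumed tail, not shown in the quoted hunk */
}
```

In this model an exiting thread of a two-thread group without
SIGNAL_GROUP_EXIT is not treated as about to free memory, whereas the old
one-line check would have accepted it. Nothing here looks at other
processes sharing the mm, which is the point: that is outside this patch.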
--
Michal Hocko
SUSE Labs