Re: bug in memcg oom-killer results in a hung syscall in another process in the same cgroup

From: Oleg Nesterov
Date: Tue Jul 12 2016 - 12:52:01 EST


On 07/12, Konstantin Khlebnikov wrote:
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2808,8 +2808,9 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
>  	balance_callback(rq);
>  	preempt_enable();
>
> -	if (current->set_child_tid)
> -		put_user(task_pid_vnr(current), current->set_child_tid);
> +	if (current->set_child_tid &&
> +	    put_user(task_pid_vnr(current), current->set_child_tid))
> +		force_sig(SIGSEGV, current);
>  }
>
> Add Oleg into CC. IIRR he had some ideas how to fix this. =)

Heh. OK, OK, thank you Konstantin ;)

I'll try to recall tomorrow, but iirc I only have some ideas of how
we can happily blame the FAULT_FLAG_USER logic.

And, in this particular case, perhaps glibc/set_child_tid too, because
(again, iirc) it would be nice to simply kill it; it is only used for
some sanity checks...

Oleg.