Re: [PATCH memcg 0/1] false global OOM triggered by memcg-limited task

From: Michal Hocko
Date: Tue Oct 19 2021 - 07:54:49 EST


On Tue 19-10-21 13:30:06, Vasily Averin wrote:
> On 19.10.2021 11:49, Michal Hocko wrote:
> > On Tue 19-10-21 09:30:18, Vasily Averin wrote:
> > [...]
> >> With my patch ("memcg: prohibit unconditional exceeding the limit of dying tasks") try_charge_memcg() can fail:
> >> a) due to fatal signal
> >> b) when mem_cgroup_oom -> mem_cgroup_out_of_memory -> out_of_memory() returns false (when select_bad_process() found nothing)
> >>
> >> To handle a) we can follow your suggestion and skip execution of out_of_memory() in pagefault_out_of_memory()
> >> To handle b) we can go to retry: if mem_cgroup_oom() returns OOM_FAILED.
>
> > How is b) possible without current being killed? Do we allow remote
> > charging?
>
> out_of_memory for memcg_oom
>  select_bad_process
>   mem_cgroup_scan_tasks
>    oom_evaluate_task
>     oom_badness
>
>         /*
>          * Do not even consider tasks which are explicitly marked oom
>          * unkillable or have been already oom reaped or they are in
>          * the middle of vfork
>          */
>         adj = (long)p->signal->oom_score_adj;
>         if (adj == OOM_SCORE_ADJ_MIN ||
>                         test_bit(MMF_OOM_SKIP, &p->mm->flags) ||
>                         in_vfork(p)) {
>                 task_unlock(p);
>                 return LONG_MIN;
>         }
>
> This time we are handling a userspace page fault, so we cannot be a kernel thread
> and cannot be in_vfork().
> However, the task can be marked as oom unkillable,
> i.e. have p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN

You are right. I am not sure there is a way out of this though. The task
can only retry forever in this case. There is nothing actionable here.
We cannot kill the task and there is no other way to release the memory.
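
For illustration only, case b) can be modelled in a few lines of userspace
C. This is a rough sketch, not kernel code: struct toy_task, toy_badness()
and toy_select_bad_process() are hypothetical names, the scoring is a
placeholder, and only the eligibility check follows the oom_badness()
snippet quoted above.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define OOM_SCORE_ADJ_MIN (-1000)

/* Hypothetical stand-in for the per-task state that oom_badness() reads. */
struct toy_task {
	long oom_score_adj;	/* models p->signal->oom_score_adj */
	bool oom_skip;		/* models MMF_OOM_SKIP in p->mm->flags */
	bool in_vfork;		/* models in_vfork(p) */
	long pages;		/* placeholder for the memory footprint */
};

/* Only the eligibility check mirrors the quoted code; the score is made up. */
static long toy_badness(const struct toy_task *p)
{
	if (p->oom_score_adj == OOM_SCORE_ADJ_MIN || p->oom_skip || p->in_vfork)
		return LONG_MIN;
	return p->pages + p->oom_score_adj;
}

/* Returns the index of the chosen victim, or -1 if no task is eligible. */
static int toy_select_bad_process(const struct toy_task *tasks, size_t n)
{
	int chosen = -1;
	long best = LONG_MIN;
	size_t i;

	assert(tasks != NULL || n == 0);

	for (i = 0; i < n; i++) {
		long points = toy_badness(&tasks[i]);

		if (points == LONG_MIN)
			continue;	/* ineligible, never a victim */
		if (chosen < 0 || points > best) {
			best = points;
			chosen = (int)i;
		}
	}
	return chosen;
}
```

If every task in the group has oom_score_adj == OOM_SCORE_ADJ_MIN, the
scan in this model returns -1. That corresponds to select_bad_process()
finding nothing, so the charge fails even though current was not killed.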

--
Michal Hocko
SUSE Labs