Re: [PATCH v2] mm/oom_kill: show oom eligibility when displaying the current memory state of all tasks

From: Aaron Tomlin
Date: Tue Jun 15 2021 - 07:51:54 EST


On Mon 2021-06-14 08:44 +0200, Michal Hocko wrote:
> Well, I have to say that I have a bit hard time understand the problem
> statement here. First of all you are very likely basing your observation
> on an old kernel which is missing a fix which should make the situation
> impossible IIRC. You should be focusing on a justification why the new
> information is helpful for the current tree.

Michal,

Not exactly.

See oom_reap_task(). Let's assume an OOM event occurred within the context
of a memcg and 'memory.oom.group' was not set. If I understand correctly,
once all attempts to OOM reap the specified task have been "unsuccessful",
MMF_OOM_SKIP is applied, on the assumption that the task will be terminated
shortly due to the pending fatal signal (see __oom_kill_process()), i.e. a
SIGKILL is sent to the "victim" before the OOM reaper is notified. Now,
assuming the above task has not exited yet, another task in the same memcg
could also trigger an OOM event. Therefore, when showing potential OOM
victims, the task above, with MMF_OOM_SKIP set, will indeed be displayed.
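
To make the sequence concrete, here is a rough user-space model of it. This
is only a sketch: the model_* names and the struct are illustrative and are
not the kernel's definitions, and the real logic in mm/oom_kill.c has more
conditions than shown here.

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative stand-in for the kernel's MMF_OOM_SKIP bit. */
#define MODEL_MMF_OOM_SKIP (1UL << 0)

struct model_task {
	int pid;
	bool exited;            /* task has gone away entirely */
	bool sigkill_pending;   /* fatal signal queued by the OOM killer */
	unsigned long mm_flags;
};

/* Step 1: the victim gets SIGKILL before the reaper is notified. */
static void model_oom_kill(struct model_task *t)
{
	t->sigkill_pending = true;
}

/* Step 2: every reap attempt failed, so the mm is marked to be skipped. */
static void model_reap_failed(struct model_task *t)
{
	t->mm_flags |= MODEL_MMF_OOM_SKIP;
}

/* Victim selection passes over a task whose mm is marked. */
static bool model_is_selectable(const struct model_task *t)
{
	return !t->exited && !(t->mm_flags & MODEL_MMF_OOM_SKIP);
}

/*
 * Step 3: a later OOM event dumps every task that has not exited,
 * without looking at MODEL_MMF_OOM_SKIP. Returns the number shown.
 */
static int model_dump_tasks(const struct model_task *tasks, int n)
{
	int shown = 0;

	for (int i = 0; i < n; i++) {
		if (tasks[i].exited)
			continue;
		printf("pid %d\n", tasks[i].pid);
		shown++;
	}
	return shown;
}
```

In this model, a task that went through steps 1 and 2 but has not exited is
still printed by model_dump_tasks(), even though model_is_selectable()
returns false for it.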

I understand the point on OOM_SCORE_ADJ_MIN. This can be easily
identified and is clear to the viewer. However, the same cannot be stated
for MMF_OOM_SKIP.

So, if we prefer to display rather than exclude such tasks, in my opinion
having a flag or marker of some description might prove useful to avoid any
misunderstanding.
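
As a purely hypothetical illustration of such a marker (the column layout,
the 'Y'/'N' characters and the model_* helpers are made up for this sketch
and are not the format used by the patch or by dump_tasks()):

```c
#include <stdbool.h>
#include <stdio.h>

/* Same value as the kernel's OOM_SCORE_ADJ_MIN. */
#define MODEL_OOM_SCORE_ADJ_MIN (-1000)

/*
 * 'N' if the task would not be picked as a victim, either because of
 * its oom_score_adj or because its mm was marked to be skipped.
 */
static char model_eligible_marker(int oom_score_adj, bool oom_skip)
{
	if (oom_score_adj == MODEL_OOM_SCORE_ADJ_MIN || oom_skip)
		return 'N';
	return 'Y';
}

/* Format one line of a hypothetical task dump into buf. */
static int model_format_task(char *buf, size_t len, int pid,
			     int oom_score_adj, bool oom_skip,
			     const char *name)
{
	return snprintf(buf, len, "[%7d] %5d %c %s", pid, oom_score_adj,
			model_eligible_marker(oom_score_adj, oom_skip),
			name);
}
```

The oom_score_adj case can already be read off the existing column; the
extra character only adds information in the oom_skip case.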

> This should provide an example of the output with a clarification of the
> meaning.

Fair enough.




Kind regards,

--
Aaron Tomlin