Re: Re: [RFC PATCH v1] mm: oom: introduce cpuset oom

From: Gang Li
Date: Sun Sep 25 2022 - 23:38:34 EST



On 2022/9/23 03:18, David Rientjes wrote:
On Wed, 21 Sep 2022, Gang Li wrote:

cpusets confine processes to processor and memory node subsets.
When a process in a cpuset triggers oom, the oom killer may kill a
completely irrelevant process on another numa node, which will not
release any memory for this cpuset.

It seems that `CONSTRAINT_CPUSET` is not really doing much these
days. Using CONSTRAINT_CPUSET, we can easily achieve node-aware oom
killing by selecting the victim from the cpuset that triggers the oom.

Suggested-by: Michal Hocko <mhocko@xxxxxxxx>
Signed-off-by: Gang Li <ligang.bdlg@xxxxxxxxxxxxx>

Hmm, is this the right approach?

If a cpuset results in an oom condition, is there a reason why we'd need to
find a process from within that cpuset to kill? I think the idea is to
free memory on the oom set of nodes (cpuset.mems) and that can happen by
killing a process that is not a member of this cpuset.

Hi,

My previous patches implemented this idea[1][2], but they need to increment/decrement a per-mm_struct counter on every page allocation, release, and migration.

As the UnixBench results show, this costs a 0%-3% performance loss depending on the workload[2]. So Michal Hocko's comments inspired me to use cpuset instead[3].

[1]. https://lore.kernel.org/all/20220512044634.63586-1-ligang.bdlg@xxxxxxxxxxxxx/
[2]. https://lore.kernel.org/all/20220708082129.80115-1-ligang.bdlg@xxxxxxxxxxxxx/
[3]. https://lore.kernel.org/all/YoJ%2FioXwGTdCywUE@xxxxxxxxxxxxxx/

I understand the challenges of creating a NUMA aware oom killer to target
memory that is actually resident on an oom node, but this approach doesn't
seem right and could actually lead to pathological cases where a small
process trying to fork in an otherwise empty cpuset is repeatedly oom
killed when we'd actually prefer to kill a single large process.


I think there are three ways to implement a NUMA-aware oom killer:

1. Count every page operation, which causes a performance loss[2].
2. Iterate over the pages (like show_numa_map) of all processes, which may stall the oom killer.
3. Select the victim from within a cpuset, which may lead to pathological kills (this patch).
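
To make the trade-off concrete, here is a toy userspace model (Python, not kernel code) of the selection policies discussed in this thread. Everything in it is an assumption made for illustration: the `Task` class, the function names, and the page counts are hypothetical, and "badness" is simplified to resident pages only. It only shows which task each policy would pick, not what any of them costs.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for a process; not a kernel structure."""
    name: str
    cpuset: str                                        # name of the cpuset the task belongs to
    rss_per_node: dict = field(default_factory=dict)   # NUMA node id -> resident pages

    def total_rss(self):
        return sum(self.rss_per_node.values())

    def rss_on(self, nodes):
        # Pages resident on the given set of nodes only.
        return sum(pages for node, pages in self.rss_per_node.items() if node in nodes)


def select_global(tasks):
    """Node-unaware selection: largest task overall (simplified badness)."""
    return max(tasks, key=lambda t: t.total_rss(), default=None)


def select_by_node_rss(tasks, oom_nodes):
    """Options 1 and 2: largest task by memory resident on the oom nodes.
    Needs per-node usage per task, from counters (1) or a page walk (2)."""
    candidates = [t for t in tasks if t.rss_on(oom_nodes) > 0]
    return max(candidates, key=lambda t: t.rss_on(oom_nodes), default=None)


def select_by_cpuset(tasks, oom_cpuset):
    """Option 3: largest task among members of the cpuset that hit oom."""
    candidates = [t for t in tasks if t.cpuset == oom_cpuset]
    return max(candidates, key=lambda t: t.total_rss(), default=None)


# Assumed scenario: cpuset "A" is restricted to node 1 and runs out of memory.
tasks = [
    Task("forker", "A", {1: 10}),          # small task inside the oom cpuset
    Task("big", "B", {0: 200, 1: 5000}),   # large user of node 1, different cpuset
    Task("remote", "C", {0: 8000}),        # largest overall, but only on node 0
]

print(select_global(tasks).name)            # remote
print(select_by_node_rss(tasks, {1}).name)  # big
print(select_by_cpuset(tasks, "A").name)    # forker
```

In this scenario the node-unaware policy kills "remote", which frees nothing on node 1 (the problem the patch tries to fix). The cpuset policy kills "forker", which frees almost nothing (the pathological case described above). Only the policy that knows per-node usage picks "big".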

None of them is perfect and I'm stuck. Do you have any ideas?