Re: [PATCH] mm, oom: Add lru_add_drain() in __oom_reap_task_mm()

From: Michal Hocko
Date: Wed Jan 10 2024 - 03:47:07 EST


On Tue 09-01-24 01:15:11, Jianfeng Wang wrote:
> The oom_reaper tries to reclaim additional memory owned by the oom
> victim. In __oom_reap_task_mm(), it uses mmu_gather for batched page
> free. After oom_reaper was added, mmu_gather feature introduced
> CONFIG_MMU_GATHER_NO_GATHER (in 'commit 952a31c9e6fa ("asm-generic/tlb:
> Introduce CONFIG_HAVE_MMU_GATHER_NO_GATHER=y")', an option to skip batched
> page free. If set, tlb_batch_pages_flush(), which is responsible for
> calling lru_add_drain(), is skipped during tlb_finish_mmu(). Without it,
> pages could still be held by per-cpu fbatches rather than be freed.
>
> This fix adds lru_add_drain() prior to mmu_gather. This makes the code
> consistent with other cases where mmu_gather is used for freeing pages.

Does this fix any actual problem or is this pure code consistency thing?
I am asking because it doesn't make much sense to me TBH, LRU cache
draining is usually important when we want to ensure that cached pages
are put to LRU to be dealt with because otherwise the MM code wouldn't
be able to deal with them. OOM reaper doesn't necessarily run on the
same CPU as the oom victim so draining on a local CPU doesn't
necessarily do anything for the victim's pages.

While this patch is not harmful I really do not see much point in adding
the local draining here. Could you clarify please?

> Signed-off-by: Jianfeng Wang <jianfeng.w.wang@xxxxxxxxxx>
> ---
> mm/oom_kill.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 9e6071fde34a..e2fcf4f062ea 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -537,6 +537,7 @@ static bool __oom_reap_task_mm(struct mm_struct *mm)
> struct mmu_notifier_range range;
> struct mmu_gather tlb;
>
> + lru_add_drain();
> mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0,
> mm, vma->vm_start,
> vma->vm_end);
> --
> 2.42.1
>

--
Michal Hocko
SUSE Labs