Re: [External] : Re: [PATCH] mm, oom: Add lru_add_drain() in __oom_reap_task_mm()

From: Minchan Kim
Date: Fri Jan 12 2024 - 16:43:13 EST


On Fri, Jan 12, 2024 at 09:49:08AM +0100, Michal Hocko wrote:
> On Thu 11-01-24 16:08:57, Jianfeng Wang wrote:
> >
> >
> > On 1/11/24 1:54 PM, Andrew Morton wrote:
> > > On Thu, 11 Jan 2024 10:54:45 -0800 Jianfeng Wang <jianfeng.w.wang@xxxxxxxxxx> wrote:
> > >
> > >>
> > >>> Unless you can show any actual runtime effect of this patch then I think
> > >>> it shouldn't be merged.
> > >>>
> > >>
> > >> Thanks for raising your concern.
> > >> I'd call it a trade-off rather than "not really correct". Look at
> > >> unmap_region() / free_pages_and_swap_cache() written by Linus. These are in
> > >> favor of this pattern, which indicates that the trade-off (i.e. draining
> > >> local CPU or draining all CPUs or no draining at all) had been made in the
> > >> same way in the past. I don't have a specific runtime effect to provide,
> > >> except that it will free 10s kB pages immediately during OOM.
>
> You are missing an important point. Those two calls are quite different.
> oom_reaper unmaps memory after all the reclaim attempts have failed.
> That includes draining all sorts of caches on the way. Including
> draining LRU pcp cache (look for lru_add_drain_all in the reclaim path).
>
> > > I don't think it's necessary to run lru_add_drain() for each vma. Once
> > > we've done it it once, it can be skipped for additional vmas.
> > >
> > Agreed.
> >
> > > That's pretty minor because the second and successive calls will be
> > > cheap. But it becomes much more significant if we switch to
> > > lru_add_drain_all(), which sounds like what we should be doing here.
> > > Is it possible?
> > >
> > What do you both think of adding lru_add_drain_all() prior to the for loop?
>
> lru_add_drain_all relies on WQs. And we absolutely do not want to get
> oom_reaper stuck just because all the WQ is jammed. So no, this is
> actually actively harmful!

I completely agree. __oom_reap_task_mm() is also used by process_mrelease(),
which is a critical path for releasing memory on Android and is typically
called while the system is under pressure (not only memory pressure but CPU
pressure at the same time). Since lru_add_drain_all() has to wait for work
queued on every CPU, it can take a long time to finish there, because Android
is susceptible to priority inversion among processes.
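
For context, here is a rough sketch of how a userspace killer reaches this
path through process_mrelease(). It is only an illustration: kill_and_reap()
is a hypothetical helper name, and the fallback syscall numbers assume x86-64
or arm64 with libc headers that predate these syscalls.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback numbers (assumed: x86-64 / arm64) for older libc headers. */
#ifndef __NR_pidfd_open
#define __NR_pidfd_open 434
#endif
#ifndef __NR_process_mrelease
#define __NR_process_mrelease 448
#endif

/*
 * Hypothetical helper: kill @pid and ask the kernel to reap its memory in
 * the caller's context. Returns 0 on success, -1 with errno set.
 */
int kill_and_reap(pid_t pid)
{
	int pidfd, ret, saved_errno;

	/* Reject pid <= 0 so kill() can never target a process group. */
	if (pid <= 0) {
		errno = EINVAL;
		return -1;
	}

	pidfd = (int)syscall(__NR_pidfd_open, pid, 0);
	if (pidfd < 0)
		return -1;

	/* process_mrelease() only accepts a target that is already dying. */
	ret = kill(pid, SIGKILL);
	if (ret == 0)
		ret = (int)syscall(__NR_process_mrelease, pidfd, 0);

	/* Preserve errno from the failing call across close(). */
	saved_errno = errno;
	close(pidfd);
	errno = saved_errno;
	return ret;
}
```

The reaping runs in the context of the thread that calls process_mrelease(),
so anything in that path that can block on other CPUs delays the caller
directly.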

A better idea may be to enable remote draining in lru_add_drain_all(),
analogous to the recent PCP modifications.