Re: [PATCH v10] mm: vmscan: try to reclaim swapcache pages if no swap space

From: Michal Hocko
Date: Wed Nov 22 2023 - 03:52:49 EST


On Tue 21-11-23 22:44:32, Yosry Ahmed wrote:
> On Tue, Nov 21, 2023 at 10:41 PM Liu Shixin <liushixin2@xxxxxxxxxx> wrote:
> >
> >
> > On 2023/11/21 21:00, Michal Hocko wrote:
> > > On Tue 21-11-23 17:06:24, Liu Shixin wrote:
> > >
> > > > However, in swapcache_only mode, the scan count is still increased when
> > > > scanning non-swapcache pages, because there are a large number of
> > > > non-swapcache pages and few swapcache pages in swapcache_only mode. If
> > > > the non-swapcache pages were skipped and not counted, the scan in
> > > > isolate_lru_folios() could eventually lead to a hung task, just as
> > > > Sachin reported [2].
> > >
> > > I find this paragraph really confusing! I guess what you meant to say is
> > > that a real swapcache_only is problematic because it can end up not
> > > making any progress, correct?
> > This paragraph is meant to explain why swapcache_only is checked after scan += nr_pages;
> > >
> > > AFAIU you have addressed that problem by making swapcache_only anon LRU
> > > specific, right? That would be certainly more robust as you can still
> > > reclaim from file LRUs. I cannot say I like that because swapcache_only
> > > is a bit confusing and I do not think we want to grow more special
> > > purpose reclaim types. Would it be possible/reasonable to put
> > > swapcache pages on the file LRU instead?
> > It looks like a good idea, but I'm not sure if it's possible. I can try it. Is there
> > anything I should pay attention to?
>
> I think this might be more intrusive than we think. Every time a page
> is added to or removed from the swap cache, we will need to move it
> between LRUs. All pages on the anon LRU will need to go through the
> file LRU before being reclaimed. I think this might be too big of a
> change to achieve this patch's goal.

TBH I am not really sure how complex that might turn out to be.
Swapcache tends to be full of subtle issues. So you might be right, but
it would be better to know _why_ this is not possible before we end up
fishing for a couple of swapcache pages on a potentially huge anon LRU
to isolate them. Think of TB-sized machines in this context.
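
To make the scanning concern concrete, here is a toy userspace sketch,
not kernel code. The names (toy_lru, toy_scan, count_skipped) are made
up for illustration and do not correspond to anything in mm/vmscan.c.
It only models the trade-off discussed above: whether skipped
non-swapcache pages consume scan budget or not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model: one flag per page on an "anon LRU". */
struct toy_lru {
	const bool *in_swapcache;	/* true if page i is in the swap cache */
	size_t nr_pages;		/* number of pages on the list */
};

/*
 * Walk the list looking for swapcache pages to isolate.
 *
 * If count_skipped is true, every visited page consumes scan budget,
 * so the walk is bounded by nr_to_scan. If it is false, only isolated
 * pages consume budget, so the walk can cover the whole list when
 * swapcache pages are rare.
 *
 * Returns the number of pages visited; *isolated receives the number
 * of swapcache pages found.
 */
static size_t toy_scan(const struct toy_lru *lru, size_t nr_to_scan,
		       bool count_skipped, size_t *isolated)
{
	size_t visited = 0, scan = 0;

	assert(lru && isolated);
	*isolated = 0;

	for (size_t i = 0; i < lru->nr_pages && scan < nr_to_scan; i++) {
		visited++;
		if (lru->in_swapcache[i]) {
			(*isolated)++;
			scan++;
		} else if (count_skipped) {
			scan++;
		}
	}
	return visited;
}
```

In this model, when the list holds fewer swapcache pages than
nr_to_scan, the uncounted variant visits every page on the list, while
the counted variant stops after nr_to_scan pages but may isolate very
little. Both costs grow with the size of the anon LRU.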

--
Michal Hocko
SUSE Labs