Re: [PATCH v10] mm: vmscan: try to reclaim swapcache pages if no swap space

From: Huang, Ying
Date: Thu Nov 23 2023 - 01:18:20 EST


Michal Hocko <mhocko@xxxxxxxx> writes:

> On Wed 22-11-23 02:39:15, Yosry Ahmed wrote:
>> On Wed, Nov 22, 2023 at 2:09 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
>> >
>> > On Wed 22-11-23 09:52:42, Michal Hocko wrote:
>> > > On Tue 21-11-23 22:44:32, Yosry Ahmed wrote:
>> > > > On Tue, Nov 21, 2023 at 10:41 PM Liu Shixin <liushixin2@xxxxxxxxxx> wrote:
>> > > > >
>> > > > >
>> > > > > On 2023/11/21 21:00, Michal Hocko wrote:
>> > > > > > On Tue 21-11-23 17:06:24, Liu Shixin wrote:
>> > > > > >
>> > > > > > However, in swapcache_only mode, the scan count still increased when scan
>> > > > > > non-swapcache pages because there are large number of non-swapcache pages
>> > > > > > and rare swapcache pages in swapcache_only mode, and if the non-swapcache
>> > > > > > is skipped and do not count, the scan of pages in isolate_lru_folios() can
>> > > > > > eventually lead to hung task, just as Sachin reported [2].
>> > > > > > I find this paragraph really confusing! I guess what you meant to say is
>> > > > > > that a real swapcache_only is problematic because it can end up not
>> > > > > > making any progress, correct?
>> > > > > This paragraph is meant to explain why swapcache_only is checked after scan += nr_pages;
>> > > > > >
>> > > > > > AFAIU you have addressed that problem by making swapcache_only anon LRU
>> > > > > > specific, right? That would be certainly more robust as you can still
>> > > > > > reclaim from file LRUs. I cannot say I like that because swapcache_only
>> > > > > > is a bit confusing and I do not think we want to grow more special
>> > > > > > purpose reclaim types. Would it be possible/reasonable to instead put
>> > > > > > swapcache pages on the file LRU?
>> > > > > It looks like a good idea, but I'm not sure if it's possible. I can try it. Is there
>> > > > > anything I should pay attention to?
>> > > >
>> > > > I think this might be more intrusive than we think. Every time a page
>> > > > is added to or removed from the swap cache, we will need to move it
>> > > > between LRUs. All pages on the anon LRU will need to go through the
>> > > > file LRU before being reclaimed. I think this might be too big of a
>> > > > change to achieve this patch's goal.
>> > >
>> > > TBH I am not really sure how complex that might turn out to be.
>> > > Swapcache tends to be full of subtle issues. So you might be right but
>> > > it would be better to know _why_ this is not possible before we end up
>> > > fishing for a couple of swapcache pages on a potentially huge anon LRU to
>> > > isolate them. Think of TB sized machines in this context.
>> >
>> > Forgot to mention that comparing this to MADV_FREE pages is not really
>> > far fetched. Those are anonymous but we do not want to keep them
>> > on the anon LRU because we want to age them independently of swap
>> > availability as they are just dropped during reclaim. Not too much
>> > different from swapcache pages. There are more constraints on those but
>> > fundamentally this is the same problem, no?
>>
>> I agree it's not a first, but swap cache pages are more complicated
>> because they can go back and forth, unlike MADV_FREE pages which
>> usually go on a one way ticket AFAICT.
>
> Yes swapcache pages are indeed more complicated but most of the time
> they just go away as well, no?

When we swap in a page, we put it in the swapcache too. And the page
can stay in that state for a long time if more than 50% of the swap
device is free.
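
To make the 50% point concrete, here is a rough userspace sketch of the
kind of check involved. It is not the kernel code: the struct and both
function names below are made up for illustration, and only the
"free * 2 < total" comparison is meant to mirror the shape of the
swap-full test (vm_swap_full() in the kernel, as far as I recall).

```c
#include <stdbool.h>

/* Hypothetical model of swap device usage, for illustration only. */
struct swap_state {
	long free_pages;	/* free swap slots */
	long total_pages;	/* total swap slots */
};

/* Swap counts as "full" once less than half of the slots are free. */
static bool swap_more_than_half_full(const struct swap_state *s)
{
	return s->free_pages * 2 < s->total_pages;
}

/*
 * Assumed policy: after swap-in the page keeps its swapcache entry
 * unless swap is filling up, so it can sit there for a long time.
 */
static bool keep_in_swapcache_after_swapin(const struct swap_state *s)
{
	return !swap_more_than_half_full(s);
}
```

With 60 of 100 slots free the page would stay in the swapcache in this
model; with 40 of 100 free it would be a candidate for dropping its
swap slot.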

> MADV_FREE pages can be reinstated if they are
> written to as well. So fundamentally they are not that different.
>

[snip]

--
Best Regards,
Huang, Ying