Re: [PATCH v10] mm: vmscan: try to reclaim swapcache pages if no swap space

From: Michal Hocko
Date: Tue Nov 21 2023 - 08:00:27 EST


On Tue 21-11-23 17:06:24, Liu Shixin wrote:
> When spaces of swap devices are exhausted, only file pages can be
> reclaimed. But there are still some swapcache pages in anon lru list.
> This can lead to a premature out-of-memory.
>
> The problem is found with such step:
>
> Firstly, set a 9MB disk swap space, then create a cgroup with 10MB
> memory limit, then runs an program to allocates about 15MB memory.
>
> The problem occurs occasionally, which may need about 100 times [1].
>
> Fix it by checking number of swapcache pages in can_reclaim_anon_pages().
> If the number is not zero, return true and set swapcache_only to 1.
> When scan anon lru list in swapcache_only mode, non-swapcache pages will
> be skipped to isolate in order to accelerate reclaim efficiency.
>
> However, in swapcache_only mode, the scan count still increased when scan
> non-swapcache pages because there are large number of non-swapcache pages
> and rare swapcache pages in swapcache_only mode, and if the non-swapcache
> is skipped and do not count, the scan of pages in isolate_lru_folios() can
> eventually lead to hung task, just as Sachin reported [2].

I find this paragraph really confusing! I guess what you meant to say is
that a real swapcache_only is problematic because it can end up not
making any progress, correct?

AFAIU you have addressed that problem by making swapcache_only anon LRU
specific, right? That would be certainly more robust as you can still
reclaim from file LRUs. I cannot say I like that because swapcache_only
is a bit confusing and I do not think we want to grow more special
purpose reclaim types. Would it be possible/reasonable to instead put
swapcache pages on the file LRU instead?
--
Michal Hocko
SUSE Labs