Re: [PATCH] mm: vmscan: fix extreme overreclaim and swap floods

From: Johannes Weiner
Date: Sat Nov 12 2022 - 17:48:59 EST


On Fri, Aug 12, 2022 at 10:59:12AM +0900, Joonsoo Kim wrote:
> I think that we can fix the issue without breaking the fairness.
> Key idea is that doing scan based on the lru having max scan count.
> (aka max-lru)
> As scan is doing on max-lru, do scan the proportional number of
> pages on other lru.
>
> Pseudo code is here.
>
> 1. find the lru having max scan count
> 2. calculate nr_to_scan_max for max-lru
> 3. prop = (scanned[max-lru] + nr_to_scan_max) / targets[max-lru]

What's nr_to_scan_max?

AFAICS, prop would round down to 0 pretty quickly for imbalanced LRUs,
at which point it would stop reclaiming the smaller list altogether.
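
To make the rounding concern concrete, here is a minimal userspace
sketch (not kernel code). prop_scan() and its parameters are
hypothetical names, and it assumes nr_to_scan_max is a batch of 32
pages, which the proposal does not define. The division is done last,
so this is the most favorable integer version of the formula:

```c
#include <assert.h>

/*
 * Hypothetical helper, not kernel code. Integer version of the quoted
 * pseudo code:
 *
 *   prop       = (scanned_max + nr_to_scan_max) / target_max
 *   nr_to_scan = target * prop - scanned
 *
 * The division is applied last so that prop itself does not truncate
 * to 0. The multiplication can overflow for very large inputs; that
 * is ignored here for brevity.
 */
static unsigned long prop_scan(unsigned long target,
			       unsigned long scanned,
			       unsigned long target_max,
			       unsigned long scanned_max,
			       unsigned long nr_to_scan_max)
{
	unsigned long want;

	assert(target_max > 0);

	/* target * prop, truncating only once at the end */
	want = target * (scanned_max + nr_to_scan_max) / target_max;

	/* pages still owed on this list, never negative */
	return want > scanned ? want - scanned : 0;
}
```

With a 1,000,000 page max-lru and a 1,000 page list,
prop_scan(1000, 0, 1000000, 0, 32) comes out as 0: the small list is
not scanned at all until roughly 1,000 pages of the big list have been
scanned.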

> 3. for_each_lru()
> 3-1. nr_to_scan = (targets[lru] * prop) - scanned[lru]
> 3-2. shrink_list(nr_to_scan)
>
> With this approach, we can minimize reclaim without breaking the
> fairness.
>
> Note that actual code needs to handle some corner cases, one of it is
> a low-nr_to_scan case to improve performance.

Right.

The main problem I see is that the discrepancies between LRU sizes can
be many orders of magnitude bigger than common reclaim goals. Even when
one LRU is just 10x bigger, it'll be difficult to be both fair and still
have efficient batch sizes when the goal is only 32 pages total.

I think a proper way to do fairness would have to track scan history
over multiple cycles. (I think MGLRU does that, but I can't be sure.)
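
Purely as an illustration of what carrying history across cycles could
look like, here is a rough userspace sketch. It is not what MGLRU does
and not a proposed patch. struct lru_hist, lru_cycle() and MIN_BATCH
are made-up names, and the 8 page minimum batch is an arbitrary
assumption. Each list accumulates its fair share as credit and is only
scanned once the credit is worth a batch:

```c
#include <assert.h>

#define MIN_BATCH 8UL	/* assumed smallest batch worth scanning, in pages */

/* Hypothetical per-LRU state carried across reclaim cycles. */
struct lru_hist {
	unsigned long size;	/* pages on this list */
	unsigned long credit;	/* owed scan work, in units of pages * total */
};

/*
 * Account one reclaim cycle with a goal of @goal pages across lists
 * holding @total pages overall. Returns how many pages to scan on this
 * list now, or 0 if the owed amount is still below MIN_BATCH, in which
 * case the credit is kept for a later cycle.
 */
static unsigned long lru_cycle(struct lru_hist *h, unsigned long goal,
			       unsigned long total)
{
	unsigned long pages;

	assert(total > 0);

	/* this cycle's fair share, scaled by total to keep the fraction */
	h->credit += goal * h->size;

	pages = h->credit / total;
	if (pages < MIN_BATCH)
		return 0;

	/* pay out whole pages, keep the fractional remainder */
	h->credit -= pages * total;
	return pages;
}
```

With a 10,000 page list, a 1,000 page list and a 32 page goal, the big
list is scanned 29 pages per cycle, while the small list is skipped
twice and then scanned 8 pages on the third cycle, so its share is
deferred rather than dropped.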