Re: [PATCH RFC 1/1] mm/swap: queue reclaimable folio to local rotate batch when !folio_test_lru()

From: Yosry Ahmed
Date: Tue Feb 13 2024 - 03:50:27 EST


On Fri, Feb 9, 2024 at 4:00 AM <chengming.zhou@xxxxxxxxx> wrote:
>
> From: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
>
> All LRU move interfaces have a problem that it has no effect if the
> folio is isolated from LRU (in cpu batch or isolated by shrinker).
> Since it can't move/change folio LRU status when it's isolated, mostly
> just clear the folio flag and do nothing in this case.
>
> In our case, a written back and reclaimable folio won't be rotated to
> the tail of inactive list, since it's still in cpu lru_add batch. It
> may cause the delayed reclaim of this folio and evict other folios.
>
> This patch changes to queue the reclaimable folio to cpu rotate batch
> even when !folio_test_lru(), hoping it will likely be handled after
> the lru_add batch which will put folio on the LRU list first, so
> will be rotated to the tail successfully when handle rotate batch.

It seems to me that it is totally up to chance whether the lru_add
batch is handled first, especially that there may be problems if it
isn't.

>
> Signed-off-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
> ---
> mm/swap.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/mm/swap.c b/mm/swap.c
> index cd8f0150ba3a..d304731e47cf 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -236,7 +236,8 @@ static void folio_batch_add_and_move(struct folio_batch *fbatch,
>
> static void lru_move_tail_fn(struct lruvec *lruvec, struct folio *folio)
> {
> - if (!folio_test_unevictable(folio)) {
> + if (!folio_test_locked(folio) && !folio_test_dirty(folio) &&
> + !folio_test_unevictable(folio) && !folio_test_active(folio)) {

What are these conditions based on? I assume we want to check if the
folio is locked because we no longer check that it is on the LRUs, so
we want to check that no one else is operating on it, but I am not
sure that's enough.

> lruvec_del_folio(lruvec, folio);
> folio_clear_active(folio);
> lruvec_add_folio_tail(lruvec, folio);
> @@ -254,7 +255,7 @@ static void lru_move_tail_fn(struct lruvec *lruvec, struct folio *folio)
> void folio_rotate_reclaimable(struct folio *folio)
> {
> if (!folio_test_locked(folio) && !folio_test_dirty(folio) &&
> - !folio_test_unevictable(folio) && folio_test_lru(folio)) {
> + !folio_test_unevictable(folio) && !folio_test_active(folio)) {

I am not sure it is safe to continue with a folio that is not on the
LRUs. It could be isolated for other purposes, and we end up adding it
to an LRU nonetheless. Also, folio_batch_move_lru() will do
folio_test_clear_lru() and skip such folios anyway. There may also be
messed up accounting, for example lru_move_tail_fn() calls
lruvec_del_folio(), which does some bookkeeping, at least for the
MGLRU case.

> struct folio_batch *fbatch;
> unsigned long flags;
>
> --
> 2.40.1
>