Re: [External] Re: [PATCH] mm: zswap: fix the lack of page lru flag in zswap_writeback_entry

From: Zhongkun He
Date: Wed Jan 10 2024 - 22:49:21 EST


>
> This sounds dangerous. This is going to introduce a rather large
> unexpected side effect - we're changing the readahead behavior in a
> seemingly small zswap optimization. In fact, I'd argue that if we do
> this, the readahead behavior change will be the "main effect", and the
> zswap-side change would be a "happy consequence". We should run a lot
> of benchmarking and document the change extensively if we pursue this
> route.
>

I agree with the unexpected side effect, and here I need
to clarify the original intention of this patch.Please see the memory
offloading steps below.


memory zswap(reclaim) memory+swap (writeback)
1G 0.5G 1G(tmp memory) + 1G(swap)

If the decompressed memory cannot be released in time,
zswap's writeback has great side effects(mostly clod pages).
On the one hand, the memory space has not been reduced,
but has increased (from 0.5G->1G).
At the same time, it is not put the pages to the tail of the lru.
When the memory is insufficient, other pages will be squeezed out
and released early.
With this patch, we can put the tmp pages to the tail and reclaim it
in time when the memory is insufficient or actively reclaimed.
So I think this patch makes sense and hope it can be fixed with a
suitable approaches.

>
> Unless some page flag/readahead expert can confirm that the first
> option is safe, my vote is on this option. I mean, it's fairly minimal
> codewise, no? Just a bunch of plumbing. We can also keep the other
> call sites intact if we just rename the old versions - something along
> the line of:
>
> __read_swap_cache_async_head(..., bool add_to_lru_head)
> {
> ...
> if (add_to_lru_head)
> folio_add_lru(folio)
> else
> folio_add_lru_tail(folio);
> }
>
> __read_swap_cache_async(...)
> {
> return __read_swap_cache_async_tail(..., true);
> }
>
> A bit boilerplate? Sure. But this seems safer, and I doubt it's *that*
> much more work.
>

Yes, agree. I will try it again.

> > >
> > > I have the same idea and also find it intrusive. I think the first solution
> > > is very good and I will try it. If it works, I will send the next version.
> >
> > One way to avoid introducing folio_lru_add_tail() and blumping a
> > boolean from zswap is to have a per-task context (similar to
> > memalloc_nofs_save()), that makes folio_add_lru() automatically add
> > folios to the tail of the LRU. I am not sure if this is an acceptable
> > approach though in terms of per-task flags and such.
>
> This seems a bit hacky and obscure, but maybe it could work.