Re: [External] Re: [PATCH] mm: zswap: fix the lack of page lru flag in zswap_writeback_entry

From: Nhat Pham
Date: Sun Jan 07 2024 - 13:53:30 EST


On Fri, Jan 5, 2024 at 6:10 AM Zhongkun He <hezhongkun.hzk@xxxxxxxxxxxxx> wrote:
>
> >
> > Hmm originally I was thinking of doing an (unconditional)
> > lru_add_drain() outside of zswap_writeback_entry() - once in
> > shrink_worker() and/or zswap_shrinker_scan(), before we write back any
> > of the entries. Not sure if it would work/help here tho - haven't
> > tested that idea yet.
> >
>
> The pages are allocated by __read_swap_cache_async() in
> zswap_writeback_entry() and it must be newly allocated,
> not cached in swap.
> Please see the code below in zswap_writeback_entry()
>
> page = __read_swap_cache_async(swpentry, GFP_KERNEL, mpol,
> NO_INTERLEAVE_INDEX, &page_was_allocated);
> if (!page) {
> goto fail;}
> /* Found an existing page, we raced with load/swapin */
> if (!page_was_allocated) {
> put_page(page);
> ret = -EEXIST;
> goto fail;
> }
>
> So when it comes to SetPageReclaim(page),
> The page has just been allocated and is still in the percpu batch,
> which has not been added to the LRU.
>
> Therefore,lru_add_drain() did not work outside the
> zswap_writeback_entry()

You're right hmmm.

>
> > >
> > > New test:
> > > This patch will add the execution of folio_rotate_reclaimable(not executed
> > > without this patch) and lru_add_drain,including percpu lock competition.
> > > I bind a new task to allocate memory and use the same batch lock to compete
> > > with the target process, on the same CPU.
> > > context:
> > > 1:stress --vm 1 --vm-bytes 1g (bind to cpu0)
> > > 2:stress --vm 1 --vm-bytes 5g --vm-hang 0(bind to cpu0)
> > > 3:reclaim pages, and writeback 5G zswap_entry in cpu0 and node 0.
> > >
> > > Average time of five tests
> > >
> > > Base patch patch + compete
> > > 4.947 5.0676 5.1336
> > > +2.4% +3.7%
> > > compete means: a new stress run in cpu0 to compete with the writeback process.
> > > PID USER %CPU %MEM TIME+ COMMAND P
> > > 1367 root 49.5 0.0 1:09.17 bash (writeback) 0
> > > 1737 root 49.5 2.2 0:27.46 stress (use percpu
> > > lock) 0
> > >
> > > around 2.4% increase in real time,including the execution of
> > > folio_rotate_reclaimable(not executed without this patch) and lru_add_drain,but
> > > no lock contentions.
> >
> > Hmm looks like the regression is still there, no?
>
> Yes, it cannot be eliminated.

Yeah this solution is not convincing to me. The overall effect of this
patch is we're trading runtime to save some swap usage. That seems
backward to me? Are there any other observable benefits to this?

Otherwise, unless the benchmark is an adversarial workload that is not
representative of the common case (and you'll need to show me some
alternative benchmark numbers or justifications to convince me here),
IMHO this is a risky change to merge. This is not a feature that is
gated behind a flag that users can safely ignore/disable.

>
> >
> > >
> > > around 1.3% additional increase in real time with lock contentions on the same
> > > cpu.
> > >
> > > There is another option here, which is not to move the page to the
> > > tail of the inactive
> > > list after end_writeback and delete the following code in
> > > zswap_writeback_entry(),
> > > which did not work properly. But the pages will not be released first.
> > >
> > > /* move it to the tail of the inactive list after end_writeback */
> > > SetPageReclaim(page);
> >
> > Or only SetPageReclaim on pages on LRU?
>
> No, all the pages are newly allocted and not on LRU.
>
> This patch should add lru_add_drain() directly, remove the
> if statement.
> The purpose of writing back data is to release the page,
> so I think it is necessary to fix it.
>
> Thanks for your time, Nhat.

Is there a way to detect these kinds of folios at
folio_rotate_reclaimable() callsite? i.e if this is a zswap-owned page
etc. etc.?