Re: [PATCH v9] zswap: replace RB tree with xarray

From: Yosry Ahmed
Date: Tue Mar 26 2024 - 14:49:49 EST


On Tue, Mar 26, 2024 at 11:42 AM Chris Li <chrisl@xxxxxxxxxx> wrote:
>
> On Tue, Mar 26, 2024 at 11:35 AM Chris Li <chrisl@xxxxxxxxxx> wrote:
> >
> > A very deep RB tree requires rebalancing at times, which contributes to
> > zswap fault latencies. Xarray does not need to perform tree rebalancing.
> > Replacing the RB tree with xarray can yield a small performance gain.
> >
> > One small difference is that xarray insert might fail with ENOMEM, while
> > RB tree insert does not allocate additional memory.
> >
> > The zswap_entry size shrinks a bit because the RB node, which has two
> > pointers and a color field, is removed. Xarray stores the pointer in the
> > xarray tree rather than in the zswap_entry. Every entry has one pointer
> > from the xarray tree. Overall, switching to xarray should save some
> > memory if the swap entries are densely packed.
> >
> > Notice that zswap_rb_search and zswap_rb_insert are often followed by
> > zswap_rb_erase. Use xa_erase and xa_store directly instead. That saves
> > one tree lookup as well.
> >
> > Remove zswap_invalidate_entry, since there is no longer any need to call
> > zswap_rb_erase. Use zswap_free_entry instead.
> >
> > The "struct zswap_tree" has been replaced by "struct xarray". The tree
> > spin lock has been replaced by the xarray lock.
> >
> > Ran the kernel build test 5 times for each version; averages:
> > (memory.max=2GB, zswap shrinker and writeback enabled, one 50GB swapfile,
> > 24 HT cores, 32 jobs)
> >
> >         mm-unstable-4aaccadb5c04    xarray v9
> > user    3548.902                    3534.375
> > sys     522.232                     520.976
> > real    202.796                     200.864
> >
> > Signed-off-by: Chris Li <chrisl@xxxxxxxxxx>
>
> I removed the previous review tags because I would like to get some
> review of the conflict resolution as well.
[..]
> > @@ -1624,20 +1562,14 @@ bool zswap_load(struct folio *folio)
> > pgoff_t offset = swp_offset(swp);
> > struct page *page = &folio->page;
> > bool swapcache = folio_test_swapcache(folio);
> > - struct zswap_tree *tree = swap_zswap_tree(swp);
> > + struct xarray *tree = swap_zswap_tree(swp);
> > struct zswap_entry *entry;
> > u8 *dst;
> >
> > VM_WARN_ON_ONCE(!folio_test_locked(folio));
> >
> > - spin_lock(&tree->lock);
> > - entry = zswap_rb_search(&tree->rbroot, offset);
> > - if (!entry) {
> > - spin_unlock(&tree->lock);
> > - return false;
> > - }
> > /*
> > - * When reading into the swapcache, invalidate our entry. The
> > + * When reading into the swapcache, erase our entry. The
> > * swapcache can be the authoritative owner of the page and
> > * its mappings, and the pressure that results from having two
> > * in-memory copies outweighs any benefits of caching the
> > @@ -1649,8 +1581,12 @@ bool zswap_load(struct folio *folio)
> > * the fault fails. We remain the primary owner of the entry.)
> > */
> > if (swapcache)
> > - zswap_rb_erase(&tree->rbroot, entry);
> > - spin_unlock(&tree->lock);
> > + entry = xa_erase(tree, offset);
> > + else
> > + entry = xa_load(tree, offset);
>
> This is where I made the modification for the conflict resolution.
> Whether xa_erase() or xa_load() is executed depends on swapcache.
> Obviously, xa_load() will not delete the entry from the tree.

The conflict resolution LGTM. If this is the only change from v8 then:

Acked-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
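
For anyone reading along, here is a rough userspace sketch of the lookup
semantics in the zswap_load() hunk above. It is not kernel code: struct
toy_tree and the toy_* helpers are hypothetical stand-ins that only imitate
how xa_store(), xa_load() and xa_erase() return entries, with no locking,
no memory allocation, and a fixed number of slots.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 16

/* Hypothetical stand-in for struct xarray: a fixed array of entry pointers. */
struct toy_tree {
	void *slot[TOY_SLOTS];
};

/* Imitates xa_store(): install the entry, return the previous one (or NULL). */
static void *toy_store(struct toy_tree *t, size_t offset, void *entry)
{
	void *old;

	assert(offset < TOY_SLOTS);
	old = t->slot[offset];
	t->slot[offset] = entry;
	return old;
}

/* Imitates xa_load(): return the entry and leave it in the tree. */
static void *toy_load(struct toy_tree *t, size_t offset)
{
	assert(offset < TOY_SLOTS);
	return t->slot[offset];
}

/* Imitates xa_erase(): remove the entry and return what was stored there. */
static void *toy_erase(struct toy_tree *t, size_t offset)
{
	return toy_store(t, offset, NULL);
}

/*
 * Mirrors the branch in the hunk: when reading into the swapcache the
 * entry is removed by the lookup itself, otherwise it stays in the tree.
 */
static void *toy_lookup(struct toy_tree *t, size_t offset, int swapcache)
{
	return swapcache ? toy_erase(t, offset) : toy_load(t, offset);
}
```

With swapcache false, repeated lookups keep returning the same entry. With
swapcache true, the lookup returns the entry once and later lookups at that
offset return NULL, which is the single-walk behaviour the patch gets from
xa_erase().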