Re: [RFC PATCH v2 6/7] mm: zswap: simplify writeback function

From: Johannes Weiner
Date: Thu Jun 08 2023 - 12:49:01 EST


Hi Domenico,

On Tue, Jun 06, 2023 at 04:56:10PM +0200, Domenico Cerasuolo wrote:
> Previously, in zswap, the writeback function was passed to zpool drivers
> for their usage in calling the writeback operation. However, since the
> drivers did not possess references to entries and the function was
> specifically designed to work with handles, the writeback process has
> been modified to occur directly within zswap.

I'm having trouble parsing this sentence.

> Consequently, this change allows for some simplification of the
> writeback function, taking into account the updated workflow.

How about:

zswap_writeback_entry() used to be a callback for the backends, which
don't know about struct zswap_entry.

Now that the only user is the generic zswap LRU reclaimer, it can be
simplified: pass the pinned zswap_entry directly, and consolidate the
refcount management in the shrink function.

> Signed-off-by: Domenico Cerasuolo <cerasuolodomenico@xxxxxxxxx>
> ---
> mm/zswap.c | 69 ++++++++++++++----------------------------------------
> 1 file changed, 17 insertions(+), 52 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 2831bf56b168..ef8604812352 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -250,7 +250,8 @@ static bool zswap_has_pool;
> pr_debug("%s pool %s/%s\n", msg, (p)->tfm_name, \
> zpool_get_type((p)->zpool))
>
> -static int zswap_writeback_entry(struct zpool *pool, unsigned long handle);
> +static int zswap_writeback_entry(struct zswap_entry *entry, struct zswap_header *zhdr,
> + struct zswap_tree *tree);
> static int zswap_pool_get(struct zswap_pool *pool);
> static void zswap_pool_put(struct zswap_pool *pool);
>
> @@ -632,15 +633,21 @@ static int zswap_shrink(struct zswap_pool *pool)
> }
> spin_unlock(&tree->lock);
>
> - ret = zswap_writeback_entry(pool->zpool, lru_entry->handle);
> + ret = zswap_writeback_entry(lru_entry, zhdr, tree);
>
> spin_lock(&tree->lock);
> if (ret) {
> spin_lock(&pool->lru_lock);
> list_move(&lru_entry->lru, &pool->lru);
> spin_unlock(&pool->lru_lock);

This could use a comment.

/* Writeback failed, put entry back on LRU */

> + zswap_entry_put(tree, tree_entry);

This put is a common factor in both branches, so you can consolidate.

> + } else {
> + /* free the local reference */
> + zswap_entry_put(tree, tree_entry);
> + /* free the entry if it's not been invalidated*/

Missing space between text and */

> + if (lru_entry == zswap_rb_search(&tree->rbroot, swpoffset))
> + zswap_entry_put(tree, tree_entry);
> }
> - zswap_entry_put(tree, tree_entry);
> spin_unlock(&tree->lock);

The success path freeing (hopefully the common path) is now
unfortunately hidden in fairly deep indentation. I think this would be
better with a goto for the error case.

All together, something like this?

if (ret) {
/* Writeback failed, put entry back on LRU */
...
goto put_unlock;
}

/*
* Writeback started successfully, the page now belongs to the
* swapcache. Drop the base ref from the tree - unless invalidate
* already took it out while we had the tree->lock released for IO.
*/
if (lru_entry == zswap_rb_search(&tree->rb_root, swpoffset))
zswap_entry_put(tree, entry);

put_unlock:
/* Drop local reference */
zswap_entry_put(tree, tree_entry);
spin_unlock(&tree->lock);

return ret ? -EAGAIN : 0;

Btw, it's unsettling that we drop the tree reference without
explicitly removing the entry for the tree. We rely on the final put
that frees the entry to do tree removal, but this doesn't happen if
somebody else is holding a reference; the entry can remain in the tree
long after we've officially handed over ownership of the data to
swapcache. TTBOMK there are currently no bugs because of that, but
it's a data corruption waiting to happen.

This should be:

/*
* Writeback started successfully, the page now belongs to the
* swapcache. Drop the entry from zswap - unless invalidate already
* took it out while we had the tree->lock released for IO.
*/
if (entry == zswap_rb_search(&tree->rb_root, swpoffset))
zswap_invalidate_entry(tree, entry);

Would you care to send a follow-up patch?