Re: [PATCH] zsmalloc: move LRU update from zs_map_object() to zs_malloc()

From: Sergey Senozhatsky
Date: Tue May 09 2023 - 20:39:37 EST


On (23/05/05 11:50), Nhat Pham wrote:
[..]
> zswap_frontswap_store()            shrink_worker()
>   zs_malloc()                        zs_zpool_shrink()
>     spin_lock(&pool->lock)             zs_reclaim_page()
>     zspage = find_get_zspage()
>     spin_unlock(&pool->lock)
>                                          spin_lock(&pool->lock)
>                                          zspage = list_first_entry(&pool->lru)
>                                          list_del(&zspage->lru)
>                                          zspage->lru.next = LIST_POISON1
>                                          zspage->lru.prev = LIST_POISON2
>                                          spin_unlock(&pool->lock)
>   zs_map_object()
>     spin_lock(&pool->lock)
>     if (!list_empty(&zspage->lru))
>       list_del(&zspage->lru)
>         CHECK_DATA_CORRUPTION(next == LIST_POISON1) /* BOOM */
>
> With the current upstream code, this issue rarely happens. zswap only
> triggers writeback when the pool is already full, at which point all
> further store attempts are short-circuited. This creates an implicit
> pseudo-serialization between reclaim and store. I am working on a new
> zswap shrinking mechanism, which makes interleaving reclaim and store
> more likely, exposing this bug.
>
> zbud and z3fold do not have this problem, because they perform the LRU
> list update in the alloc function, while still holding the pool's lock.
> This patch fixes the aforementioned bug by moving the LRU update back to
> zs_malloc(), analogous to zbud and z3fold.
>
> Suggested-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> Signed-off-by: Nhat Pham <nphamcs@xxxxxxxxx>
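
For anyone skimming the archive: the "LRU update in the alloc path, under
the pool lock" pattern that zbud/z3fold use, and that this patch adopts for
zsmalloc, boils down to the shape below. This is a minimal userspace sketch
with simplified structures and pthread locking, not the actual kernel code;
the point is that both sides touch the LRU list only while holding the pool
lock, so reclaim can never unlink and poison an entry in a window where the
store side still expects it to be live.

/*
 * Minimal userspace sketch (NOT kernel code): the alloc path inserts
 * into the LRU under the same pool lock that the reclaim path takes
 * before unlinking, so neither side ever sees the other's half-updated
 * (or poisoned) list pointers.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct lru_entry {
	struct lru_entry *next, *prev;
};

struct pool {
	pthread_mutex_t lock;
	struct lru_entry lru;		/* circular list head, newest first */
};

static void pool_init(struct pool *p)
{
	pthread_mutex_init(&p->lock, NULL);
	p->lru.next = p->lru.prev = &p->lru;
}

static void lru_add(struct pool *p, struct lru_entry *e)
{
	e->next = p->lru.next;
	e->prev = &p->lru;
	p->lru.next->prev = e;
	p->lru.next = e;
}

static void lru_del(struct lru_entry *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = e->prev = NULL;	/* stand-in for LIST_POISON[12] */
}

/* store/alloc side: the LRU update happens here, under the pool lock */
static struct lru_entry *pool_alloc(struct pool *p)
{
	struct lru_entry *e = malloc(sizeof(*e));

	pthread_mutex_lock(&p->lock);
	lru_add(p, e);
	pthread_mutex_unlock(&p->lock);
	return e;
}

/* reclaim side: unlink the oldest entry under the same lock */
static struct lru_entry *pool_reclaim(struct pool *p)
{
	struct lru_entry *victim = NULL;

	pthread_mutex_lock(&p->lock);
	if (p->lru.prev != &p->lru) {
		victim = p->lru.prev;
		lru_del(victim);
	}
	pthread_mutex_unlock(&p->lock);
	return victim;
}

int main(void)
{
	struct pool p;

	pool_init(&p);
	struct lru_entry *a = pool_alloc(&p);
	struct lru_entry *b = pool_alloc(&p);
	struct lru_entry *v = pool_reclaim(&p);	/* oldest entry, i.e. 'a' */

	printf("reclaimed oldest: %s\n", v == a ? "yes" : "no");
	free(v);
	free(b);
	return 0;
}

Pre-patch zsmalloc, by contrast, did the list_empty()/list_del() dance in
zs_map_object(), after the alloc side had already dropped the pool lock
once, which is exactly the window the race diagram above exploits.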

Reviewed-by: Sergey Senozhatsky <senozhatsky@xxxxxxxxxxxx>