[PATCH v2 0/2] mm/zswap: optimize for dynamic zswap_pools

From: Chengming Zhou
Date: Wed Feb 14 2024 - 03:55:23 EST




---
Changes in v2:
- fix build error when !CONFIG_MEMCG_KMEM.
- make zswap struct static and fix some error paths, per Yosry.
- add another shrink_lock to protect zswap.next_shrink, per Yosry.
- keep "WARN_ON(percpu_ref_tryget(&pool->ref))" in pool release path
  for debug, per Nhat.
- improve the commit messages.
- Link to v1: https://lore.kernel.org/r/20240210-zswap-global-lru-v1-0-853473d7b0da@xxxxxxxxxxxxx

Dynamic zswap_pool creation has been supported for a long time, although
it may not be used much in practice. But with the per-memcg lru merged,
the current structure of the zswap_pool's lru and shrinker has become
less optimal.

In the current structure, each zswap_pool has its own lru, shrinker and
shrink_work, but only the latest zswap_pool is the one currently in use.

1. Under memory pressure, the shrinkers of all zswap_pools try to
   shrink their own lru lists, with no ordering between them.

2. When the zswap limit is hit, only the latest zswap_pool's shrink_work
   tries to shrink its own lru, which is inefficient.

A more natural approach is to have a global zswap lru, and likewise a
global shrinker, shared by all zswap_pools. The code becomes much
simpler too.
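
To illustrate the idea only, here is a minimal userspace sketch of one
LRU shared by several pools. It is not the code in this series: the
names (struct pool, struct global_lru, lru_add, lru_shrink_one) are
hypothetical, and locking and the per-memcg lists are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; not the kernel's zswap structures. */
struct pool {
	int id;
	int nr_entries;			/* entries still stored in this pool */
};

struct entry {
	struct pool *pool;		/* back-pointer to the owning pool */
	struct entry *prev, *next;	/* links in the single global LRU */
};

/* One LRU for all pools: head is the oldest entry, tail the newest. */
struct global_lru {
	struct entry *head, *tail;
};

/* Append an entry of any pool at the tail (most recently used end). */
static void lru_add(struct global_lru *lru, struct pool *pool,
		    struct entry *e)
{
	e->pool = pool;
	e->next = NULL;
	e->prev = lru->tail;
	if (lru->tail)
		lru->tail->next = e;
	else
		lru->head = e;
	lru->tail = e;
	pool->nr_entries++;
}

/* Evict the oldest entry regardless of its pool; NULL if LRU is empty. */
static struct entry *lru_shrink_one(struct global_lru *lru)
{
	struct entry *e = lru->head;

	if (!e)
		return NULL;
	lru->head = e->next;
	if (lru->head)
		lru->head->prev = NULL;
	else
		lru->tail = NULL;
	e->prev = e->next = NULL;
	assert(e->pool->nr_entries > 0);
	e->pool->nr_entries--;
	return e;
}
```

In this model, entries from an old pool and from the current pool are
evicted strictly by age, which is the ordering that separate per-pool
lrus cannot provide.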

Another optimization is changing the zswap_pool kref to a percpu_ref,
since a reference on the pool is taken by every zswap entry. This
scales better.
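
As a rough model of why a per-CPU counter scales better than a single
shared one, here is a single-threaded userspace sketch. It is not the
kernel percpu_ref implementation: struct sketch_ref, NR_SLOTS and the
sketch_ref_*() helpers are made up for illustration, and there is no
real concurrency handling.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_SLOTS 4	/* stand-in for the number of CPUs (assumption) */

/* Hypothetical model: per-slot deltas while live, one counter after kill. */
struct sketch_ref {
	long slot[NR_SLOTS];	/* per-"CPU" deltas; a slot may go negative */
	long shared;		/* authoritative count once dying is set */
	bool dying;
};

static void sketch_ref_init(struct sketch_ref *r)
{
	for (size_t i = 0; i < NR_SLOTS; i++)
		r->slot[i] = 0;
	r->shared = 1;		/* initial reference held by the owner */
	r->dying = false;
}

/* While live, a get touches only the caller's slot: no shared write. */
static void sketch_ref_get(struct sketch_ref *r, size_t cpu)
{
	if (!r->dying)
		r->slot[cpu % NR_SLOTS]++;
	else
		r->shared++;
}

/* Returns true only if this put dropped the last reference. */
static bool sketch_ref_put(struct sketch_ref *r, size_t cpu)
{
	if (!r->dying) {
		r->slot[cpu % NR_SLOTS]--;
		return false;	/* the initial reference is still held */
	}
	return --r->shared == 0;
}

/* Fold the slots into the shared count and drop the initial reference. */
static bool sketch_ref_kill(struct sketch_ref *r)
{
	for (size_t i = 0; i < NR_SLOTS; i++) {
		r->shared += r->slot[i];
		r->slot[i] = 0;
	}
	r->dying = true;
	return --r->shared == 0;
}
```

In this sketch a get and its matching put may hit different slots, so
only the sum over all slots is meaningful, and zero can only be
detected after the kill step has folded the slots together.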

Testing: kernel build (32 threads) in tmpfs with memory.max=2GB, with
the zswap shrinker and writeback enabled and one 50GB swapfile, on a
128-CPU x86-64 machine. The numbers below are the average of 5 runs.

        mm-unstable     zswap-global-lru
real    63.20           63.12
user    1061.75         1062.95
sys     268.74          264.44

To: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
To: Johannes Weiner <hannes@xxxxxxxxxxx>
To: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
To: Nhat Pham <nphamcs@xxxxxxxxx>
Cc: linux-mm@xxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx
Signed-off-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>

---
Chengming Zhou (2):
      mm/zswap: global lru and shrinker shared by all zswap_pools
      mm/zswap: change zswap_pool kref to percpu_ref

 mm/zswap.c | 201 ++++++++++++++++++++++++++-----------------------------------
 1 file changed, 87 insertions(+), 114 deletions(-)
---
base-commit: 191d97734e41a5c9f90a2f6636fdd335ae1d435d
change-id: 20240210-zswap-global-lru-94d49316178b

Best regards,
--
Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>