Re: [External] [RFC] Analyzing zpool allocators / Removing zbud and z3fold

From: Zhongkun He
Date: Thu Feb 22 2024 - 01:46:39 EST


On Fri, Feb 9, 2024 at 11:28 AM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
>
> Hey folks,
>
> This is a follow-up on my previously sent RFC patch to deprecate
> z3fold [1]. This is an RFC without code; I thought I could get some
> discussion going before writing (or rather deleting) more code. I went
> back to do some analysis on the 3 zpool allocators: zbud, zsmalloc,
> and z3fold.
>
> [1] https://lore.kernel.org/linux-mm/20240112193103.3798287-1-yosryahmed@xxxxxxxxxx/
>
> In this analysis, for each of the allocators I ran a kernel build test
> on tmpfs in a limited cgroup 5 times and captured:
> (a) The build times.
> (b) zswap_load() and zswap_store() latencies using bpftrace.
> (c) The maximum size of the zswap pool from /proc/meminfo::Zswapped.
>
> Here are the results I have. I am using zsmalloc as the base for all
> comparisons.
>
> -------------------------------- <Results> --------------------------------
>
> (a) Build times
>
> *** zsmalloc ***
> ───────────────────────────────────────────────────────────
> LABEL │ MIN      │ MAX      │ MEAN     │ MEDIAN   │ STDDEV
> ──────┼──────────┼──────────┼──────────┼──────────┼────────
> real  │  108.890 │  116.160 │  111.304 │  110.310 │   2.719
> sys   │ 6838.860 │ 7137.830 │ 6936.414 │ 6862.160 │ 114.860
> user  │ 2838.270 │ 2859.050 │ 2850.116 │ 2852.590 │   7.388
> ───────────────────────────────────────────────────────────
>
> *** zbud ***
> ───────────────────────────────────────────────────────────
> LABEL │ MIN      │ MAX      │ MEAN     │ MEDIAN   │ STDDEV
> ──────┼──────────┼──────────┼──────────┼──────────┼────────
> real  │  105.540 │  114.430 │  108.738 │  108.140 │   3.027
> sys   │ 6553.680 │ 6794.330 │ 6688.184 │ 6661.840 │  86.471
> user  │ 2836.390 │ 2847.850 │ 2842.952 │ 2843.450 │   3.721
> ───────────────────────────────────────────────────────────
>
> *** z3fold ***
> ───────────────────────────────────────────────────────────
> LABEL │ MIN      │ MAX      │ MEAN     │ MEDIAN   │ STDDEV
> ──────┼──────────┼──────────┼──────────┼──────────┼────────
> real  │  113.020 │  118.110 │  114.642 │  114.010 │   1.803
> sys   │ 7168.860 │ 7284.900 │ 7243.930 │ 7265.290 │  42.254
> user  │ 2865.630 │ 2869.840 │ 2868.208 │ 2868.710 │   1.625
> ───────────────────────────────────────────────────────────
>
> Comparing the means, zbud is 2.3% faster, and z3fold is 3% slower.
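
As a quick sanity check of the comparison above, here is a minimal Python
sketch that recomputes the relative difference of the mean "real" build
times against zsmalloc. The numbers are copied from the quoted tables; the
names MEAN_REAL and relative_change are only illustrative.

```python
# Mean "real" build times, copied from the tables quoted above.
MEAN_REAL = {"zsmalloc": 111.304, "zbud": 108.738, "z3fold": 114.642}


def relative_change(value, base):
    """Percent change of value relative to base (negative means faster)."""
    return (value - base) / base * 100.0


base = MEAN_REAL["zsmalloc"]
for name in ("zbud", "z3fold"):
    change = relative_change(MEAN_REAL[name], base)
    print(f"{name}: {change:+.1f}% vs zsmalloc")
```

The same helper can be pointed at the sys or user rows to see whether the
difference comes mostly from kernel time.
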
>
> (b) zswap_load() and zswap_store() latencies
>
> *** zsmalloc ***
>
> @load_ns:
> [128, 256) 377 | |
> [256, 512) 772 | |
> [512, 1K) 923 | |
> [1K, 2K) 22141 | |
> [2K, 4K) 88297 | |
> [4K, 8K) 1685833 |@@@@@ |
> [8K, 16K) 17087712 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [16K, 32K) 10875077 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
> [32K, 64K) 777656 |@@ |
> [64K, 128K) 127239 | |
> [128K, 256K) 50301 | |
> [256K, 512K) 1669 | |
> [512K, 1M) 37 | |
> [1M, 2M) 3 | |
>
> @store_ns:
> [512, 1K) 279 | |
> [1K, 2K) 15969 | |
> [2K, 4K) 193446 | |
> [4K, 8K) 823283 | |
> [8K, 16K) 14209844 |@@@@@@@@@@@ |
> [16K, 32K) 62040863 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [32K, 64K) 9737713 |@@@@@@@@ |
> [64K, 128K) 1278302 |@ |
> [128K, 256K) 487285 | |
> [256K, 512K) 4406 | |
> [512K, 1M) 117 | |
> [1M, 2M) 24 | |
>
> *** zbud ***
>
> @load_ns:
> [128, 256) 452 | |
> [256, 512) 834 | |
> [512, 1K) 998 | |
> [1K, 2K) 22708 | |
> [2K, 4K) 171247 | |
> [4K, 8K) 2853227 |@@@@@@@@ |
> [8K, 16K) 17727445 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [16K, 32K) 9523050 |@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
> [32K, 64K) 752423 |@@ |
> [64K, 128K) 135560 | |
> [128K, 256K) 52360 | |
> [256K, 512K) 4071 | |
> [512K, 1M) 57 | |
>
> @store_ns:
> [512, 1K) 518 | |
> [1K, 2K) 13337 | |
> [2K, 4K) 193043 | |
> [4K, 8K) 846118 | |
> [8K, 16K) 15240682 |@@@@@@@@@@@@@ |
> [16K, 32K) 60945786 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [32K, 64K) 10230719 |@@@@@@@@ |
> [64K, 128K) 1612647 |@ |
> [128K, 256K) 498344 | |
> [256K, 512K) 8550 | |
> [512K, 1M) 199 | |
> [1M, 2M) 1 | |
>
> *** z3fold ***
>
> @load_ns:
> [128, 256) 344 | |
> [256, 512) 999 | |
> [512, 1K) 859 | |
> [1K, 2K) 21069 | |
> [2K, 4K) 53704 | |
> [4K, 8K) 1351571 |@@@@ |
> [8K, 16K) 14142680 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [16K, 32K) 11788684 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
> [32K, 64K) 1133377 |@@@@ |
> [64K, 128K) 121670 | |
> [128K, 256K) 68663 | |
> [256K, 512K) 120 | |
> [512K, 1M) 21 | |
>
> @store_ns:
> [512, 1K) 257 | |
> [1K, 2K) 10162 | |
> [2K, 4K) 149599 | |
> [4K, 8K) 648121 | |
> [8K, 16K) 9115497 |@@@@@@@@ |
> [16K, 32K) 56467456 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [32K, 64K) 16235236 |@@@@@@@@@@@@@@ |
> [64K, 128K) 1397437 |@ |
> [128K, 256K) 705916 | |
> [256K, 512K) 3087 | |
> [512K, 1M) 62 | |
> [1M, 2M) 1 | |
>
> I did not perform any sophisticated analysis on these histograms, but
> eyeballing them makes it clear that all allocators have somewhat
> similar latencies. zbud is slightly better than zsmalloc, and z3fold
> is slightly worse than zsmalloc. This corresponds naturally to the
> build times in (a).
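
One rough way to put a number on the eyeballing above, as a sketch only:
take the @load_ns histograms and compute the share of loads that landed in
the 16K ns bucket or higher. The counts are transcribed from the quoted
bpftrace output; the 16K threshold and the names below are my own
assumptions, not part of the original analysis.

```python
# @load_ns sample counts per power-of-two bucket, transcribed from the
# histograms quoted above. Index i covers [128 << i, 128 << (i + 1)) ns.
LOAD_COUNTS = {
    "zsmalloc": [377, 772, 923, 22141, 88297, 1685833, 17087712,
                 10875077, 777656, 127239, 50301, 1669, 37, 3],
    "zbud": [452, 834, 998, 22708, 171247, 2853227, 17727445,
             9523050, 752423, 135560, 52360, 4071, 57],
    "z3fold": [344, 999, 859, 21069, 53704, 1351571, 14142680,
               11788684, 1133377, 121670, 68663, 120, 21],
}

FIRST_BUCKET_NS = 128  # lower bound of the first bucket


def share_at_or_above(counts, threshold_ns):
    """Fraction of samples whose bucket lower bound is >= threshold_ns."""
    total = sum(counts)
    slow = sum(
        count
        for i, count in enumerate(counts)
        if (FIRST_BUCKET_NS << i) >= threshold_ns
    )
    return slow / total


for name, counts in LOAD_COUNTS.items():
    share = share_at_or_above(counts, 16 * 1024)
    print(f"{name}: {share:.1%} of loads took 16K ns or longer")
```

A lower share means more loads finished in the faster buckets. Since the
buckets are powers of two this is coarse, so it can only confirm the
ordering, not give precise latencies.
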
>
> (c) Maximum size of the zswap pool
>
> *** zsmalloc ***
> 1,137,659,904 bytes = ~1.13G
>
> *** zbud ***
> 1,535,741,952 bytes = ~1.5G
>
> *** z3fold ***
> 1,151,303,680 bytes = ~1.15G
>
> zbud consumes ~35% more memory, and z3fold consumes ~1.2% more
> memory. This makes sense because zbud only stores a maximum of two
> compressed pages on each order-0 page, regardless of the compression
> ratio, so it is bound to consume more memory.
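
For reference, a minimal sketch of the same comparison computed from the
raw byte counts quoted above rather than from the rounded G figures, since
rounding before dividing can shift the percentages. POOL_BYTES and
extra_memory_pct are illustrative names.

```python
# Maximum zswap pool sizes in bytes, copied from section (c) quoted above.
POOL_BYTES = {
    "zsmalloc": 1_137_659_904,
    "zbud": 1_535_741_952,
    "z3fold": 1_151_303_680,
}


def extra_memory_pct(name, base="zsmalloc"):
    """Percent of additional pool memory used by `name` relative to `base`."""
    return (POOL_BYTES[name] - POOL_BYTES[base]) / POOL_BYTES[base] * 100.0


for name in ("zbud", "z3fold"):
    print(f"{name}: {extra_memory_pct(name):+.1f}% pool size vs zsmalloc")
```
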
>
> -------------------------------- </Results> --------------------------------
>
> According to those results, it seems like zsmalloc is superior to
> z3fold in both efficiency and latency. Zbud has a small latency
> advantage, but that comes with a huge cost in terms of memory
> consumption. Moreover, most known users of zswap are currently using
> zsmalloc. Perhaps some folks are using zbud because it was the default
> allocator up until recently. The only known disadvantage of zsmalloc
> is the dependency on MMU.
>
> Based on that, I think it doesn't make sense to keep all 3 allocators
> going forward. I believe we should start with removing either zbud or
> z3fold, leaving only one allocator supporting MMU. Once zsmalloc
> supports !MMU (if possible), we can keep zsmalloc as the only
> allocator.

Hi Yosry, that sounds great to me.

I was reviewing the allocator code recently and couldn't find any
advantage to z3fold, even before doing any performance testing.

It would be better to have only one allocator, which would simplify
the code and the interface.

>
> Thoughts and feedback are highly appreciated. I tried to CC all the
> interested folks, but others feel free to chime in.
>