Re: [PATCH v2] mm/slub: disable slab merging in the default configuration

From: David Rientjes
Date: Sun Jul 09 2023 - 04:56:01 EST


On Thu, 6 Jul 2023, David Rientjes wrote:

> On Mon, 3 Jul 2023, David Rientjes wrote:
>
> > hackbench
>
> Running hackbench on Skylake with v6.1.30 (A) and v6.1.30 + your patch
> (B), for example:
>
> LABEL | COUNT | MIN | MAX | MEAN | MEDIAN | STDDEV | DIRECTION
> --------------------------------+-------+------------+------------+------------+------------+-----------+----------------
> SReclaimable | | | | | | |
> (A) v6.1.30 | 11 | 129480.000 | 233208.000 | 189936.364 | 204316.000 | 31465.625 |
> (B) <same sha> | 11 | 139084.000 | 236772.000 | 198931.273 | 213672.000 | 30013.204 |
> | | +7.42% | +1.53% | +4.74% | +4.58% | -4.62% | <not defined>
> SUnreclaim | | | | | | |
> (A) v6.1.30 | 11 | 305400.000 | 538744.000 | 422148.000 | 449344.000 | 65005.045 |
> (B) <same sha> | 11 | 305780.000 | 518300.000 | 422219.636 | 450252.000 | 61245.137 |
> | | +0.12% | -3.79% | +0.02% | +0.20% | -5.78% | <not defined>
>
> Amount of reclaimable slab significantly increases which is likely not a
> problem because, well, it's reclaimable. But I suspect we'll find other
> interesting data points with the other suggested benchmarks.
>
> And benchmark results:
>
> LABEL | COUNT | MIN | MAX | MEAN | MEDIAN | STDDEV | DIRECTION
> --------------------------------+-------+------------+------------+------------+------------+-----------+----------------
> hackbench_process_pipes_234 | | | | | | |
> (A) v6.1.30 | 7 | 1.735 | 1.979 | 1.831 | 1.835 | 0.086291 |
> (B) <same sha> | 7 | 1.687 | 2.023 | 1.886 | 1.911 | 0.10276 |
> | | -2.77% | +2.22% | +3.00% | +4.14% | +19.09% | <not defined>
> hackbench_process_pipes_max | | | | | | |
> (A) v6.1.30 | 7 | 1.735 | 1.979 | 1.831 | 1.835 | 0.086291 |
> (B) <same sha> | 7 | 1.687 | 2.023 | 1.886 | 1.911 | 0.10276 |
> | | -2.77% | +2.22% | +3.00% | +4.14% | +19.09% | - is good
> hackbench_process_sockets_234 | | | | | | |
> (A) v6.1.30 | 7 | 7.883 | 7.909 | 7.899 | 7.899 | 0.0087808 |
> (B) <same sha> | 7 | 7.872 | 7.961 | 7.907 | 7.904 | 0.028019 |
> | | -0.14% | +0.66% | +0.10% | +0.06% | +219.09% | <not defined>
> hackbench_process_sockets_max | | | | | | |
> (A) v6.1.30 | 7 | 7.883 | 7.909 | 7.899 | 7.899 | 0.0087808 |
> (B) <same sha> | 7 | 7.872 | 7.961 | 7.907 | 7.904 | 0.028019 |
> | | -0.14% | +0.66% | +0.10% | +0.06% | +219.09% | - is good
> hackbench_thread_pipes_234 | | | | | | |
> (A) v6.1.30 | 7 | 2.146 | 2.677 | 2.410 | 2.418 | 0.18143 |
> (B) <same sha> | 7 | 2.016 | 2.514 | 2.268 | 2.241 | 0.17474 |
> | | -6.06% | -6.09% | -5.88% | -7.32% | -3.69% | <not defined>
> hackbench_thread_pipes_max | | | | | | |
> (A) v6.1.30 | 7 | 2.146 | 2.677 | 2.410 | 2.418 | 0.18143 |
> (B) <same sha> | 7 | 2.016 | 2.514 | 2.268 | 2.241 | 0.17474 |
> | | -6.06% | -6.09% | -5.88% | -7.32% | -3.69% | - is good
> hackbench_thread_sockets_234 | | | | | | |
> (A) v6.1.30 | 7 | 8.025 | 8.127 | 8.084 | 8.085 | 0.029755 |
> (B) <same sha> | 7 | 7.990 | 8.093 | 8.042 | 8.035 | 0.035152 |
> | | -0.44% | -0.42% | -0.53% | -0.62% | +18.14% | <not defined>
> hackbench_thread_sockets_max | | | | | | |
> (A) v6.1.30 | 7 | 8.025 | 8.127 | 8.084 | 8.085 | 0.029755 |
> (B) <same sha> | 7 | 7.990 | 8.093 | 8.042 | 8.035 | 0.035152 |
> | | -0.44% | -0.42% | -0.53% | -0.62% | +18.14% | - is good

My takeaway from running half a dozen benchmarks on Intel is that
performance is more impacted than slab memory usage. There are slight
regressions in memory usage, but they are only measurable for SReclaimable,
which is the better place for them to land (as opposed to SUnreclaim).
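
For anyone reproducing the memory numbers: SReclaimable and SUnreclaim are
fields of /proc/meminfo, reported in kB. Below is a minimal Python sketch of
pulling them out of a sample. It assumes the tables were built from periodic
/proc/meminfo samples, which this mail does not state; the helper name
slab_kb and the sample values are purely illustrative.

```python
def slab_kb(meminfo_text):
    """Return the SReclaimable and SUnreclaim values (kB) from /proc/meminfo text."""
    wanted = {"SReclaimable", "SUnreclaim"}
    found = {}
    for line in meminfo_text.splitlines():
        # Each line looks like "SReclaimable:     200000 kB".
        name, sep, rest = line.partition(":")
        if sep and name in wanted:
            found[name] = int(rest.split()[0])
    return found


# Illustrative input only; these are not measured values.
SAMPLE = """\
Slab:             600000 kB
SReclaimable:     200000 kB
SUnreclaim:       400000 kB
"""

print(slab_kb(SAMPLE))
```

On a live system the same function can be fed the contents of /proc/meminfo,
sampled at whatever interval the benchmark harness uses.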

There are some substantial performance degradations, most notably
context_switch1_per_thread_ops which regressed ~21%. I'll need to repeat
that test to confirm it and can also try on cascadelake if it reproduces.
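
For reference, the percentages in the tables are consistent with a plain
relative change against the (A) baseline. Here is a short sketch of that
arithmetic, using the context_switch1_per_thread_ops values from the
will-it-scale table; the function names are illustrative and this is not the
tooling that generated the tables.

```python
def pct_change(baseline, candidate):
    """Relative change of candidate versus baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


def is_regression(baseline, candidate, higher_is_better):
    """True if the candidate moved in the 'bad' direction for this metric."""
    delta = pct_change(baseline, candidate)
    return delta < 0 if higher_is_better else delta > 0


# context_switch1_per_thread_ops: (A) 324721, (B) 255999, "+ is good".
delta = pct_change(324721.0, 255999.0)
print(f"{delta:+.2f}%")  # -21.16%
print(is_regression(324721.0, 255999.0, higher_is_better=True))  # True
```

With a COUNT of 1 there is no spread to compare against, which is why a
repeat run is needed before trusting the figure.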

There are also smaller redis, specjbb, and will-it-scale regressions which
don't look terribly concerning.

I'll try running performance tests on AMD Zen3 and also ARM with
PAGE_SIZE == 4KB and 64KB.

Unixbench memory usage and performance are within +/- 1% for every metric,
so they are not presented here.
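
The +/- 1% cutoff can be sketched as a simple filter on the change in mean.
The example uses the hackbench SReclaimable and SUnreclaim means quoted
earlier in this mail. Whether the cutoff is strict or inclusive is an
assumption here, as is the helper name outside_noise.

```python
def outside_noise(rows, threshold_pct=1.0):
    """Keep rows whose mean moved by more than threshold_pct versus baseline.

    rows is a list of (label, baseline_mean, candidate_mean) tuples.
    Returns (label, percent change rounded to two decimals) for kept rows.
    """
    kept = []
    for label, base, cand in rows:
        delta = (cand - base) / base * 100.0
        if abs(delta) > threshold_pct:  # strict comparison is an assumption
            kept.append((label, round(delta, 2)))
    return kept


# hackbench slab memory means: (A) v6.1.30 versus (B) with the patch.
rows = [
    ("SReclaimable", 189936.364, 198931.273),
    ("SUnreclaim", 422148.000, 422219.636),
]
print(outside_noise(rows))  # [('SReclaimable', 4.74)]
```

SUnreclaim moves by about +0.02%, so it drops out, while SReclaimable at
+4.74% is kept.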

Full results for Skylake, removing results where mean is +/- 1% of
baseline:

============================== MEMORY USAGE ==============================

hackbench
LABEL | COUNT | MIN | MAX | MEAN | MEDIAN | STDDEV | DIRECTION
--------------------------------+-------+------------+------------+------------+------------+-----------+----------------
SReclaimable | | | | | | |
(A) v6.1.30 | 11 | 129480.000 | 233208.000 | 189936.364 | 204316.000 | 31465.625 |
(B) v6.1.30 slab_nomerge | 11 | 139084.000 | 236772.000 | 198931.273 | 213672.000 | 30013.204 |
| | +7.42% | +1.53% | +4.74% | +4.58% | -4.62% | - is good

redis
LABEL | COUNT | MIN | MAX | MEAN | MEDIAN | STDDEV | DIRECTION
-------------------------------+-------+------------+------------+------------+------------+-----------+----------------
SReclaimable | | | | | | |
(A) v6.1.30 | 298 | 137056.000 | 238664.000 | 226005.477 | 226940.000 | 8109.328 |
(B) v6.1.30 slab_nomerge | 302 | 139664.000 | 242664.000 | 229096.689 | 230098.000 | 8215.134 |
| | +1.90% | +1.68% | +1.37% | +1.39% | +1.30% | - is good

specjbb2015
LABEL | COUNT | MIN | MAX | MEAN | MEDIAN | STDDEV | DIRECTION
-----------------------------------+-------+------------+------------+------------+------------+----------+----------------
SReclaimable | | | | | | |
(A) v6.1.30 | 1602 | 118344.000 | 217932.000 | 203559.618 | 205372.000 | 5314.410 |
(B) v6.1.30 slab_nomerge | 1655 | 128000.000 | 222536.000 | 208099.973 | 209396.000 | 4608.582 |
| | +8.16% | +2.11% | +2.23% | +1.96% | -13.28% | - is good

============================== PERFORMANCE ==============================

hackbench
LABEL | COUNT | MIN | MAX | MEAN | MEDIAN | STDDEV | DIRECTION
--------------------------------+-------+------------+------------+------------+------------+-----------+----------------
hackbench_process_pipes_234 | | | | | | |
(A) v6.1.30 | 7 | 1.735 | 1.979 | 1.831 | 1.835 | 0.086291 |
(B) v6.1.30 slab_nomerge | 7 | 1.687 | 2.023 | 1.886 | 1.911 | 0.10276 |
| | -2.77% | +2.22% | +3.00% | +4.14% | +19.09% | - is good
hackbench_thread_pipes_234 | | | | | | |
(A) v6.1.30 | 7 | 2.146 | 2.677 | 2.410 | 2.418 | 0.18143 |
(B) v6.1.30 slab_nomerge | 7 | 2.016 | 2.514 | 2.268 | 2.241 | 0.17474 |
| | -6.06% | -6.09% | -5.88% | -7.32% | -3.69% | - is good

redis
LABEL | COUNT | MIN | MAX | MEAN | MEDIAN | STDDEV | DIRECTION
-------------------------------+-------+------------+------------+------------+------------+-----------+----------------
redis_medium_max_INCR | | | | | | |
(A) v6.1.30 | 5 | 108695.660 | 112637.980 | 110639.626 | 109757.440 | 1668.190 |
(B) v6.1.30 slab_nomerge | 5 | 101853.740 | 106564.370 | 104166.478 | 104942.800 | 1833.377 |
| | -6.29% | -5.39% | -5.85% | -4.39% | +9.90% | + is good
redis_medium_max_LPOP | | | | | | |
(A) v6.1.30 | 5 | 102944.200 | 108471.630 | 105572.750 | 106303.820 | 2016.986 |
(B) v6.1.30 slab_nomerge | 5 | 101471.340 | 104231.810 | 103361.688 | 104090.770 | 1064.277 |
| | -1.43% | -3.91% | -2.09% | -2.08% | -47.23% | + is good
redis_medium_max_LPUSH | | | | | | |
(A) v6.1.30 | 10 | 99255.590 | 108295.430 | 105960.440 | 106338.120 | 2553.802 |
(B) v6.1.30 slab_nomerge | 10 | 100130.160 | 107032.000 | 104335.070 | 105091.705 | 2169.708 |
| | +0.88% | -1.17% | -1.53% | -1.17% | -15.04% | + is good
redis_medium_max_LRANGE_100 | | | | | | |
(A) v6.1.30 | 5 | 72427.030 | 73046.020 | 72671.814 | 72626.910 | 202.812 |
(B) v6.1.30 slab_nomerge | 5 | 70811.500 | 72030.540 | 71519.286 | 71761.750 | 450.918 |
| | -2.23% | -1.39% | -1.59% | -1.19% | +122.33% | + is good
redis_medium_max_MSET_10 | | | | | | |
(A) v6.1.30 | 5 | 87642.420 | 89798.850 | 89044.390 | 89102.740 | 769.933 |
(B) v6.1.30 slab_nomerge | 5 | 85287.840 | 89758.550 | 87876.598 | 88386.070 | 1641.608 |
| | -2.69% | -0.04% | -1.31% | -0.80% | +113.21% | + is good
redis_medium_max_PING_BULK | | | | | | |
(A) v6.1.30 | 5 | 101729.400 | 108189.980 | 105003.228 | 105307.490 | 2171.756 |
(B) v6.1.30 slab_nomerge | 5 | 100553.050 | 105340.770 | 102561.464 | 101947.190 | 1789.953 |
| | -1.16% | -2.63% | -2.33% | -3.19% | -17.58% | + is good
redis_medium_max_PING_INLINE | | | | | | |
(A) v6.1.30 | 5 | 102522.050 | 107503.770 | 105209.902 | 106033.300 | 1981.499 |
(B) v6.1.30 slab_nomerge | 5 | 97541.950 | 107319.170 | 103729.414 | 104854.780 | 3304.256 |
| | -4.86% | -0.17% | -1.41% | -1.11% | +66.76% | + is good
redis_medium_max_SET | | | | | | |
(A) v6.1.30 | 5 | 105663.570 | 112283.850 | 108917.118 | 109469.070 | 2663.234 |
(B) v6.1.30 slab_nomerge | 5 | 103071.540 | 106723.590 | 105128.226 | 106179.660 | 1666.892 |
| | -2.45% | -4.95% | -3.48% | -3.00% | -37.41% | + is good
redis_medium_max_SPOP | | | | | | |
(A) v6.1.30 | 5 | 104079.940 | 107238.610 | 105140.616 | 104964.840 | 1150.370 |
(B) v6.1.30 slab_nomerge | 5 | 102637.790 | 103885.300 | 103343.934 | 103412.620 | 437.159 |
| | -1.39% | -3.13% | -1.71% | -1.48% | -62.00% | + is good
redis_small_max_INCR | | | | | | |
(A) v6.1.30 | 5 | 98814.230 | 114942.530 | 107744.856 | 108813.920 | 6150.540 |
(B) v6.1.30 slab_nomerge | 5 | 99800.400 | 109529.020 | 104451.708 | 104058.270 | 3732.461 |
| | +1.00% | -4.71% | -3.06% | -4.37% | -39.31% | + is good
redis_small_max_LPOP | | | | | | |
(A) v6.1.30 | 5 | 104275.290 | 118764.840 | 108648.192 | 106951.880 | 5208.918 |
(B) v6.1.30 slab_nomerge | 5 | 97560.980 | 115074.800 | 103120.496 | 99800.400 | 6353.203 |
| | -6.44% | -3.11% | -5.09% | -6.69% | +21.97% | + is good
redis_small_max_LRANGE_100 | | | | | | |
(A) v6.1.30 | 5 | 67980.970 | 72992.700 | 71589.644 | 72150.070 | 1832.810 |
(B) v6.1.30 slab_nomerge | 5 | 64977.260 | 72046.110 | 70273.716 | 71684.590 | 2680.854 |
| | -4.42% | -1.30% | -1.84% | -0.65% | +46.27% | + is good
redis_small_max_MSET_10 | | | | | | |
(A) v6.1.30 | 5 | 90497.730 | 106044.540 | 100756.422 | 102880.660 | 5455.768 |
(B) v6.1.30 slab_nomerge | 5 | 97276.270 | 106951.880 | 102818.856 | 102880.660 | 3293.135 |
| | +7.49% | +0.86% | +2.05% | +0.00% | -39.64% | + is good
redis_small_max_PING_INLINE | | | | | | |
(A) v6.1.30 | 5 | 96153.850 | 108459.870 | 102493.414 | 102459.020 | 4995.757 |
(B) v6.1.30 slab_nomerge | 5 | 84317.030 | 116144.020 | 99995.920 | 98039.220 | 11045.861 |
| | -12.31% | +7.08% | -2.44% | -4.31% | +121.10% | + is good
redis_small_max_SADD | | | | | | |
(A) v6.1.30 | 5 | 106044.540 | 115606.940 | 109804.052 | 110375.270 | 3451.251 |
(B) v6.1.30 slab_nomerge | 5 | 95693.780 | 109769.480 | 102329.518 | 102249.490 | 4602.161 |
| | -9.76% | -5.05% | -6.81% | -7.36% | +33.35% | + is good
redis_small_max_SET | | | | | | |
(A) v6.1.30 | 5 | 91911.760 | 116686.120 | 104509.200 | 102354.150 | 8993.532 |
(B) v6.1.30 slab_nomerge | 5 | 100502.520 | 113636.370 | 108815.700 | 109649.120 | 4750.002 |
| | +9.35% | -2.61% | +4.12% | +7.13% | -47.18% | + is good
redis_small_max_SPOP | | | | | | |
(A) v6.1.30 | 5 | 96899.230 | 108695.650 | 103648.652 | 104931.800 | 3901.567 |
(B) v6.1.30 slab_nomerge | 5 | 93457.940 | 108108.110 | 101680.560 | 101626.020 | 5096.944 |
| | -3.55% | -0.54% | -1.90% | -3.15% | +30.64% | + is good

specjbb2015
LABEL | COUNT | MIN | MAX | MEAN | MEDIAN | STDDEV | DIRECTION
-----------------------------------+-------+------------+------------+------------+------------+----------+----------------
specjbb2015_single_Critical_JOPS | | | | | | |
(A) v6.1.30 | 1 | 46294.000 | 46294.000 | 46294.000 | 46294.000 | 0 |
(B) v6.1.30 slab_nomerge | 1 | 46167.000 | 46167.000 | 46167.000 | 46167.000 | 0 |
| | -0.27% | -0.27% | -0.27% | -0.27% | --- | + is good
specjbb2015_single_Max_JOPS | | | | | | |
(A) v6.1.30 | 1 | 68842.000 | 68842.000 | 68842.000 | 68842.000 | 0 |
(B) v6.1.30 slab_nomerge | 1 | 67801.000 | 67801.000 | 67801.000 | 67801.000 | 0 |
| | -1.51% | -1.51% | -1.51% | -1.51% | --- | + is good

vm-scalability
LABEL | COUNT | MIN | MAX | MEAN | MEDIAN | STDDEV | DIRECTION
---------------------------------------+-------+-----------------+-----------------+-----------------+-----------------+---------------+------------
300s_128G_truncate_throughput | | | | | | |
(A) v6.1.30 | 15 | 16398714804.000 | 17010339870.000 | 16772025703.867 | 16834675132.000 | 232697088.501 |
(B) v6.1.30 slab_nomerge | 15 | 16704416343.000 | 17271437122.000 | 16948419991.200 | 16821799877.000 | 233146680.475 |
| | +1.86% | +1.53% | +1.05% | -0.08% | +0.19% | + is good
300s_512G_anon_wx_rand_mt_throughput | | | | | | |
(A) v6.1.30 | 15 | 7198561.000 | 7359712.000 | 7263944.200 | 7259418.000 | 50394.115 |
(B) v6.1.30 slab_nomerge | 15 | 7191842.000 | 7628158.000 | 7390629.000 | 7407204.000 | 171602.612 |
| | -0.09% | +3.65% | +1.74% | +2.04% | +240.52% | + is good

will-it-scale
LABEL | COUNT | MIN | MAX | MEAN | MEDIAN | STDDEV | DIRECTION
-----------------------------------+-------+--------------+--------------+--------------+--------------+-----------+----------------
context_switch1_per_thread_ops | | | | | | |
(A) v6.1.30 | 1 | 324721.000 | 324721.000 | 324721.000 | 324721.000 | 0 |
(B) v6.1.30 slab_nomerge | 1 | 255999.000 | 255999.000 | 255999.000 | 255999.000 | 0 |
!! REGRESSED !! | | -21.16% | -21.16% | -21.16% | -21.16% | --- | + is good
getppid1_scalability | | | | | | |
(A) v6.1.30 | 1 | 0.71943 | 0.71943 | 0.71943 | 0.71943 | 0 |
(B) v6.1.30 slab_nomerge | 1 | 0.70923 | 0.70923 | 0.70923 | 0.70923 | 0 |
| | -1.42% | -1.42% | -1.42% | -1.42% | --- | + is good
mmap1_scalability | | | | | | |
(A) v6.1.30 | 1 | 0.18831 | 0.18831 | 0.18831 | 0.18831 | 0 |
(B) v6.1.30 slab_nomerge | 1 | 0.18413 | 0.18413 | 0.18413 | 0.18413 | 0 |
| | -2.22% | -2.22% | -2.22% | -2.22% | --- | + is good
poll2_scalability | | | | | | |
(A) v6.1.30 | 1 | 0.45608 | 0.45608 | 0.45608 | 0.45608 | 0 |
(B) v6.1.30 slab_nomerge | 1 | 0.44207 | 0.44207 | 0.44207 | 0.44207 | 0 |
| | -3.07% | -3.07% | -3.07% | -3.07% | --- | + is good
pthread_mutex1_scalability | | | | | | |
(A) v6.1.30 | 1 | 0.45207 | 0.45207 | 0.45207 | 0.45207 | 0 |
(B) v6.1.30 slab_nomerge | 1 | 0.44194 | 0.44194 | 0.44194 | 0.44194 | 0 |
| | -2.24% | -2.24% | -2.24% | -2.24% | --- | + is good
pthread_mutex2_per_process_ops | | | | | | |
(A) v6.1.30 | 1 | 36292960.000 | 36292960.000 | 36292960.000 | 36292960.000 | 0 |
(B) v6.1.30 slab_nomerge | 1 | 35446930.000 | 35446930.000 | 35446930.000 | 35446930.000 | 0 |
| | -2.33% | -2.33% | -2.33% | -2.33% | --- | + is good
signal1_scalability | | | | | | |
(A) v6.1.30 | 1 | 0.55541 | 0.55541 | 0.55541 | 0.55541 | 0 |
(B) v6.1.30 slab_nomerge | 1 | 0.54773 | 0.54773 | 0.54773 | 0.54773 | 0 |
| | -1.38% | -1.38% | -1.38% | -1.38% | --- | + is good
unix1_scalability | | | | | | |
(A) v6.1.30 | 1 | 0.55085 | 0.55085 | 0.55085 | 0.55085 | 0 |
(B) v6.1.30 slab_nomerge | 1 | 0.53957 | 0.53957 | 0.53957 | 0.53957 | 0 |
| | -2.05% | -2.05% | -2.05% | -2.05% | --- | + is good