[PATCH v2 0/9] swapin refactor for optimization and unified readahead

From: Kairui Song
Date: Tue Jan 02 2024 - 12:54:01 EST


From: Kairui Song <kasong@xxxxxxxxxxx>

This series is rebased on the latest mm-stable to avoid conflicts.

This series tries to unify and clean up the swapin path, introduce minor
optimizations, and make both shmem and swapoff use the SWP_SYNCHRONOUS_IO
flag to skip readahead and the swap cache for better performance.
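
For context, the anonymous page fault path already has such a fast path.
A simplified sketch of the SWP_SYNCHRONOUS_IO check in do_swap_page()
(mm/memory.c; locking and error handling trimmed, not the exact upstream
code):

        struct swap_info_struct *si = swp_swap_info(entry);

        if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
            __swap_count(entry) == 1) {
                /*
                 * Synchronous device (e.g. ZRAM): allocate a folio
                 * and read the entry into it directly, bypassing
                 * both readahead and the swap cache.
                 */
                folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0,
                                        vma, vmf->address, false);
        } else {
                /* Otherwise go through readahead and the swap cache. */
                page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE, vmf);
        }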

1. Some benchmarks for dropping readahead and the swap cache for shmem with ZRAM:

- Single file sequential read:
perf stat --repeat 20 dd if=/tmpfs/test of=/dev/null bs=1M count=8192
(/tmpfs/test is a zero-filled file, using brd as swap, 4G memcg limit;
times in seconds)
Before: 22.248 +- 0.549
After: 22.021 +- 0.684 (-1.1%)

- Random read stress test:
fio -name=tmpfs --numjobs=16 --directory=/tmpfs \
--size=256m --ioengine=mmap --rw=randread --random_distribution=random \
--time_based --ramp_time=1m --runtime=5m --group_reporting
(using brd as swap, 2G memcg limit)

Before: 1818MiB/s
After: 1888MiB/s (+3.85%)

- Zipf-biased random read stress test:
fio -name=tmpfs --numjobs=16 --directory=/tmpfs \
--size=256m --ioengine=mmap --rw=randread --random_distribution=zipf:1.2 \
--time_based --ramp_time=1m --runtime=5m --group_reporting
(using brd as swap, 2G memcg limit)

Before: 31.1GiB/s
After: 32.3GiB/s (+3.86%)

Previously, shmem always used cluster readahead. It doesn't help much
even for sequential reads, and for the random stress tests the
performance is better without it. In practice, due to memory and swap
fragmentation, cluster readahead is less helpful for ZRAM.

2. A micro benchmark which uses madvise to swap out 10G of zero-filled
data to ZRAM, then reads it back in, shows a performance gain for the
swapin path:

Before: 11143285 us
After: 10692644 us (4.1% faster)
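
A minimal userspace sketch of such a benchmark (assuming
madvise(MADV_PAGEOUT) forces the swapout; this is not the exact harness
that produced the numbers above):

        #define _GNU_SOURCE
        #include <stdio.h>
        #include <string.h>
        #include <sys/mman.h>
        #include <sys/time.h>

        #define SIZE (10UL << 30) /* 10G of anonymous memory */

        int main(void)
        {
                struct timeval t0, t1;
                char *buf = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

                if (buf == MAP_FAILED)
                        return 1;

                memset(buf, 0, SIZE);             /* populate zero-filled data */
                madvise(buf, SIZE, MADV_PAGEOUT); /* push it out to swap (ZRAM) */

                gettimeofday(&t0, NULL);
                for (unsigned long i = 0; i < SIZE; i += 4096)
                        (void)*(volatile char *)(buf + i); /* fault back in */
                gettimeofday(&t1, NULL);

                printf("swapin: %ld us\n",
                       (t1.tv_sec - t0.tv_sec) * 1000000L +
                       (t1.tv_usec - t0.tv_usec));
                return 0;
        }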

3. Swapping off a 10G ZRAM device:

Before:
time swapoff /dev/zram0
real 0m12.337s
user 0m0.001s
sys 0m12.329s

After:
time swapoff /dev/zram0
real 0m9.728s
user 0m0.001s
sys 0m9.719s

This also cleans up the path, making it possible to apply a
per-swap-device readahead policy to all swapin paths.
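
As an illustration of what that unified policy could look like (the
shape below is hypothetical; the real swapin_entry() interface is
defined in the patches, and swapin_direct() here stands in for the
no-readahead helper):

        static struct folio *swapin_entry(swp_entry_t entry, gfp_t gfp,
                                          struct vm_fault *vmf)
        {
                struct swap_info_struct *si = swp_swap_info(entry);
                struct page *page;

                /* The per-device policy is decided in one place: */
                if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
                    __swap_count(entry) == 1)
                        return swapin_direct(entry, gfp, vmf); /* no readahead */

                page = swapin_readahead(entry, gfp, vmf);
                return page ? page_folio(page) : NULL;
        }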

V1: https://lkml.org/lkml/2023/11/19/296
Updates from V1:
- Rebased onto mm-unstable.
- Removed behaviour-changing patches; they will be submitted in a
separate series later.
- Code style, naming and comment updates.
- Thanks to Chris Li for very detailed and helpful review of V1. Thanks
to Matthew Wilcox and Huang Ying for helpful suggestions.

Kairui Song (9):
mm/swapfile.c: add back some comment
mm/swap: move no readahead swapin code to a stand-alone helper
mm/swap: avoid doing extra unlock error checks for direct swapin
mm/swap: always account swapped in page into current memcg
mm/swap: introduce swapin_entry for unified readahead policy
mm/swap: also handle swapcache lookup in swapin_entry
mm/swap: avoid a duplicated swap cache lookup for SWP_SYNCHRONOUS_IO
mm/swap: introduce a helper for swapin without vmfault
swap, shmem: use new swapin helper and skip readahead conditionally

mm/memory.c | 74 +++++++-------------------
mm/shmem.c | 67 +++++++++++------------
mm/swap.h | 39 ++++++++++----
mm/swap_state.c | 138 +++++++++++++++++++++++++++++++++++++++++-------
mm/swapfile.c | 32 +++++------
5 files changed, 218 insertions(+), 132 deletions(-)

--
2.43.0