[PATCH v3 0/3] cgroup/rstat: Reduce cpu_lock hold time in cgroup_rstat_flush_locked()

From: Waiman Long
Date: Fri Nov 03 2023 - 23:14:21 EST


v3:
- Minor comment tweaks as suggested by Yosry.
- Add patches 2 and 3 to further reduce lock hold time.

The purpose of this patch series is to reduce the cpu_lock hold time
in cgroup_rstat_flush_locked() so as to reduce the latency impact when
cgroup_rstat_updated() is called, as the two may contend with each
other on the cpu_lock.

A parallel kernel build on a 2-socket x86-64 server is used as the
benchmarking tool for measuring the lock hold time. Below is the lock
hold time frequency distribution before and after applying successive
patches of the series:

Hold time   Before patch      Patch 1   Patches 1-2   Patches 1-3
---------   ------------   ----------   -----------   -----------
0-01 us          804,139   13,738,708    14,594,545    15,484,707
01-05 us       9,772,767    1,177,194       439,926       207,382
05-10 us       4,595,028        4,984         5,960         3,174
10-15 us         303,481        3,562         3,543         3,006
15-20 us          78,971        1,314         1,397         1,066
20-25 us          24,583           18            25            15
25-30 us           6,908           12            12            10
30-40 us           8,015
40-50 us           2,192
50-60 us             316
60-70 us              43
70-80 us               7
80-90 us               2
>90 us                 3
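
For anyone who wants to collect a comparable distribution, the
bucketing used in the table above can be sketched roughly as below.
This is only an illustration and not the instrumentation that produced
these numbers: hold_time_bucket() and record_hold_time() are
hypothetical helpers, the input is assumed to be a hold time in
nanoseconds, and each range is assumed to be half-open (a 5 us sample
counts as 05-10 us), which the table itself does not specify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Exclusive upper bounds, in microseconds, of the finite ranges in the
 * table: 0-01, 01-05, 05-10, ..., 80-90. Half-open ranges are an
 * assumption made for this sketch.
 */
static const uint64_t bucket_upper_us[] = {
	1, 5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90,
};

#define NR_FINITE_BUCKETS \
	(sizeof(bucket_upper_us) / sizeof(bucket_upper_us[0]))
#define NR_BUCKETS	(NR_FINITE_BUCKETS + 1)	/* plus ">90 us" */

/* Hypothetical helper: map a hold time in ns to a table row index. */
static size_t hold_time_bucket(uint64_t hold_ns)
{
	size_t i;

	for (i = 0; i < NR_FINITE_BUCKETS; i++) {
		if (hold_ns < bucket_upper_us[i] * 1000)
			return i;
	}
	return NR_FINITE_BUCKETS;	/* the ">90 us" row */
}

/* Hypothetical helper: account one measured hold time. */
static void record_hold_time(uint64_t counts[NR_BUCKETS], uint64_t hold_ns)
{
	size_t idx = hold_time_bucket(hold_ns);

	assert(idx < NR_BUCKETS);
	counts[idx]++;
}
```

A caller would take a timestamp right after acquiring the lock and
another right before releasing it, then pass the difference to
record_hold_time(). An in-kernel version would more likely use per-cpu
counters so that the accounting does not add contention of its own.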

Waiman Long (3):
  cgroup/rstat: Reduce cpu_lock hold time in cgroup_rstat_flush_locked()
  cgroup/rstat: Optimize cgroup_rstat_updated_list()
  cgroup: Avoid false cacheline sharing of read mostly rstat_cpu

 include/linux/cgroup-defs.h |  14 ++++
 kernel/cgroup/rstat.c       | 129 +++++++++++++++++++++---------------
 2 files changed, 89 insertions(+), 54 deletions(-)

--
2.39.3