[GIT PULL] percpu changes for v6.6-rc1

From: Dennis Zhou
Date: Wed Aug 30 2023 - 20:10:14 EST


Hi Linus,

There is one larger change to percpu_counter's API: multiple counters
can now be initialized and destroyed together via
percpu_counter_init_many() and percpu_counter_destroy_many(). This is
used to begin remediating a performance regression with percpu rss stats.

Additionally, it seems machines with larger core counts are feeling the
burden of percpu's single-threaded allocation path. Mateusz is thinking
about it and I will spend some time on it too.

Thanks,
Dennis

The following changes since commit 5d0c230f1de8c7515b6567d9afba1f196fb4e2f4:

  Linux 6.5-rc4 (2023-07-30 13:23:47 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu.git tags/percpu-for-6.6

for you to fetch changes up to 14ef95be6f5558fb9e43aaf06ef9a1d6e0cae6c8:

  kernel/fork: group allocation/free of per-cpu counters for mm struct (2023-08-25 08:10:35 -0700)

----------------------------------------------------------------
percpu: changes for v6.6

percpu
* A couple of cleanups by Baoquan He and Bibo Mao. The only behavior
  change is to start printing messages if we're under the warn limit
  for failed atomic allocations.

percpu_counter
* Shakeel introduced percpu counters into mm_struct, which put percpu
  allocations on the hot path [1]. Originally I spent some time
  trying to improve the percpu allocator, but instead preferred what
  Mateusz Guzik proposed: grouping at the allocation site with
  percpu_counter_init_many(). This allows a single percpu allocation to
  be shared by the counters. I like this approach because it gives the
  allocations a shared lifetime. Additionally, I believe many inits
  have higher level synchronization requirements, like percpu_counter
  does against HOTPLUG_CPU. Therefore we can group these optimizations
  together.

[1] https://lore.kernel.org/linux-mm/20221024052841.3291983-1-shakeelb@xxxxxxxxxx/
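
To illustrate the grouping described above, here is a minimal userspace
sketch in C. It is not the kernel implementation: struct counter_model,
NR_CPUS_MODEL and the counter_model_*() helpers are hypothetical names
made up for this example, and calloc()/free() stand in for the percpu
allocator. It only shows the idea that one allocation backs every
counter in the group, so they are set up and torn down together.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fixed CPU count for this userspace model. */
#define NR_CPUS_MODEL 4

/* Hypothetical stand-in for a per-CPU counter; not the kernel struct. */
struct counter_model {
	int64_t count;   /* folded global value */
	int32_t *percpu; /* this counter's slice of the shared block */
};

/* Initialize nr counters backed by ONE allocation instead of nr. */
static int counter_model_init_many(struct counter_model *c, int64_t amount,
				   size_t nr)
{
	int32_t *block;
	size_t i;

	if (nr == 0)
		return -1;

	block = calloc(nr * NR_CPUS_MODEL, sizeof(*block));
	if (!block)
		return -1;

	for (i = 0; i < nr; i++) {
		c[i].count = amount;
		c[i].percpu = block + i * NR_CPUS_MODEL;
	}
	return 0;
}

/* Free the shared block once; counter 0 holds the base pointer. */
static void counter_model_destroy_many(struct counter_model *c, size_t nr)
{
	size_t i;

	if (nr == 0 || !c[0].percpu)
		return;

	free(c[0].percpu);
	for (i = 0; i < nr; i++)
		c[i].percpu = NULL;
}

/* Add delta to the slot that belongs to the given (modelled) CPU. */
static void counter_model_add(struct counter_model *c, int cpu, int32_t delta)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	c->percpu[cpu] += delta;
}

/* Sum the global value plus every per-CPU slot. */
static int64_t counter_model_sum(const struct counter_model *c)
{
	int64_t sum = c->count;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		sum += c->percpu[cpu];
	return sum;
}
```

In this model a caller would declare an array such as
struct counter_model rss[4], call counter_model_init_many(rss, 0, 4)
once, and later release everything with
counter_model_destroy_many(rss, 4). The counters cannot outlive one
another, which is the shared lifetime mentioned above.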

----------------------------------------------------------------
Baoquan He (3):
      mm/percpu.c: remove redundant check
      mm/percpu.c: optimize the code in pcpu_setup_first_chunk() a little bit
      mm/percpu.c: print error message too if atomic alloc failed

Bibo Mao (1):
      mm/percpu: Remove some local variables in pcpu_populate_pte

Mateusz Guzik (2):
      pcpcntr: add group allocation/free
      kernel/fork: group allocation/free of per-cpu counters for mm struct

 include/linux/percpu_counter.h | 41 ++++++++++++++++++++-----
 kernel/fork.c                  | 15 +++------
 lib/percpu_counter.c           | 62 +++++++++++++++++++++++++------------
 mm/percpu.c                    | 69 +++++++++++++++++-------------------------
 4 files changed, 109 insertions(+), 78 deletions(-)