[PATCH v9 00/13] rcu: call_rcu() power improvements

From: Joel Fernandes (Google)
Date: Sun Oct 16 2022 - 12:23:47 EST


This is v9 of the RCU lazy patches, based on the rcu/next branch.
The only change since v8 comes from this discussion:
https://lore.kernel.org/rcu/20221011180142.2742289-1-joel@xxxxxxxxxxxxxxxxx/T/#m8eff15110477f3430b3b02561b66f7b0d34a73b0

To make merging easier, I dropped the tracing and other patches and
implemented only the new changes. I will post the tracing patches later, along
with rcutop, as I need to add the new tracepoints that Frederic suggested.

Main recent changes:
1. rcu_barrier() wake up only for lazy bypass list.
2. Make all call_rcu() default-lazy and add call_rcu_flush() API.
3. Take care of some callers using call_rcu_flush() API.
4. Several refactorings suggested by Paul/Frederic.
5. New call_rcu() to call_rcu_flush() conversions by Joel/Vlad/Paul.

I am seeing good performance and power with these patches on real ChromeOS x86
asymmetric hardware.

Earlier cover letter with lots of details is here:
https://lore.kernel.org/all/20220901221720.1105021-1-joel@xxxxxxxxxxxxxxxxx/

List of recent changes:

[ Frederic Weisbecker: Program the lazy timer only if WAKE_NOT, since other
deferral levels wake much earlier, so it is not needed for those. ]

[ Frederic Weisbecker: Use flush flags to keep bypass API code clean. ]

[ Frederic Weisbecker: Make rcu_barrier() wake up only if main list empty. ]

[ Frederic Weisbecker: Remove extra 'else if' branch in rcu_nocb_try_bypass(). ]

[ Joel: Fix issue where I was not resetting lazy_len after moving it to rdp ]

[ Paul/Thomas/Joel: Make call_rcu() default lazy so users don't mess up. ]

[ Paul/Frederic : Cosmetic changes, split out wakeup of nocb thread. ]

[ Vlad/Joel : More call_rcu -> flush conversions ]

[ debug code for detecting "wake" in kernel's call_rcu() callbacks. ]

The following two scripts can be used to check whether any callbacks in the
kernel do a wakeup. The check is best effort and may miss some cases, but we
found issues using it.

1. Script to search for call_rcu() references and dump the callback list to a file:

#!/bin/bash

# Start from an empty list on every run.
rm -f func-list
touch func-list

# Scan all C sources and headers, skipping any path that contains "rcu".
for f in $(find . \( -name "*.c" -o -name "*.h" \) | grep -v rcu); do

	# Capture the callback (second) argument of each call_rcu() call.
	funcs=$(perl -0777 -ne 'while(m/call_rcu\([&]?.+,\s?(.+)\).*;/g){print "$1\n";}' "$f")

	if [ -n "$funcs" ]; then
		for func in $funcs; do
			echo "$f $func" >> func-list
			echo "$f $func"
		done
	fi

done

sort -u func-list | tee func-list-sorted

2. Script to search "wake" after callback references:

#!/bin/bash

# Each input line is "<file> <callback>", as written by the first script.
while read -r fl; do
	file=$(echo "$fl" | cut -d " " -f1)
	func=$(echo "$fl" | cut -d " " -f2)

	# Flag the callback if "wake" appears within 30 lines after any reference to it.
	if grep -A 30 -- "$func" "$file" | grep -q wake; then
		echo "keyword wake found after function reference $func in $file"
		echo "Output:"
		grep -A 30 -- "$func" "$file"
		echo "==========================================================="
	fi
done < func-list-sorted
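
To show how the check in script 2 behaves, here is a small sketch that applies
the same grep pipeline to two throwaway files. The file names waker.c and
quiet.c, the callback names foo_free_cb and bar_free_cb, and the scan() helper
are hypothetical and used only for illustration.

```shell
#!/bin/bash
# Hypothetical sample files, created only for this demonstration.
tmp=$(mktemp -d)

cat > "$tmp/waker.c" <<'EOF'
static void foo_free_cb(struct rcu_head *rhp)
{
	wake_up(&foo_wq);
}
EOF

cat > "$tmp/quiet.c" <<'EOF'
static void bar_free_cb(struct rcu_head *rhp)
{
	kfree(container_of(rhp, struct bar, rcu));
}
EOF

# Same check as script 2: does "wake" appear within 30 lines after a reference?
# $1 is the file to scan, $2 is the callback name (hypothetical helper).
scan() {
	if grep -A 30 -- "$2" "$1" | grep -q wake; then
		echo "wake found"
	else
		echo "no wake"
	fi
}

waker_result=$(scan "$tmp/waker.c" foo_free_cb)
quiet_result=$(scan "$tmp/quiet.c" bar_free_cb)
echo "waker.c: $waker_result"
echo "quiet.c: $quiet_result"

rm -rf "$tmp"
```

Because grep -A 30 matches every reference to the name, including the
call_rcu() call site itself, a "wake" that merely sits within 30 lines of the
call site is also reported, so each hit still needs a manual look.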

Frederic Weisbecker (1):
  rcu: Fix missing nocb gp wake on rcu_barrier()

Joel Fernandes (Google) (9):
  rcu: Make call_rcu() lazy to save power
  rcu: Refactor code a bit in rcu_nocb_do_flush_bypass()
  rcuscale: Add laziness and kfree tests
  percpu-refcount: Use call_rcu_flush() for atomic switch
  rcu/sync: Use call_rcu_flush() instead of call_rcu
  rcu/rcuscale: Use call_rcu_flush() for async reader test
  rcu/rcutorture: Use call_rcu_flush() where needed
  rxrpc: Use call_rcu_flush() instead of call_rcu()
  rcu/debug: Add wake-up debugging for lazy callbacks

Uladzislau Rezki (2):
  scsi/scsi_error: Use call_rcu_flush() instead of call_rcu()
  workqueue: Make queue_rcu_work() use call_rcu_flush()

Vineeth Pillai (1):
  rcu: shrinker for lazy rcu

 drivers/scsi/scsi_error.c |   2 +-
 include/linux/rcupdate.h  |   7 ++
 kernel/rcu/Kconfig        |  15 +++
 kernel/rcu/lazy-debug.h   | 154 +++++++++++++++++++++++++++
 kernel/rcu/rcu.h          |   8 ++
 kernel/rcu/rcuscale.c     |  70 +++++++++++-
 kernel/rcu/rcutorture.c   |  16 +--
 kernel/rcu/sync.c         |   2 +-
 kernel/rcu/tiny.c         |   2 +-
 kernel/rcu/tree.c         | 149 ++++++++++++++++++--------
 kernel/rcu/tree.h         |  12 ++-
 kernel/rcu/tree_exp.h     |   2 +-
 kernel/rcu/tree_nocb.h    | 217 ++++++++++++++++++++++++++++++++------
 kernel/workqueue.c        |   2 +-
 lib/percpu-refcount.c     |   3 +-
 net/rxrpc/conn_object.c   |   2 +-
 16 files changed, 565 insertions(+), 98 deletions(-)
 create mode 100644 kernel/rcu/lazy-debug.h

--
2.38.0.413.g74048e4d9e-goog