[RFC PATCH 0/7] crypto: x86 - fix RCU stalls

From: Robert Elliott
Date: Thu Oct 06 2022 - 18:34:02 EST


This series attempts to fix the RCU stalls triggered
by the x86 crypto drivers discussed in
https://lore.kernel.org/all/MW5PR84MB18426EBBA3303770A8BC0BDFAB759@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

The following x86 drivers already enforce a 4 KiB limit on the data
processed per FPU section, using either the SZ_4K macro or a literal
4096:
blake2s, chacha, nhpoly1305, poly1305, polyval
I've included a patch to make them all use the same macro name.

These drivers have no such limit today, so I've included patches
to add one:
sha*, crc*, sm3, ghash
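
For readers unfamiliar with the pattern, the limit amounts to splitting a long request into bounded pieces and leaving the FPU section between pieces, since kernel_fpu_begin() disables preemption until kernel_fpu_end(). The following is only a minimal userspace sketch of that loop, not code from the patches: FPU_CHUNK_BYTES, fake_fpu_begin(), fake_fpu_end(), fake_simd_update() and chunked_update() are hypothetical names, the begin/end stubs merely count sections, and a byte sum stands in for the real SIMD transform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro name; the series chooses its own common macro. */
#define FPU_CHUNK_BYTES 4096U

/* Userspace stand-ins for kernel_fpu_begin()/kernel_fpu_end(). */
static unsigned long fpu_sections;	/* number of sections opened */
static int fpu_depth;			/* 1 while inside a section */

static void fake_fpu_begin(void)
{
	fpu_depth++;
	fpu_sections++;
}

static void fake_fpu_end(void)
{
	fpu_depth--;
}

/* Placeholder for a SIMD transform: a plain byte sum. */
static uint32_t fake_simd_update(uint32_t state, const uint8_t *data,
				 size_t len)
{
	assert(fpu_depth == 1);	/* must run inside a section */
	while (len--)
		state += *data++;
	return state;
}

/* Process len bytes, never more than FPU_CHUNK_BYTES per section. */
static uint32_t chunked_update(uint32_t state, const uint8_t *data,
			       size_t len)
{
	while (len) {
		size_t chunk = len < FPU_CHUNK_BYTES ? len : FPU_CHUNK_BYTES;

		fake_fpu_begin();
		state = fake_simd_update(state, data, chunk);
		fake_fpu_end();	/* preemption would be possible here */

		data += chunk;
		len -= chunk;
	}
	return state;
}
```

With a 10000-byte buffer this loop opens three sections (4096 + 4096 + 1808 bytes) instead of one long one, and a zero-length request opens none.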

I originally encountered some RCU stalls with tcrypt in aesni:
tcrypt: testing encryption speed of sync skcipher cts(cbc(aes)) using cts(cbc(aes-aesni))
tcrypt: testing encryption speed of sync skcipher cfb(aes) using cfb(aes-aesni)
tcrypt: testing decryption speed of sync skcipher cfb(aes) using cfb(aes-aesni)
but I don't see any problems in that driver's source code, so no
patch is proposed for it yet.

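As a sanity check, I deliberately inserted various errors into the
changes; every driver then failed its self-tests or hung during boot,
so the self-tests do exercise the new code and the unmodified changes
seem functionally correct. I haven't done comprehensive tests of
different data sizes and alignments, so please consider this an RFC.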

I added some counters (not posted) to the drivers to observe
their behavior. During boot, the finup function is actually
called much more often than update and handles nearly all of the
data: about 1500 finup calls covering about 2 GiB vs. 174 update
calls covering about 23 KiB. The patch breaks that into half a
million 4 KiB chunks:

/sys/module/sha512_ssse3/parameters/rob_call_finup:1541
/sys/module/sha512_ssse3/parameters/rob_call_finup_fpu:469325
/sys/module/sha512_ssse3/parameters/rob_call_update:174
/sys/module/sha512_ssse3/parameters/rob_call_update_fpu:32
/sys/module/sha512_ssse3/parameters/rob_len_finup:2123048456
/sys/module/sha512_ssse3/parameters/rob_len_update:24120


Robert Elliott (7):
  rcu: correct CONFIG_EXT_RCU_CPU_STALL_TIMEOUT descriptions
  crypto: x86/sha - limit FPU preemption
  crypto: x86/crc - limit FPU preemption
  crypto: x86/sm3 - limit FPU preemption
  crypto: x86/ghash - restructure FPU context saving
  crypto: x86/ghash - limit FPU preemption
  crypto: x86 - use common macro for FPU limit

 Documentation/RCU/stallwarn.rst            |  9 +++---
 arch/x86/crypto/blake2s-glue.c             |  7 +++--
 arch/x86/crypto/chacha_glue.c              |  4 ++-
 arch/x86/crypto/crc32-pclmul_glue.c        | 18 ++++++++---
 arch/x86/crypto/crc32c-intel_glue.c        | 32 ++++++++++++++++----
 arch/x86/crypto/crct10dif-pclmul_glue.c    | 32 ++++++++++++++++----
 arch/x86/crypto/ghash-clmulni-intel_glue.c | 32 +++++++++++++++-----
 arch/x86/crypto/nhpoly1305-avx2-glue.c     |  3 +-
 arch/x86/crypto/nhpoly1305-sse2-glue.c     |  4 ++-
 arch/x86/crypto/poly1305_glue.c            | 25 +++++++++-------
 arch/x86/crypto/polyval-clmulni_glue.c     |  5 ++--
 arch/x86/crypto/sha1_ssse3_glue.c          | 34 +++++++++++++++++----
 arch/x86/crypto/sha256_ssse3_glue.c        | 35 ++++++++++++++++++----
 arch/x86/crypto/sha512_ssse3_glue.c        | 35 ++++++++++++++++++----
 arch/x86/crypto/sm3_avx_glue.c             | 28 +++++++++++++----
 kernel/rcu/Kconfig.debug                   |  2 +-
 16 files changed, 237 insertions(+), 68 deletions(-)

--
2.37.3