Re: [PATCH v6 10/12] crypto: x86/aes - Prepare for a new AES implementation

From: Chang S. Bae
Date: Mon May 08 2023 - 20:55:45 EST


On 5/5/2023 4:27 PM, Eric Biggers wrote:
> On Mon, Apr 10, 2023 at 03:59:34PM -0700, Chang S. Bae wrote:
>> Refactor the common C code to avoid code duplication. The AES-NI code uses
>> it with a function pointer argument to call back the AES-NI assembly code.
>> So will the AES-KL code.
>
> Actually, the AES-NI XTS glue code currently makes direct calls to the assembly
> code. This patch changes it to make indirect calls. Indirect calls are very
> expensive these days, partly due to all the speculative execution mitigations.
> So this patch likely causes a performance regression. How about making
> xts_crypt_common() and xts_setkey_common() be inline functions?
>
> I guess this series is relevant:
> https://lore.kernel.org/lkml/20201222160629.22268-1-ardb@xxxxxxxxxx/#t

Yeah, inlining those looks to be just cut-and-paste work. Then I was curious about the performance impact.

So I picked one of my old machines and quickly ran through these cases:

$ cryptsetup benchmark -c aes-xts -s 256

# Tests are approximate using memory only (no storage IO).
# Algorithm |  Key |   Encryption |   Decryption

Upstream (6.4-rc1)
    aes-xts   256b   3949.3 MiB/s   4014.2 MiB/s
    aes-xts   256b   4016.1 MiB/s   4011.6 MiB/s
    aes-xts   256b   4026.2 MiB/s   4018.4 MiB/s
    aes-xts   256b   4009.2 MiB/s   4006.9 MiB/s
    aes-xts   256b   4025.0 MiB/s   4016.4 MiB/s

Upstream + V6
    aes-xts   256b   3876.1 MiB/s   3963.6 MiB/s
    aes-xts   256b   3926.3 MiB/s   3984.2 MiB/s
    aes-xts   256b   3940.8 MiB/s   3961.2 MiB/s
    aes-xts   256b   3929.7 MiB/s   3984.7 MiB/s
    aes-xts   256b   3892.5 MiB/s   3942.5 MiB/s

Upstream + V6 + inlined helpers
    aes-xts   256b   3996.9 MiB/s   4085.4 MiB/s
    aes-xts   256b   4087.6 MiB/s   4104.9 MiB/s
    aes-xts   256b   4117.9 MiB/s   4130.2 MiB/s
    aes-xts   256b   4098.4 MiB/s   4120.6 MiB/s
    aes-xts   256b   4095.5 MiB/s   4111.5 MiB/s

Okay, I'm a bit more convinced about this inlining. I will note these numbers along with the change.
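For the record, the shape I have in mind is roughly the below, in the shared header (a sketch only; the exact names and signatures may end up different in the actual patch):

/* aes-intel_glue.h (sketch) */
typedef void (*xts_crypt_fn)(const struct crypto_aes_ctx *ctx, u8 *out,
			     const u8 *in, unsigned int len, u8 *iv);

static inline int xts_crypt_common(struct skcipher_request *req,
				   xts_crypt_fn crypt_fn)
{
	int err;

	/* ... the common skcipher_walk/tweak logic, invoking crypt_fn() ... */

	return err;
}

/* aesni-intel_glue.c (sketch) */
static int xts_encrypt(struct skcipher_request *req)
{
	/*
	 * With xts_crypt_common() inlined here, the compiler can resolve
	 * crypt_fn to a direct call to aesni_xts_encrypt() -- no
	 * retpoline-hardened indirect branch at runtime.
	 */
	return xts_crypt_common(req, aesni_xts_encrypt);
}

That devirtualization is presumably where the recovered throughput above comes from.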

> Another issue with having the above be exported symbols is that their names are
> too generic, so they could easily collide with other symbols in the kernel.
> To be exported symbols, they would need something x86-specific in their names.

That's another good point, though it goes away on its own: once the helpers are moved into the header and inlined, they no longer need to be exported at all.
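That is, the exports like

EXPORT_SYMBOL_GPL(xts_setkey_common);
EXPORT_SYMBOL_GPL(xts_crypt_common);

(or however v6 spells them) simply get dropped; a static inline definition in the header emits no symbol, so there is no kernel-wide name left to collide with or to prefix.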

>>  arch/x86/crypto/Makefile           |   2 +-
>>  arch/x86/crypto/aes-intel_asm.S    |  26 ++++
>>  arch/x86/crypto/aes-intel_glue.c   | 127 ++++++++++++++++
>>  arch/x86/crypto/aes-intel_glue.h   |  44 ++++++
>>  arch/x86/crypto/aesni-intel_asm.S  |  58 +++----
>>  arch/x86/crypto/aesni-intel_glue.c | 235 +++++++++--------------------
>>  arch/x86/crypto/aesni-intel_glue.h |  17 +++

> It's confusing having aes-intel, aesni-intel, *and* aeskl-intel. Maybe call the
> first one "aes-helpers" or "aes-common" instead?

Yeah, I can see a few existing files named with "helpers". So, maybe s/aes-intel/aes-helpers/.

>> +struct aes_xts_ctx {
>> +	u8 raw_tweak_ctx[sizeof(struct crypto_aes_ctx)] AES_ALIGN_ATTR;
>> +	u8 raw_crypt_ctx[sizeof(struct crypto_aes_ctx)] AES_ALIGN_ATTR;
>> +};
>
> This struct does not make sense. It should look like:
>
> struct aes_xts_ctx {
> 	struct crypto_aes_ctx tweak_ctx AES_ALIGN_ATTR;
> 	struct crypto_aes_ctx crypt_ctx AES_ALIGN_ATTR;
> };
>
> The runtime alignment to a 16-byte boundary should happen when translating the
> raw crypto_skcipher_ctx() into the pointer to the aes_xts_ctx. It should not
> happen when accessing each individual field in the aes_xts_ctx.

Oh, that's ugly. It came from mindless copy-and-paste here. I guess the fix could be a standalone patch, or it can be folded into this rework.
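Something like the below should do it, i.e. align the pointer once, at the point where the raw context is translated (a sketch, assuming the AES_ALIGN definition from this series):

static inline struct aes_xts_ctx *aes_xts_ctx(struct crypto_skcipher *tfm)
{
	/* Align the raw skcipher context once; the fields need no casts. */
	return PTR_ALIGN(crypto_skcipher_ctx(tfm), AES_ALIGN);
}

Then every user just dereferences ctx->tweak_ctx and ctx->crypt_ctx directly.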

>>  /*
>> - * int aesni_set_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
>> - *		      unsigned int key_len)
>> + * int _aesni_set_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
>> + *		       unsigned int key_len)
>>  */
>
> It's conventional to use two leading underscores, not one.

Yes, will fix.

Thanks,
Chang