Re: [PATCH v8 12/12] crypto: x86/aes-kl - Implement the AES-XTS algorithm

From: Chang S. Bae
Date: Wed Jun 07 2023 - 18:07:05 EST


On 6/6/2023 10:35 PM, Eric Biggers wrote:
On Sat, Jun 03, 2023 at 08:22:27AM -0700, Chang S. Bae wrote:

Can you also mention why you are doing this? I suppose it might as well be
done, but I'm not seeing how it would actually matter.

Although this crypto implementation runs in kernel mode, userspace can still call it through the crypto API:
https://docs.kernel.org/crypto/userspace-if.html

The AES-KL instructions themselves are also executable in userspace.

Say someone extracts a key handle from the kernel and then tries to decrypt a disk image with it from userspace. This restriction at least prevents that.
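For reference, a minimal userspace sketch in C of how a caller would name an skcipher when binding an AF_ALG socket, as described in the documentation linked above. It only fills in the address structure and does not open a socket. The helper name make_skcipher_addr() is made up for illustration. Passing the generic name 'xts(aes)' lets the kernel pick by priority, while a driver name such as 'xts-aes-aeskl' would request that implementation specifically.

```c
#include <assert.h>
#include <linux/if_alg.h>
#include <string.h>
#include <sys/socket.h>

/* "skcipher" plus its NUL terminator must fit into salg_type. */
static_assert(sizeof(((struct sockaddr_alg *)0)->salg_type) >= sizeof("skcipher"),
	      "salg_type too small for \"skcipher\"");

/*
 * Hypothetical helper: build the address that a later bind() on an
 * AF_ALG socket would use. alg_name is either a generic algorithm
 * name or a specific driver name; overlong names are truncated.
 */
static struct sockaddr_alg make_skcipher_addr(const char *alg_name)
{
	struct sockaddr_alg sa;

	memset(&sa, 0, sizeof(sa));
	sa.salg_family = AF_ALG;
	strncpy((char *)sa.salg_type, "skcipher", sizeof(sa.salg_type) - 1);
	strncpy((char *)sa.salg_name, alg_name, sizeof(sa.salg_name) - 1);
	return sa;
}
```

A real caller would follow this with socket(AF_ALG, SOCK_SEQPACKET, 0), bind(), setsockopt(ALG_SET_KEY) and accept(), which are omitted here.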

What other sorts of key usage restrictions does AES-KL support? Are any other
ones useful here?

Besides this one, there are two more bits that restrict a handle from being used for encryption and for decryption, respectively.

They are described in Section 1.1.1.1, 'Handle Restrictions', of the Key Locker specification:

https://www.intel.com/content/www/us/en/develop/download/intel-key-locker-specification.html
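As a rough illustration of how such restriction bits would combine, here is a small C model. The bit positions (bit 0 for CPL0-only, bit 1 for no-encrypt, bit 2 for no-decrypt) are my reading of the specification and should be checked against it. The macro names, the enum and kl_handle_usable() are invented for this sketch and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout of the handle restriction field; verify in the spec. */
#define KL_HANDLE_CPL0_ONLY	(1u << 0)
#define KL_HANDLE_NO_ENCRYPT	(1u << 1)
#define KL_HANDLE_NO_DECRYPT	(1u << 2)

static_assert((KL_HANDLE_CPL0_ONLY & KL_HANDLE_NO_ENCRYPT) == 0 &&
	      (KL_HANDLE_NO_ENCRYPT & KL_HANDLE_NO_DECRYPT) == 0 &&
	      (KL_HANDLE_CPL0_ONLY & KL_HANDLE_NO_DECRYPT) == 0,
	      "restriction bits must not overlap");

enum kl_op { KL_OP_ENCRYPT, KL_OP_DECRYPT };

/*
 * Software model only: returns true if a handle carrying these
 * restrictions could be used for 'op' at privilege level 'cpl'.
 * The real check is done by the hardware, not by code like this.
 */
static inline bool kl_handle_usable(uint32_t restrictions, unsigned int cpl,
				    enum kl_op op)
{
	if ((restrictions & KL_HANDLE_CPL0_ONLY) && cpl != 0)
		return false;
	if ((restrictions & KL_HANDLE_NO_ENCRYPT) && op == KL_OP_ENCRYPT)
		return false;
	if ((restrictions & KL_HANDLE_NO_DECRYPT) && op == KL_OP_DECRYPT)
		return false;
	return true;
}
```

With only the CPL0-only bit set, the userspace disk-image scenario above (cpl 3, decrypt) is rejected while the in-kernel path still works.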

Subsequently the key handle could be corrupted or fail with handle
restrictions. Then, encrypt()/decrypt() returns -EINVAL.

Aren't these scenarios actually impossible? At least without memory corruption.

Yes, I think so for the dm-crypt path. But the key handle can be tainted in the userspace -> crypto API path.

I think this may help users, since the feature performs an integrity check on the handle first and reports an error right away if the check fails.
Thus, advertise it with a unique name 'xts-aes-aeskl' in /proc/crypto while
not replacing AES-NI under the generic name 'xts(aes)' with a lower priority.

The above sentence seems to say that xts-aes-aeskl does *not* have a lower
priority than xts-aes-aesni. But actually it does.

No, that is not what it is meant to say. The latter part needs to be called out more clearly.

Then, the performance is unlikely better than 64-bit which has already a gap
vs. AES-NI.

I don't understand what this sentence is trying to say.

This comes from another section that explains why only 64-bit is supported. I added it as one more reason to avoid 32-bit code. But 32-bit kernel mode is known to be on its way to deprecation anyway, so the 128-bit register argument alone seems to be enough there.

+config AS_HAS_KEYLOCKER
+ def_bool $(as-instr,encodekey256 %eax$(comma)%eax)
+ help
+ Supported by binutils >= 2.36 and LLVM integrated assembler >= V12

It looks like arch/x86/Kconfig.assembler would be a better place for this.

Yeah, commit 5e8ebd841a44 ("x86: probe assembler capabilities via kconfig instead of makefile") moved those over there.

+
+#define IN1 %xmm8
+#define IN IN1

Why do both IN1 and IN exist? Shouldn't there just be IN?

Oh, this is a silly leftover from the CBC code, which has multiple inputs.

So, '#define IN %xmm8' and then s/IN1/IN/g.

+
+#define AREG %rax

Shouldn't %rax just be hardcoded?

I thought this renaming (like the others) helps readability. Maybe I'm missing something. Could you share your thoughts on this?

+#define HANDLEP %rdi

This should be called CTX, to match the function prototypes.

+#define UKEYP OUTP

This should be called IN_KEY, to match the function prototypes.

Okay. But, OTOH, the prototype itself is somewhat generic, so its argument names do not always match what they actually mean in the code. That is why the AES-NI code renames them, e.g.:

ctx -> KEYP
in_key -> UKEY
...

So, another option is to leave a comment there, e.g. '# ctx is renamed to KEYP'.

+
+.Lsetkey_end:
+ movdqu STATE1, (HANDLEP)
+ movdqu STATE2, 0x10(HANDLEP)
+ movdqu STATE3, 0x20(HANDLEP)

The moves to the ctx should use movdqa, since it is aligned.

According to the manual, the difference is whether #GP is raised when a memory operand is misaligned. So using MOVDQA everywhere here amounts to asking for an alignment check on every access.

But HANDLEP is known to hold an aligned address. So MOVDQU seems to be enough, and it is consistent with the approach in the glue code: avoid unnecessary sanity checks.
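To make the alignment argument concrete, here is a small userspace sketch in C. It assumes a 16-byte requirement for the context, which is what MOVDQA demands of its memory operand. The names CTX_ALIGN, is_aligned16() and ctx_align16() are hypothetical. The sketch only illustrates the idea of aligning the pointer once, up front, so that later stores need no per-access check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CTX_ALIGN 16u	/* assumed alignment; MOVDQA raises #GP below this */

static_assert((CTX_ALIGN & (CTX_ALIGN - 1)) == 0,
	      "alignment must be a power of two");

/* True if p would be acceptable as a MOVDQA memory operand. */
static inline bool is_aligned16(const void *p)
{
	return ((uintptr_t)p & (CTX_ALIGN - 1)) == 0;
}

/* Round a raw context pointer up to the next 16-byte boundary. */
static inline void *ctx_align16(void *raw)
{
	uintptr_t addr = (uintptr_t)raw;

	return (void *)((addr + CTX_ALIGN - 1) & ~(uintptr_t)(CTX_ALIGN - 1));
}
```

Once the pointer has gone through something like ctx_align16(), both MOVDQA and MOVDQU would work on it. The only difference left is whether the instruction itself re-checks alignment.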

+
+ xor AREG, AREG
+ FRAME_END
+ RET
+SYM_FUNC_END(__aeskl_setkey)

This function always returns 0, so it really should return void.

Yeah, fair enough.

In the common case (successful AES-256 encryption) this is executing 'jmp'
twice. I think the code should be rearranged to eliminate these jmps.

Ah, right. Good point! Let me rearrange this to favor the most likely cases.

__aeskl_xts_encrypt() and __aeskl_xts_decrypt() are very similar. To reduce
code duplication, can you consider generating them from a macro that takes an
argument that indicates whether it is encrypt or decrypt?

Yeah, I can see that the code that prepares the operands is common to both. But I'm not sure folding them together would make it more readable.

Something that your AES-KL code does that's a bit ugly is that it abuses
'struct crypto_aes_ctx' to store a Keylocker key handle instead
of the actual AES key schedule which the struct is supposed to be for.

The proper way to represent that would be to make the tfm context for
xts-aes-aeskl be a union of crypto_aes_ctx and a Keylocker specific context.

Agreed. I think this is likely fallout from the struct aesni_xts_ctx fix. Previously, the field was a plain byte array, which did not necessarily represent the expanded-key format. The fix made the type more specific, so the Key Locker code has to specify its own format as well.
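A minimal sketch of what such a union could look like, written as standalone C rather than kernel code. The struct and union names are hypothetical. The first struct mirrors the layout of struct crypto_aes_ctx as I understand it (60 u32 words per key schedule), and the 64-byte handle size is an assumption meant to cover an AES-256 handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AES_MAX_KEYLENGTH_U32	60	/* 15 round keys * 16 bytes / 4 */
#define KL_MAX_HANDLE_SIZE	64	/* assumed size of the largest handle */

/* Stand-in for struct crypto_aes_ctx: expanded AES key schedules. */
struct aes_ctx_sketch {
	uint32_t key_enc[AES_MAX_KEYLENGTH_U32];
	uint32_t key_dec[AES_MAX_KEYLENGTH_U32];
	uint32_t key_length;
};

/* Hypothetical Key Locker context: an opaque handle, no key schedule. */
struct kl_ctx_sketch {
	uint8_t handle[KL_MAX_HANDLE_SIZE];
	uint32_t key_length;
};

/*
 * One tfm context that can hold either representation. Note that
 * key_length sits at a different offset in each member, so the caller
 * must track which member is currently valid.
 */
union aeskl_tfm_ctx_sketch {
	struct aes_ctx_sketch aesni;	/* used when falling back to AES-NI */
	struct kl_ctx_sketch aeskl;	/* used with a Key Locker handle */
};

static_assert(sizeof(union aeskl_tfm_ctx_sketch) >= sizeof(struct aes_ctx_sketch),
	      "union must hold an expanded key");
static_assert(sizeof(union aeskl_tfm_ctx_sketch) >= sizeof(struct kl_ctx_sketch),
	      "union must hold a key handle");
```

The point of the union is only that each user names the representation it means, instead of storing a handle in fields declared as a key schedule.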

Thanks,
Chang