Re: [PATCH] crypto: gcm - fix cacheline sharing

From: Horia Geanta
Date: Thu May 30 2019 - 03:50:48 EST


On 5/30/2019 8:34 AM, Herbert Xu wrote:
> On Wed, May 29, 2019 at 01:27:28PM -0700, Eric Biggers wrote:
>>
>> So what about the other places that also pass an IV located next to the data,
>> like crypto/ccm.c and crypto/adiantum.c? If we're actually going to make this a
Fix for ccm is WIP.
We were not aware of adiantum since our crypto engine does not accelerate it.

>> new API requirement, then we need to add a debugging option that makes the API
>> detect this violation so that the other places can be fixed too.
>>
IMO this is not a new crypto API requirement.
crypto API and its users must follow DMA API rules, besides crypto-specific ones.

In this particular case, crypto/gcm.c is both an implementation and a crypto API
user, since it uses underneath ctr(aes) (and ghash).
Currently generic gcm implementation is breaking DMA API, since part of the dst
buffer (auth_tag) provided to ctr(aes) is sharing a cache line with some other
data structure (iv).

The DMA API rule is mentioned in Documentation/DMA-API.txt

.. warning::

Memory coherency operates at a granularity called the cache
line width. In order for memory mapped by this API to operate
correctly, the mapped region must begin exactly on a cache line
boundary and end exactly on one (to prevent two separately mapped
regions from sharing a single cache line).


>> Also, doing a kmalloc() per requset is inefficient and very error-prone. In
>> fact there are at least 3 bugs here: (1) not checking the return value, (2)
>> incorrectly using GFP_KERNEL when it may be atomic context, and (3) not always
For (2) I assume this means checking for CRYPTO_TFM_REQ_MAY_SLEEP flag.

>> freeing the memory. Why not use cacheline-aligned memory within the request
>> context, so that a separate kmalloc() isn't needed?
>>
If you check previous discussion referenced in the commit message:
Link:
https://lore.kernel.org/linux-crypto/20190208114459.5nixe76xmmkhur75@xxxxxxxxxxxxxxxxxxx/

or (probably easier) look at the full thread:
https://patchwork.kernel.org/patch/10789697/

you'll see that at some point I proposed changing crypto_gcm_req_priv_ctx struct
as follows:
- u8 auth_tag[16];
+ u8 auth_tag[16] ____cacheline_aligned;

Ard suggested it would be better to kmalloc the auth_tag.

I am open to changing the fix, however I don't think the problem is in the
implementation (caam driver).

>> Also, did you consider whether there's any way to make the crypto API handle
>> this automatically, so that all the individual users don't have to?
That would probably work, but I guess it would come up with a big overhead.

I am thinking crypto API would have to check each buffer used by src, dst
scatterlists is correctly aligned - starting and ending on cache line boundaries.

>
> You're absolutely right Eric.
>
> What I suggested in the old thread is non-sense. While you can
> force GCM to provide the right pointers you cannot force all the
> other crypto API users to do this.
>
Whose problem is that crypto API users don't follow the DMA API requirements?

> It would appear that Ard's latest suggestion should fix the problem
> and is the correct approach.
>
I disagree.
We shouldn't force crypto implementations to be aware of such inconsistencies in
the I/O data buffers (pointed to by src/dst scatterlists) that are supposed to
be safely DMA mapped.

Thanks,
Horia