Re: [RFC PATCH 0/4] crypto: add CRYPTO_TFM_REQ_DMA flag

From: Iuliana Prodan
Date: Thu Nov 26 2020 - 13:21:35 EST


On 11/26/2020 7:12 PM, Ard Biesheuvel wrote:
On Thu, 26 Nov 2020 at 17:00, Iuliana Prodan <iuliana.prodan@xxxxxxx> wrote:

On 11/26/2020 9:09 AM, Ard Biesheuvel wrote:
On Wed, 25 Nov 2020 at 22:39, Iuliana Prodan <iuliana.prodan@xxxxxxx> wrote:

On 11/25/2020 11:16 PM, Ard Biesheuvel wrote:
On Wed, 25 Nov 2020 at 22:14, Iuliana Prodan (OSS)
<iuliana.prodan@xxxxxxxxxxx> wrote:

From: Iuliana Prodan <iuliana.prodan@xxxxxxx>

Add the option to allocate the crypto request object, plus any extra space
needed by the driver, in DMA-able memory.

Add the CRYPTO_TFM_REQ_DMA flag, to be used by backend implementations to
indicate to the crypto API that GFP_DMA memory must be allocated
for the private contexts of crypto requests.


These are always directional DMA mappings, right? So why can't we use
bounce buffering here?

The idea was to avoid allocating any memory in crypto drivers.
We want to be able to use dm-crypt with CAAM, which needs DMA-able
memory and increasing reqsize is not enough.

But what does 'needs DMA-able memory' mean? DMA operations are
asynchronous by definition, and so the DMA layer should be able to
allocate bounce buffers when needed. This will cost some performance
in cases where the hardware cannot address all of memory directly, but
this is a consequence of the design, and I don't think we should
burden the generic API with this.

Ard, I believe you're right.

In CAAM, for req->src and req->dst, which come from the crypto request, we
use DMA mappings without knowing whether the memory is DMA-able or not.

We should do the same for CAAM's hw descriptor commands and link
tables. That's the extra memory allocated by increasing reqsize.


It depends on whether any such mappings are non-directional. But I
would not expect per-request mappings to be modifiable by both the CPU
and the device at the same time.

There are bidirectional mappings on req->src (if it's also used for output)
and on the IV (if it exists).
But these are not modified by the CPU and CAAM at the same time.


Horia, do you see any limitations in CAAM that would prevent us from
using the above approach?


It started from here:
https://lore.kernel.org/linux-crypto/71b6f739-d4a8-8b26-bf78-ce9acf9a0f99@nxp.com/T/#m39684173a2f0f4b83d8bcbec223e98169273d1e4

For IPsec use cases, the CRYPTO_TFM_REQ_DMA flag is also checked in
the esp_alloc_tmp() function for IPv4 and IPv6.

This series includes an example of how a driver can use the
CRYPTO_TFM_REQ_DMA flag while setting reqsize to a larger value,
to avoid allocating memory at crypto request runtime.
The extra size needed by the driver is added to the reqsize field,
which indicates how much memory could be needed per request.
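
As a rough illustration of the reqsize idea described above, here is a
minimal userspace sketch, not code from this series. It assumes a
simplified request layout: all demo_* names, the DEMO_REQ_DMA value and
the sizes of the descriptor and link-table arrays are hypothetical
stand-ins, and calloc() stands in for a kernel allocator that would take
GFP_DMA. The point is only that the request and the driver's private
context come from one allocation, chosen by a flag at allocation time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for CRYPTO_TFM_REQ_DMA; the real value is not shown here. */
#define DEMO_REQ_DMA 0x1u

/* Simplified request: the driver-private context trails the request itself. */
struct demo_request {
	unsigned int flags;
	int from_dma_pool;	/* 1 if the "DMA-capable" allocator was requested */
	size_t reqsize;		/* bytes of driver context that follow */
	max_align_t ctx[];	/* keeps the trailing context suitably aligned */
};

/* Hypothetical per-request driver state (descriptor commands, link table). */
struct demo_drv_ctx {
	unsigned int hw_desc[16];
	unsigned int link_tbl[8];
};

/* Stand-in allocator; a kernel version would OR GFP_DMA into its gfp flags. */
static void *demo_alloc(size_t len, int dma)
{
	(void)dma;
	return calloc(1, len);
}

/* One allocation covers the request plus reqsize bytes of driver context. */
static struct demo_request *demo_request_alloc(size_t reqsize,
						unsigned int tfm_flags)
{
	int dma = (tfm_flags & DEMO_REQ_DMA) != 0;
	struct demo_request *req = demo_alloc(sizeof(*req) + reqsize, dma);

	if (!req)
		return NULL;
	req->flags = tfm_flags;
	req->from_dma_pool = dma;
	req->reqsize = reqsize;
	return req;
}

/* Returns the driver context that lives right behind the request. */
static void *demo_request_ctx(struct demo_request *req)
{
	assert(req != NULL);
	return req->ctx;
}
```

A driver following this pattern would set reqsize to
sizeof(struct demo_drv_ctx) once, and then obtain its descriptor and
link-table storage through demo_request_ctx() on each request instead
of allocating it in the request path.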

Iuliana Prodan (4):
  crypto: add CRYPTO_TFM_REQ_DMA flag
  net: esp: check CRYPTO_TFM_REQ_DMA flag when allocating crypto request
  crypto: caam - avoid allocating memory at crypto request runtime for
    skcipher
  crypto: caam - avoid allocating memory at crypto request runtime for
    aead

drivers/crypto/caam/caamalg.c | 130 +++++++++++++++++++++++++---------
include/crypto/aead.h | 4 ++
include/crypto/akcipher.h | 21 ++++++
include/crypto/hash.h | 4 ++
include/crypto/skcipher.h | 4 ++
include/linux/crypto.h | 1 +
net/ipv4/esp4.c | 7 +-
net/ipv6/esp6.c | 7 +-
8 files changed, 144 insertions(+), 34 deletions(-)

--
2.17.1