Re: swiotlb/caamjr regression (Was: [GIT PULL] (swiotlb) stable/for-linus-5.12)

From: Horia Geantă
Date: Thu Jun 10 2021 - 10:52:16 EST


On 6/8/2021 5:35 AM, Dominique MARTINET wrote:
> I'm not able to find any individual mails for Christoph's patches so
> replying to the PR.
>
The patch set is here:
https://lore.kernel.org/linux-iommu/20210207160327.2955490-1-hch@xxxxxx

> In particular, this commit:
> Konrad Rzeszutek Wilk wrote on Fri, Feb 26, 2021 at 11:00:08AM -0500:
>> Christoph Hellwig (8):
>> swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single
>
> merged as 16fc3cef33a0, breaks caam as used today (thanks Lukas for
> bisecting it!)
>
Thanks.

I've noticed the failure also in v5.10 and v5.11 stable kernels,
since the patch set has been backported.

>
> More precisely, drivers/crypto/caam/caamalg.c does a lot of mappings
> that aren't aligned:
>
> dma_sync_single_for_device(jrdev, ctx->sh_desc_enc_dma,
> desc_bytes(desc), ctx->dir);
> dma_sync_single_for_device(jrdev, ctx->sh_desc_dec_dma,
> desc_bytes(desc), ctx->dir);
>
Right. These dma sync ops are in caaamalg.c and should be fixed.

OTOH there are other dma sync ops in caam driver - e.g. caamhash.c:
dma_sync_single_for_device(jrdev, ctx->sh_desc_update_dma,
desc_bytes(desc), ctx->dir);
where the mappings are aligned (see struct caam_hash_ctx),
but even in this case the crypto algorithms are failing.

>
> which can be caught by crypto tests with this caam enabled, for example
> adding a warning when an unaligned mapping happens I get this trace:
> --------
> [ 1628.670226] swiotlb_tbl_sync_single+0x74/0xa0
> [ 1628.674677] dma_sync_single_for_device+0xe4/0x110
> [ 1628.679472] skcipher_setkey+0xd0/0xf0
> [ 1628.683224] des3_skcipher_setkey+0x74/0xac
> [ 1628.687416] crypto_skcipher_setkey+0x54/0x110
> [ 1628.691866] crypto_authenc_setkey+0x94/0xd0
> [ 1628.696138] crypto_aead_setkey+0x34/0x10c
> [ 1628.700236] test_aead_vec_cfg+0x3a0/0x770
> [ 1628.704338] test_aead+0xac/0x130
> [ 1628.707656] alg_test_aead+0xa8/0x190
> [ 1628.711324] alg_test.part.0+0xf4/0x41c
> [ 1628.715161] alg_test+0x1c/0x60
> [ 1628.718307] do_test+0x37ec/0x4c50
> [ 1628.721709] do_test+0x4bec/0x4c50
> [ 1628.725114] tcrypt_mod_init+0x54/0xac
> [ 1628.728864] do_one_initcall+0x4c/0x1b0
> [ 1628.732701] kernel_init_freeable+0x1d0/0x234
> [ 1628.737060] kernel_init+0x10/0x114
> [ 1628.740550] ret_from_fork+0x10/0x24
> -----
>
> and the tests themselves also fail (all or at least most of them) with
> e.g.
> ------
> [ 8.454233] caam_jr 30901000.jr: 40001713: DECO: desc idx 23: Header Error. Invalid length or parity, or certain other problems.
> [ 8.465820] alg: ahash: hmac-sha256-caam final() failed with err -22 on test vector 0, cfg="init+update+final aligned buffer"
> [ 8.477149] ------------[ cut here ]------------
> [ 8.481781] alg: self-tests for hmac-sha256-caam (hmac(sha256)) failed (rc=-22)
> [ 8.481818] WARNING: CPU: 2 PID: 295 at crypto/testmgr.c:5645 alg_test.part.0+0x128/0x41c
> [ 8.497307] Modules linked in:
> [ 8.500365] CPU: 2 PID: 295 Comm: cryptomgr_test Tainted: G W 5.13.0-rc5-00002-gc98cdee6172e #23
> [ 8.510455] Hardware name: NXP i.MX8MPlus EVK board (DT)
> [ 8.515767] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
> [ 8.521778] pc : alg_test.part.0+0x128/0x41c
> [ 8.526050] lr : alg_test.part.0+0x128/0x41c
> [ 8.530324] sp : ffff80001371bd10
> [ 8.533637] x29: ffff80001371bd10 x28: 000000000000008f x27: 000000000000008f
> [ 8.540785] x26: 000000000000008f x25: 0000000000000400 x24: ffff8000111658c8
> [ 8.547930] x23: ffff0000c02aaa80 x22: 000000000001008f x21: ffff0000c02aaa00
> [ 8.555075] x20: 0000000000000085 x19: 00000000ffffffea x18: 00000000fffffffc
> [ 8.562221] x17: 0000000000000001 x16: 0000000000000003 x15: 0000000000000020
> [ 8.569365] x14: ffffffffffffffff x13: 00000000000003e7 x12: ffff80001371b9e0
> [ 8.576511] x11: ffff80001188c940 x10: ffff800011844300 x9 : ffff800011886b98
> [ 8.583658] x8 : ffff80001182eb98 x7 : ffff800011886b98 x6 : ffffffffffff0888
> [ 8.590801] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000ffffffff
> [ 8.597945] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000c1684e00
> [ 8.605093] Call trace:
> [ 8.607540] alg_test.part.0+0x128/0x41c
> [ 8.611467] alg_test+0x1c/0x60
> [ 8.614608] cryptomgr_test+0x28/0x50
> [ 8.618275] kthread+0x154/0x160
> [ 8.621511] ret_from_fork+0x10/0x24
> [ 8.625088] ---[ end trace 2d195377ee3c219e ]---
> ------
>
>
>
> Looking at it a bit further it seems to me that swiotlb_bounce() should
> either keep the offset (re-adding the line that was removed except it
> would go back in swiotlb_bounce, diff at end of mail), or the size
> should be adjusted to cover from the start of the page up until the
> original offset + size which would also probably work (not tested)
>
> That, or make unaligned mappings forbidden and warn when we see one, but
> I have no idea what other component could be doing some -- I'm not sure
> if what the caam code does it legitimate (e.g. would it be possible to
> do the mappings once at init and use them?), but the swiotlb code
> doesn't look quite right.
>
Well, it's not only about unaligned accesses.

It's also about partial syncs, e.g.
dma_handle = dma_map_single(dev, cpu_addr, size, DMA_BIDIRECTIONAL);
[...]
dma_sync_single_for_device(dev, dma_handle + offset, size - offset,
DMA_BIDIRECTIONAL);
(where dma_handle + offset should be cacheline-aligned).

Documentation/core-api/dma-api.rst explicitly allows for partial syncs:
Synchronise a single contiguous or scatter/gather mapping for the CPU
and device. With the sync_sg API, all the parameters must be the same
as those passed into the single mapping API. With the sync_single API,
you can use dma_handle and size parameters that aren't identical to
those passed into the single mapping API to do a partial sync.

AFAICS commit 16fc3cef33a0 ("swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single")
is breaking this functionality.

Thanks,
Horia