Re: [PATCH 07/10] crypto: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN

From: Saravana Kannan
Date: Fri Oct 14 2022 - 16:23:46 EST


On Fri, Oct 14, 2022 at 9:25 AM Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
>
> On Thu, Oct 13, 2022 at 11:58:22AM -0700, Saravana Kannan wrote:
> > On Thu, Oct 13, 2022 at 9:57 AM Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
> > > > If so, would there be concerns that the memory savings we get back from
> > > > reducing the memory footprint of kmalloc might be defeated by how much
> > > > memory is needed for bounce buffering?
> > >
> > > It's not necessarily about the saved memory but also about the
> > > locality of small buffer allocations: less cache and TLB pressure.
> >
> > Part of the pushback we get when we try to move some of the Android
> > ecosystem from 32-bit to 64-bit is the memory usage increase. So,
> > while the main goal might not be memory savings, it'll be good to keep
> > that in mind too. I'd definitely not want this patch series to make
> > things worse. Ideally, it'd make things better. 10MB is considered a
> > lot on some of the super low-spec'd devices.
>
> Well, we can still add the option to skip allocating from the small
> kmalloc caches if there's no swiotlb available.

This seems like a good compromise.
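
Something like the sketch below is how I'm picturing it (purely
illustrative, not a patch; kmalloc_min_align() and swiotlb_available()
are made-up names for whatever the real check ends up being):

	/*
	 * Sketch only: if there's no swiotlb to bounce misaligned DMA
	 * buffers through, keep the full ARCH_DMA_MINALIGN alignment
	 * (i.e. don't create the smaller kmalloc caches); otherwise
	 * allow the small caches.
	 */
	static unsigned int __init kmalloc_min_align(void)
	{
		if (swiotlb_available())		/* placeholder check */
			return ARCH_KMALLOC_MINALIGN;	/* e.g. 8 */
		return ARCH_DMA_MINALIGN;		/* e.g. 128 on arm64 */
	}

Devices that can't afford a swiotlb just keep today's behaviour, and
everyone else gets the smaller caches.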

> > > I wonder whether swiotlb is actually the best option for bouncing
> > > unaligned buffers. We could use something like mempool_alloc() instead
> > > if we stick to small buffers rather than any (even large) buffer that's
> > > not aligned to a cache line. Or just go for kmem_cache_alloc() directly.
> > > A downside is that we may need GFP_ATOMIC for such allocations, so
> > > higher risk of failure.
> >
> > Yeah, a temporary kmem_cache_alloc() to bounce buffers off of feels
> > like a better idea than swiotlb, especially for small allocations (say
> > 8-byte allocations) that would have gone into the kmalloc-64 cache if
> > we hadn't dropped KMALLOC_MIN_ALIGN to 8.
>
> I now remembered why this isn't trivial. On the dma_unmap_*() side, we
> can't easily tell whether the buffer was bounced and it needs freeing.
> The swiotlb solves this by checking whether the address is within the
> pre-allocated swiotlb range. With kmem_caches, we could dig into the
> slab internals and check whether the pointer is part of a slab page,
> then check which kmem_cache it was. It looks too complicated and I'm
> rather tempted to just go for swiotlb if available or prevent the
> creation of small kmalloc caches if no swiotlb (TBH, even the initial
> series was an improvement dropping KMALLOC_MIN_ALIGN from 128 to 64).

Agreed. Even just allowing a 64-byte kmalloc cache on a system with a
64-byte cache line, instead of rounding every small allocation up to
128 bytes, saves quite a bit of memory.
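
To your point about the unmap side: the reason swiotlb gets away with
it is that the bounce pool is a single contiguous, pre-allocated
region, so "was this buffer bounced?" is just a range check, roughly
like below (pool_start/pool_end stand in for the real swiotlb
bookkeeping; the actual helper is is_swiotlb_buffer()):

	/*
	 * Illustrative only: the swiotlb pool is one contiguous region,
	 * so the unmap path can detect a bounced buffer with a simple
	 * address range comparison.
	 */
	static bool buffer_was_bounced(phys_addr_t paddr)
	{
		return paddr >= pool_start && paddr < pool_end;
	}

A kmem_cache-backed bounce buffer has no such property (the allocation
could live in any slab page), so as you say we'd end up tracking every
bounced pointer separately just to know what to free in dma_unmap_*().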

-Saravana