RE: [RFC V1 1/5] swiotlb: Support allocating DMA memory from SWIOTLB

From: Michael Kelley
Date: Thu Feb 15 2024 - 15:26:58 EST


From: Alexander Graf <graf@xxxxxxxxxx> Sent: Thursday, February 15, 2024 1:44 AM
>
> On 15.02.24 04:33, Vishal Annapurve wrote:
> > On Wed, Feb 14, 2024 at 8:20 PM Kirill A. Shutemov
> <kirill@xxxxxxxxxxxxx> wrote:
> >> On Fri, Jan 12, 2024 at 05:52:47AM +0000, Vishal Annapurve wrote:
> >>> Modify the SWIOTLB framework to always allocate DMA memory from SWIOTLB.
> >>>
> >>> CVMs use SWIOTLB buffers for bouncing memory when using dma_map_* APIs
> >>> to set up memory for IO operations. SWIOTLB buffers are marked as shared
> >>> once during early boot.
> >>>
> >>> Buffers allocated using dma_alloc_* APIs are allocated from kernel memory
> >>> and then converted to shared during each API invocation. This patch ensures
> >>> that such buffers are also allocated from already shared SWIOTLB
> >>> regions. This allows enforcing alignment requirements on regions marked
> >>> as shared.
> >> But does it work in practice?
> >>
> >> Some devices (like GPUs) require a lot of DMA memory. So with this approach
> >> we would need to have a huge SWIOTLB buffer that is unused in most VMs.
> >>
> > The current implementation limits the size of the statically allocated SWIOTLB
> > memory pool to 1G. I was proposing to enable dynamic SWIOTLB for CVMs
> > in addition to aligning the memory allocations to hugepage sizes, so
> > that the SWIOTLB pool can be scaled up on demand.
> >

Vishal --

When the dynamic swiotlb mechanism tries to grow swiotlb space
by adding another pool, it gets the additional memory as a single
physically contiguous chunk using alloc_pages(). It starts by trying
to allocate a chunk equal to the original swiotlb size and, if that
fails, halves the size until an allocation succeeds. The minimum size
is 1 Mbyte; if even that allocation fails, the "grow" fails.
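
As a rough illustration of that halving retry loop (a simplified
sketch, not the actual kernel code -- swiotlb_grow_sketch() and
MIN_POOL_BYTES are made-up names, and the GFP flags are an
assumption):

#include <linux/gfp.h>
#include <linux/mm.h>

/* 1 Mbyte floor below which the "grow" attempt gives up */
#define MIN_POOL_BYTES	(1UL << 20)

static struct page *swiotlb_grow_sketch(size_t requested_bytes,
					size_t *got_bytes)
{
	/* Start with the original swiotlb size. */
	size_t bytes = requested_bytes;

	while (bytes >= MIN_POOL_BYTES) {
		unsigned int order = get_order(bytes);
		struct page *page = alloc_pages(GFP_NOWAIT | __GFP_NOWARN,
						order);

		if (page) {
			/* Got one physically contiguous chunk. */
			*got_bytes = PAGE_SIZE << order;
			return page;
		}
		/* Allocation failed: halve the request and retry. */
		bytes >>= 1;
	}
	return NULL;	/* even ~1 Mbyte couldn't be allocated */
}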

So it seems like dynamic swiotlb is subject to almost the same
memory fragmentation limitations as trying to allocate huge pages.
swiotlb needs a minimum of 1 Mbyte contiguous in order to grow,
while huge pages need 2 Mbytes, but either is likely to be
problematic in a VM that has been running a while. With that
in mind, I'm not clear on the benefit of enabling dynamic swiotlb.
It seems like it just moves around the problem of needing high order
contiguous memory allocations. Or am I missing something?

Michael

> > The issue with aligning the pool areas to hugepages is that hugepage
> > allocation at runtime is not guaranteed. Guaranteeing the hugepage
> > allocation might need calculating the upper bound in advance, which
> > defeats the purpose of enabling dynamic SWIOTLB. I am open to
> > suggestions here.
>
>
> You could allocate a max bound at boot using CMA and then only fill into
> the CMA area when SWIOTLB size requirements increase? The CMA region
> will allow movable allocations as long as you don't require the CMA space.
>
>
> Alex
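
For reference, a minimal sketch of that CMA idea, assuming the upper
bound is reserved as a named CMA area at early boot and later handed
to swiotlb pool growth (swiotlb_cma, swiotlb_cma_reserve() and
swiotlb_cma_grow() are illustrative names, not existing symbols):

#include <linux/cma.h>
#include <linux/gfp.h>
#include <linux/init.h>
#include <linux/mm.h>

static struct cma *swiotlb_cma;

/* Early boot: reserve the worst-case swiotlb size as a CMA area. */
static int __init swiotlb_cma_reserve(phys_addr_t max_bytes)
{
	/*
	 * Until swiotlb claims pages from it, the range remains usable
	 * for movable allocations, so the reservation isn't wasted.
	 */
	return cma_declare_contiguous(0, max_bytes, 0, 0, 0, false,
				      "swiotlb", &swiotlb_cma);
}

/* Growth path: back a new swiotlb pool with pages from that area. */
static struct page *swiotlb_cma_grow(size_t bytes)
{
	return cma_alloc(swiotlb_cma, bytes >> PAGE_SHIFT, 0, false);
}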