Re: [PATCH v1] ALSA: memalloc: Fix indefinite hang in non-iommu case
From: Hillf Danton
Date: Wed Feb 14 2024 - 22:45:59 EST
On Wed, 14 Feb 2024 17:07:25 -0700 Karthikeyan Ramasubramanian <kramasub@xxxxxxxxxxxx>
> Before 9d8e536 ("ALSA: memalloc: Try dma_alloc_noncontiguous() at first")
> the alsa non-contiguous allocator always called the alsa fallback
> allocator in the non-iommu case. This allocated non-contig memory
> consisting of progressively smaller contiguous chunks. Allocation was
> fast due to the OR-ing in of __GFP_NORETRY.
>
> After 9d8e536 ("ALSA: memalloc: Try dma_alloc_noncontiguous() at first")
> the code tries the dma non-contig allocator first, then falls back to
> the alsa fallback allocator. In the non-iommu case, the former supports
> only a single contiguous chunk.
>
> We have observed experimentally that under heavy memory fragmentation,
> allocating a large-ish contiguous chunk with __GFP_RETRY_MAYFAIL
> triggers an indefinite hang in the dma non-contig allocator. This has
> high impact, as an occurrence will trigger a device reboot, resulting in
> loss of user state.
>
> Fix the non-iommu path by letting dma_alloc_noncontiguous() fail quickly
> so it does not get stuck looking for that elusive large contiguous chunk,
> in which case we will fall back to the alsa fallback allocator.
The faster dma_alloc_noncontiguous() fails, the more likely it is that the
paper-over in 9d8e536d36e7 fails to work, so this is another band-aid
instead of mitigating heavy fragmentation in the first place.
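
For reference, the fallback behaviour described in the quoted commit message
(non-contig memory built from progressively smaller contiguous chunks) can be
sketched in userspace roughly as below. This is only an illustration, not the
code in sound/core/memalloc.c: sketch_try_alloc(), sketch_fallback_alloc(),
SKETCH_PAGE_SIZE and the "largest free run" fragmentation model are all
hypothetical names and assumptions made up for the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, for illustration only */

/*
 * Hypothetical stand-in for a page allocator under fragmentation: a request
 * succeeds only if it fits in the largest free contiguous run. The model
 * assumes runs of that size never run out.
 */
static int sketch_try_alloc(size_t chunk, size_t largest_free_run)
{
	return chunk <= largest_free_run;
}

/*
 * Cover 'size' bytes with contiguous chunks, halving the chunk size (rounded
 * up to a whole page) each time an attempt fails, and failing fast instead
 * of retrying the same size.
 * Returns the number of chunks used, or -1 if even one page cannot be had.
 */
static int sketch_fallback_alloc(size_t size, size_t largest_free_run)
{
	size_t chunk;
	int nchunks = 0;

	assert(size > 0 && size % SKETCH_PAGE_SIZE == 0);

	chunk = size;
	while (size > 0) {
		if (chunk > size)
			chunk = size;
		if (!sketch_try_alloc(chunk, largest_free_run)) {
			if (chunk <= SKETCH_PAGE_SIZE)
				return -1;
			/* halve, then round up to a multiple of the page size */
			chunk = (chunk / 2 + SKETCH_PAGE_SIZE - 1) /
				SKETCH_PAGE_SIZE * SKETCH_PAGE_SIZE;
			continue;
		}
		size -= chunk;
		nchunks++;
	}
	return nchunks;
}
```

With a 64-page request and a largest free run of 16 pages, the sketch fails
twice (64 and 32 pages) and then covers the buffer with four 16-page chunks.
The point of the model is that each failed attempt has to return quickly for
this scheme to be cheap.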