Re: [PATCH v1] ALSA: memalloc: Fix indefinite hang in non-iommu case

From: Takashi Iwai
Date: Fri Feb 16 2024 - 09:43:21 EST


On Fri, 16 Feb 2024 13:19:54 +0100,
Kai Vehmanen wrote:
>
> Hi,
>
> On Fri, 16 Feb 2024, Takashi Iwai wrote:
>
> > On Fri, 16 Feb 2024 09:35:32 +0100, Takashi Iwai wrote:
> > > The fact that we have to drop __GFP_RETRY_MAYFAIL indicates that the
> > > handling there doesn't suffice -- at least for the audio operation.
> >
> > Reconsidering this, I wonder whether keeping __GFP_RETRY_MAYFAIL
> > makes sense. We did have __GFP_NORETRY to avoid the OOM-killer.
> > But that was ages ago, and the memory allocation core has become
> > smart enough since then.
> >
> > The side-effect of __GFP_RETRY_MAYFAIL is that the page reclaim and
> > compaction happens even for high-order allocations, and that must be
>
> for the original problem that led to "ALSA: memalloc: use
> __GFP_RETRY_MAYFAIL for DMA mem allocs", reclaim for low-order case
> would be enough. I.e. the case was:
>
> > OTOH, a slight concern with the drop of __GFP_RETRY_MAYFAIL is whether
> > allowing OOM-killer for low order allocations is acceptable or not.
> >
> > There are two patterns of calling allocators:
> [..]
> > 3. SNDRV_DMA_TYPE_NONCONTIG for large size:
> > this is called often, once per stream open, since the driver
> > doesn't keep the buffer.
>
> So with SOF we have an additional case where we do an allocation for the
> DSP firmware (snd_dma_alloc_pages(SNDRV_DMA_TYPE_DEV_SG, ...)) and this is
> called at system resume. With s/__GFP_RETRY_MAYFAIL/__GFP_NORETRY/, these
> allocations failed (on an IOMMU-enabled Chromebook) at system resume in a
> case where the system was not really running out of memory (reclaim was
> possible). A failed allocation means there's no audio in the system after
> resume, so we want to try harder.
>
> But yeah, I think the proposed handling for category (3) would work. If
> needed, we can further specialize the DSP firmware case with some hint
> to snd_dma_alloc_pages().

OK, then how about something like the patch below?

This changes:
- Go back to __GFP_NORETRY as the default
- Use __GFP_RETRY_MAYFAIL for SNDRV_DMA_TYPE_NONCONTIG with IOMMU;
  this should cover the case from commit a61c7d88d38c
- Also use __GFP_RETRY_MAYFAIL for the SG-fallback allocations of the
  minimal order, just like the IOMMU allocator does.

This should be less destructive, while still allowing more aggressive
allocations for SG buffers.
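
To make the resulting policy easier to check, here is a minimal userspace sketch of the flag selection described above. It is only a model, not kernel code: the MODEL_* constants, their bit values, the assumed 4096-byte page size and both helper functions are hypothetical names made up for illustration. The first helper mirrors the size check added to do_alloc_pages(), the second mirrors the get_dma_ops() check added to snd_dma_noncontig_alloc(), with the has_dma_ops argument standing in for "the device has DMA ops, i.e. an IOMMU".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel GFP bits; the values are illustrative only. */
#define MODEL_GFP_KERNEL		0x1u
#define MODEL_GFP_NORETRY		0x2u
#define MODEL_GFP_RETRY_MAYFAIL		0x4u
#define MODEL_GFP_NOWARN		0x8u

/* Assumed page size for this model; the kernel uses PAGE_SIZE. */
#define MODEL_PAGE_SIZE			4096u

/* Default after the patch: give up early, no warning on failure. */
#define MODEL_DEFAULT_GFP \
	(MODEL_GFP_KERNEL | MODEL_GFP_NORETRY | MODEL_GFP_NOWARN)

/* Try harder (reclaim and compaction), but still allow failure. */
#define MODEL_DEFAULT_GFP_RETRY \
	(MODEL_GFP_KERNEL | MODEL_GFP_RETRY_MAYFAIL | MODEL_GFP_NOWARN)

/* Models the do_alloc_pages() hunk: retry only for chunks up to one page. */
static unsigned int model_sg_fallback_gfp(size_t size)
{
	if (size <= MODEL_PAGE_SIZE)
		return MODEL_DEFAULT_GFP_RETRY;
	return MODEL_DEFAULT_GFP;
}

/* Models the snd_dma_noncontig_alloc() hunk: retry only when DMA ops exist. */
static unsigned int model_noncontig_gfp(bool has_dma_ops)
{
	if (has_dma_ops)
		return MODEL_DEFAULT_GFP_RETRY;
	return MODEL_DEFAULT_GFP;
}
```

In this model, a single-page SG-fallback chunk and a NONCONTIG allocation on a device with DMA ops both get the RETRY_MAYFAIL variant, while larger fallback chunks and NONCONTIG allocations without DMA ops stay on NORETRY.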


thanks,

Takashi

--- a/sound/core/memalloc.c
+++ b/sound/core/memalloc.c
@@ -21,9 +21,13 @@

 #define DEFAULT_GFP \
 	(GFP_KERNEL | \
-	 __GFP_RETRY_MAYFAIL | /* don't trigger OOM-killer */ \
+	 __GFP_NORETRY | /* don't trigger OOM-killer */ \
 	 __GFP_NOWARN)   /* no stack trace print - this call is non-critical */

+/* GFP flags to be used for low order pages, allowing reclaim and compaction */
+#define DEFAULT_GFP_RETRY \
+	(GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
+
 static const struct snd_malloc_ops *snd_dma_get_ops(struct snd_dma_buffer *dmab);

 #ifdef CONFIG_SND_DMA_SGBUF
@@ -281,7 +285,11 @@ static void *do_alloc_pages(struct device *dev, size_t size, dma_addr_t *addr,
 			    bool wc)
 {
 	void *p;
-	gfp_t gfp = GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN;
+	gfp_t gfp = DEFAULT_GFP;
+
+	/* allow reclaim and compaction for low order pages */
+	if (size <= PAGE_SIZE)
+		gfp = DEFAULT_GFP_RETRY;

 again:
 	p = alloc_pages_exact(size, gfp);
@@ -539,14 +547,18 @@ static const struct snd_malloc_ops snd_dma_wc_ops = {
 static void *snd_dma_noncontig_alloc(struct snd_dma_buffer *dmab, size_t size)
 {
 	struct sg_table *sgt;
+	gfp_t gfp = DEFAULT_GFP;
 	void *p;

 #ifdef CONFIG_SND_DMA_SGBUF
 	if (cpu_feature_enabled(X86_FEATURE_XENPV))
 		return snd_dma_sg_fallback_alloc(dmab, size);
+	/* with IOMMU, it's safe to pass __GFP_RETRY_MAYFAIL with high order */
+	if (get_dma_ops(dmab->dev.dev))
+		gfp = DEFAULT_GFP_RETRY;
 #endif
 	sgt = dma_alloc_noncontiguous(dmab->dev.dev, size, dmab->dev.dir,
-				      DEFAULT_GFP, 0);
+				      gfp, 0);
 #ifdef CONFIG_SND_DMA_SGBUF
 	if (!sgt && !get_dma_ops(dmab->dev.dev))
 		return snd_dma_sg_fallback_alloc(dmab, size);