Re: [PATCH v1] ALSA: memalloc: Fix indefinite hang in non-iommu case

From: Takashi Iwai
Date: Thu Feb 15 2024 - 12:13:57 EST


On Thu, 15 Feb 2024 17:07:38 +0100,
Sven van Ashbrook wrote:
>
> Hi Takashi,
>
> On Thu, Feb 15, 2024 at 3:40 AM Takashi Iwai <tiwai@xxxxxxx> wrote:
> >
> > Yes, the main problem is the indefinite hang from
> > dma_alloc_noncontiguous().
>
> We have a publicly-visible test [1] which readily triggers the
> indefinite hang on non-iommu Intel SoCs such as JasperLake.
> As noted in the commit message, iommu SoCs are not currently
> affected.
>
> > So, is the behavior more or less same even if you pass
> > __GFP_RETRY_MAYFAIL to dma_alloc_noncontiguous()? Or is this flag
> already implicitly set somewhere in the middle? It shouldn't hang
> indefinitely, but other impacts on the system, like OOM-killer
> kickoff, may be seen.
>
> My incomplete understanding:
>
> ALSA specifies __GFP_RETRY_MAYFAIL because it wants to avoid triggering
> the OOM killer.

Ah, I forgot that we set that bit commonly in the default GFP flags.

> This was __GFP_NORETRY in the not-too-distant past [2],
> but that failed too quickly, which resulted in permanent loss of audio due
> to failed firmware dma sg allocations.

Hm, the change in commit a61c7d88d38c assumed that __GFP_RETRY_MAYFAIL
wouldn't have that big an impact. If the hang really happens with
high-order allocations, it's dangerous to use the flag in the other
allocation paths, too, not only the noncontiguous case (i.e.
SNDRV_DMA_TYPE_DEV and co). In a tight memory situation, a similar
problem could then be triggered quite easily.

> In the iommu case, dma_alloc_noncontiguous() implements a backoff [3] loop
> which ORs in __GFP_NORETRY except for minimum order allocations. We observe
> experimentally that __GFP_RETRY_MAYFAIL does not get "stuck" on minimum order
> allocations. So the iommu case is not affected.
>
> In the non-iommu case however, dma_alloc_noncontiguous() actually becomes a
> contiguous allocator, with no backoff loop. The commit introducing it [4]
> states "This API is only properly implemented for dma-iommu and will simply
> return a contigious chunk as a fallback." In this case we observe the indefinite
> hang.
>
> The ALSA fallback allocator is also not affected by the problem, as it does
> not specify __GFP_RETRY_MAYFAIL, except in the XENPV case.

So it sounds like we should go back to __GFP_NORETRY in general for
non-zero-order allocations, not only in the call you changed, since
__GFP_RETRY_MAYFAIL doesn't guarantee avoiding the hang.

How about a change like the one below?

And, Kai, could you verify that this change against the recent kernel
code doesn't introduce significant regressions for the issues you tried
to address with commit a61c7d88d38c?


thanks,

Takashi

--- a/sound/core/memalloc.c
+++ b/sound/core/memalloc.c
@@ -19,17 +19,22 @@
 #include <sound/memalloc.h>
 #include "memalloc_local.h"
 
-#define DEFAULT_GFP \
-	(GFP_KERNEL | \
-	 __GFP_RETRY_MAYFAIL | /* don't trigger OOM-killer */ \
-	 __GFP_NOWARN) /* no stack trace print - this call is non-critical */
-
 static const struct snd_malloc_ops *snd_dma_get_ops(struct snd_dma_buffer *dmab);
 
 #ifdef CONFIG_SND_DMA_SGBUF
 static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size);
 #endif
 
+/* default GFP bits for our allocations */
+static gfp_t default_gfp(size_t size)
+{
+	/* don't allocate intensively for high-order pages */
+	if (size > PAGE_SIZE)
+		return GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY;
+	else
+		return GFP_KERNEL | __GFP_NOWARN | __GFP_RETRY_MAYFAIL;
+}
+
 static void *__snd_dma_alloc_pages(struct snd_dma_buffer *dmab, size_t size)
 {
 	const struct snd_malloc_ops *ops = snd_dma_get_ops(dmab);
@@ -281,7 +286,7 @@ static void *do_alloc_pages(struct device *dev, size_t size, dma_addr_t *addr,
 			    bool wc)
 {
 	void *p;
-	gfp_t gfp = GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN;
+	gfp_t gfp = default_gfp(size);
 
  again:
 	p = alloc_pages_exact(size, gfp);
@@ -466,7 +471,7 @@ static const struct snd_malloc_ops snd_dma_iram_ops = {
  */
 static void *snd_dma_dev_alloc(struct snd_dma_buffer *dmab, size_t size)
 {
-	return dma_alloc_coherent(dmab->dev.dev, size, &dmab->addr, DEFAULT_GFP);
+	return dma_alloc_coherent(dmab->dev.dev, size, &dmab->addr, default_gfp(size));
 }
 
 static void snd_dma_dev_free(struct snd_dma_buffer *dmab)
@@ -511,7 +516,7 @@ static int snd_dma_wc_mmap(struct snd_dma_buffer *dmab,
 #else
 static void *snd_dma_wc_alloc(struct snd_dma_buffer *dmab, size_t size)
 {
-	return dma_alloc_wc(dmab->dev.dev, size, &dmab->addr, DEFAULT_GFP);
+	return dma_alloc_wc(dmab->dev.dev, size, &dmab->addr, default_gfp(size));
 }
 
 static void snd_dma_wc_free(struct snd_dma_buffer *dmab)
@@ -546,7 +551,7 @@ static void *snd_dma_noncontig_alloc(struct snd_dma_buffer *dmab, size_t size)
 		return snd_dma_sg_fallback_alloc(dmab, size);
 #endif
 	sgt = dma_alloc_noncontiguous(dmab->dev.dev, size, dmab->dev.dir,
-				      DEFAULT_GFP, 0);
+				      default_gfp(size), 0);
 #ifdef CONFIG_SND_DMA_SGBUF
 	if (!sgt && !get_dma_ops(dmab->dev.dev))
 		return snd_dma_sg_fallback_alloc(dmab, size);
@@ -783,7 +788,7 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 	while (size > 0) {
 		chunk = min(size, chunk);
 		if (sgbuf->use_dma_alloc_coherent)
-			p = dma_alloc_coherent(dmab->dev.dev, chunk, &addr, DEFAULT_GFP);
+			p = dma_alloc_coherent(dmab->dev.dev, chunk, &addr, default_gfp(chunk));
 		else
 			p = do_alloc_pages(dmab->dev.dev, chunk, &addr, false);
 		if (!p) {
@@ -871,7 +876,7 @@ static void *snd_dma_noncoherent_alloc(struct snd_dma_buffer *dmab, size_t size)
 	void *p;
 
 	p = dma_alloc_noncoherent(dmab->dev.dev, size, &dmab->addr,
-				  dmab->dev.dir, DEFAULT_GFP);
+				  dmab->dev.dir, default_gfp(size));
 	if (p)
 		dmab->dev.need_sync = dma_need_sync(dmab->dev.dev, dmab->addr);
 	return p;