Re: [PATCH] mm/vmalloc: do not output a spurious warning when huge vmalloc() fails

From: Lorenzo Stoakes
Date: Tue Jun 06 2023 - 03:42:47 EST


On Tue, Jun 06, 2023 at 09:13:24AM +0200, Vlastimil Babka wrote:
>
> On 6/5/23 22:11, Lorenzo Stoakes wrote:
> > In __vmalloc_area_node() we always warn_alloc() when an allocation
> > performed by vm_area_alloc_pages() fails unless it was due to a pending
> > fatal signal.
> >
> > However, huge page allocations instigated either by vmalloc_huge() or
> > __vmalloc_node_range() (or a caller that invokes this like kvmalloc() or
> > kvmalloc_node()) always fall back to order-0 allocations if the huge page
> > allocation fails.
> >
> > This renders the warning useless and noisy, especially as all callers
> > appear to be aware that this may fall back. This has already resulted in at
> > least one bug report from a user who was confused by this (see link).
> >
> > Therefore, simply update the code to only output this warning for order-0
> > pages when no fatal signal is pending.
> >
> > Link: https://bugzilla.suse.com/show_bug.cgi?id=1211410
> > Signed-off-by: Lorenzo Stoakes <lstoakes@xxxxxxxxx>
>
> I think there are more reports of the same thing from the btrfs context,
> which appear to be a 6.3 regression
>
> https://bugzilla.kernel.org/show_bug.cgi?id=217466
> Link: https://lore.kernel.org/all/efa04d56-cd7f-6620-bca7-1df89f49bf4b@xxxxxxxxx/
>
> If this indeed helps, it would make sense to Cc: stable here. Although I
> don't see what caused the regression, the warning itself is not new, so is
> it a new source of order-9 attempts in vmalloc() or new reasons why order-9
> pages would not be possible to allocate?

Linus updated kvmalloc() to use huge vmalloc() allocations in 9becb6889130
("kvmalloc: use vmalloc_huge for vmalloc allocations") and Song updated
alloc_large_system_hash() to do so as well in f2edd118d02d ("page_alloc: use
vmalloc_huge for large system hash"). Both are ~1y old, however, and would
have taken effect from ~5.18, so it's weird to see reports citing 6.2 -> 6.3.

Will dig to see if something else changed that would increase the
prevalence of this.

Also while we're here, ugh at us immediately splitting the non-compound
(also ugh) huge page. Nicholas explains why in the patch that introduces it
- 3b8000ae185c ("mm/vmalloc: huge vmalloc backing pages should be split
rather than compound") - but it'd be nice if we could find a way to avoid
this.

If only there were a data type (perhaps beginning with 'f') that abstracted
the order of the page entirely and could be guaranteed to always be the one
with which you manipulated ref count, etc... ;)

>
> > ---
> > mm/vmalloc.c | 17 +++++++++++++----
> > 1 file changed, 13 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index ab606a80f475..e563f40ad379 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -3149,11 +3149,20 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> >  	 * allocation request, free them via vfree() if any.
> >  	 */
> >  	if (area->nr_pages != nr_small_pages) {
> > -		/* vm_area_alloc_pages() can also fail due to a fatal signal */
> > -		if (!fatal_signal_pending(current))
> > +		/*
> > +		 * vm_area_alloc_pages() can fail due to insufficient memory but
> > +		 * also:-
> > +		 *
> > +		 * - a pending fatal signal
> > +		 * - insufficient huge page-order pages
> > +		 *
> > +		 * Since we always retry allocations at order-0 in the huge page
> > +		 * case a warning for either is spurious.
> > +		 */
> > +		if (!fatal_signal_pending(current) && page_order == 0)
> >  			warn_alloc(gfp_mask, NULL,
> > -				"vmalloc error: size %lu, page order %u, failed to allocate pages",
> > -				area->nr_pages * PAGE_SIZE, page_order);
> > +				"vmalloc error: size %lu, failed to allocate pages",
> > +				area->nr_pages * PAGE_SIZE);
> >  		goto fail;
> >  	}
> >
>
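
For anyone skimming, the intent of the hunk above can be modelled in
userspace roughly as follows. This is only a sketch under stated
assumptions, not kernel code: struct model_allocator and every model_*()
function are hypothetical names made up for illustration, and the real
retry at order-0 lives in __vmalloc_node_range() rather than in a wrapper
like this. The request size is assumed to be a multiple of the huge block
size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_HUGE_ORDER 9u	/* assumption: order-9 huge pages, as in the thread */

/* Hypothetical stand-in for allocator state; not a kernel type. */
struct model_allocator {
	unsigned int free_huge_blocks;	/* order-9 blocks available */
	unsigned int free_small_pages;	/* order-0 pages available */
	unsigned int warnings;		/* how many times we "warned" */
};

/* Take up to nr_pages base pages at the given order; return pages obtained. */
unsigned int model_alloc_pages(struct model_allocator *a, unsigned int order,
			       unsigned int nr_pages)
{
	unsigned int per_block = 1u << order;
	unsigned int got = 0;

	if (order == 0) {
		got = nr_pages < a->free_small_pages ? nr_pages : a->free_small_pages;
		a->free_small_pages -= got;
		return got;
	}
	while (got + per_block <= nr_pages && a->free_huge_blocks > 0) {
		a->free_huge_blocks--;
		got += per_block;
	}
	return got;
}

/* Give back pages from a partially satisfied request. */
void model_free_pages(struct model_allocator *a, unsigned int order,
		      unsigned int nr_pages)
{
	if (order == 0)
		a->free_small_pages += nr_pages;
	else
		a->free_huge_blocks += nr_pages >> order;
}

/* One attempt at a fixed order; warn only for order-0 with no fatal signal. */
bool model_vmalloc_area(struct model_allocator *a, unsigned int order,
			unsigned int nr_small_pages, bool fatal_signal)
{
	unsigned int got;

	assert(nr_small_pages > 0);
	got = model_alloc_pages(a, order, nr_small_pages);
	if (got != nr_small_pages) {
		model_free_pages(a, order, got);
		if (!fatal_signal && order == 0) {
			a->warnings++;
			fprintf(stderr, "model: size %u pages, failed to allocate pages\n",
				nr_small_pages);
		}
		return false;
	}
	return true;
}

/* Huge attempt first, then the order-0 retry the warning should defer to. */
bool model_vmalloc_huge(struct model_allocator *a, unsigned int nr_small_pages,
			bool fatal_signal)
{
	if (model_vmalloc_area(a, MODEL_HUGE_ORDER, nr_small_pages, fatal_signal))
		return true;
	return model_vmalloc_area(a, 0, nr_small_pages, fatal_signal);
}
```

In this model a failed order-9 attempt that is then satisfied at order-0
stays silent, and a warning is only counted when the order-0 attempt itself
fails with no fatal signal pending.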