Re: [PATCH] mm: page_alloc: Assume huge tail pages are valid when allocating contiguous pages

From: Mike Kravetz
Date: Fri Apr 14 2023 - 15:07:02 EST


Thanks Mel!
Apologies for not noticing when the bug was posted to the list. Otherwise,
I would have jumped on it.

On 04/14/23 09:22, Mel Gorman wrote:
> A bug was reported by Yuanxi Liu where allocating 1G pages at runtime is
> taking an excessive amount of time for large amounts of memory. Further
> testing of huge page allocation showed that the cost is linear, i.e. if
> allocating
> 1G pages in batches of 10 then the time to allocate nr_hugepages from
> 10->20->30->etc increases linearly even though 10 pages are allocated at
> each step. Profiles indicated that much of the time is spent checking the
> validity within already existing huge pages and then attempting a migration
> that fails after isolating the range, draining pages and a whole lot of
> other useless work.
>
> Commit eb14d4eefdc4 ("mm,page_alloc: drop unnecessary checks from
> pfn_range_valid_contig") removed two checks, one which ignored huge pages
> for contiguous allocations as huge pages can migrate. While there may be
> value on migrating a 2M page to satisfy a 1G allocation, it's pointless
> to move a 1G page for a new 1G allocation or scan for validity within an
> existing huge page.

eb14d4eefdc4 was the last patch in Oscar's series "Make alloc_contig_range
handle Hugetlb pages".
https://lore.kernel.org/linux-mm/20210419075413.1064-1-osalvador@xxxxxxx/

It seems bailing out of alloc_contig_range whenever it encountered hugetlb
pages was a real issue, as the cover letter says:

" This can be problematic for some users, e.g: CMA and virtio-mem, where those
users will fail the call if alloc_contig_range ever sees a HugeTLB page, even
when those pages lay in ZONE_MOVABLE and are free.
That problem can be easily solved by replacing the page in the free hugepage
pool."

> Reintroduce the PageHuge check with some limitations. The new check
> will allow an attempt to migrate a 2M page for a 1G allocation but
> contiguous requests within CMA regions will always attempt the migration.
> The function is renamed as pfn_range_valid_contig can be easily confused
> with a pfn_valid() check which is not what the function does.

Michal suggested that we reintroduce the simple PageHuge check and fail
alloc_contig_range if any hugetlb page is encountered. However, that would
certainly reintroduce the issue Oscar was trying to address.
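
For reference, the check that eb14d4eefdc4 removed from
pfn_range_valid_contig looked roughly like this (a sketch from memory, not
a quote of the old tree):

	/* Bail on any hugetlb page, free or in use. */
	if (PageHuge(page))
		return false;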

Do note that Oscar's series did actually try to address this issue. In
isolate_or_dissolve_huge_page() there is this check:

	/*
	 * Fence off gigantic pages as there is a cyclic dependency between
	 * alloc_contig_range and them. Return -ENOMEM as this has the effect
	 * of bailing out right away without further retrying.
	 */
	if (hstate_is_gigantic(h))
		return -ENOMEM;

So we actually bail when we get here, but by then we are well down the call
chain:
alloc_contig_pages() -> alloc_contig_range() -> __alloc_contig_migrate_range()
-> isolate_migratepages_range() -> isolate_migratepages_block() ->
isolate_or_dissolve_huge_page()

The overhead of getting all the way to that bail-out in
isolate_or_dissolve_huge_page() is excessive.

I suggest reintroducing the PageHuge check on the condition that it still
allows migration of non-gigantic pages. That will at least preserve the fix
for the issue Oscar saw. I agree with others that the CMA check is not
needed.

> + if (PageHuge(page) && compound_nr(page) >= nr_pages &&

I 'think' some architectures have more than one gigantic huge page size, so
it seems this should really be a gigantic page check. Otherwise we will
still bail later in isolate_or_dissolve_huge_page as we do today. However,
we cannot use the hugetlb interfaces to check for a gigantic page as they
require the page to be 'stable' and we hold no locks. So, it would need to
be open coded, something like:

	if (PageHuge(page) && compound_order(page) > MAX_ORDER)
		/* after MAX_ORDER update in linux-next */
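
To make that concrete, here is a rough sketch of how the open coded test
could sit in the scan loop of pfn_range_valid_contig (or whatever the
function is renamed to). This is only an illustration against my reading of
the current loop, not a tested patch:

	for (i = start_pfn; i < end_pfn; i++) {
		page = pfn_to_page(i);

		if (PageReserved(page))
			return false;

		/*
		 * Bail on gigantic pages up front rather than finding them
		 * in isolate_or_dissolve_huge_page(). Open coded because
		 * the page is not stable and no locks are held, so the
		 * hugetlb interfaces cannot be used here.
		 */
		if (PageHuge(page) && compound_order(page) > MAX_ORDER)
			return false;
	}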

--
Mike Kravetz