Re: [v3 4/4] mm: hugetlb: Skip initialization of gigantic tail struct pages if freed by HVO

From: Mike Kravetz
Date: Mon Aug 28 2023 - 17:05:31 EST


On 08/28/23 19:33, Muchun Song wrote:
>
>
> > On Aug 25, 2023, at 19:18, Usama Arif <usama.arif@xxxxxxxxxxxxx> wrote:
> >
> > The new boot flow when it comes to initialization of gigantic pages
> > is as follows:
> > - At boot time, for a gigantic page during __alloc_bootmem_hugepage,
> > the region after the first struct page is marked as noinit.
> > - This results in only the first struct page being
> > initialized in reserve_bootmem_region. As the tail struct pages are
> > not initialized at this point, there can be a significant saving
> > in boot time if HVO succeeds later on.
> > - Later on in the boot, HVO is attempted. If it is successful, only the first
> > HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page) - 1 tail struct pages
> > after the head struct page are initialized. If it is not successful,
> > then all of the tail struct pages are initialized.
> >
> > Signed-off-by: Usama Arif <usama.arif@xxxxxxxxxxxxx>
>
> This version is simpler than ever before, thanks for your work.
>
> Your optimization relies on the premise that other subsystems do not
> access the vmemmap pages associated with HugeTLB pages allocated from
> bootmem before those vmemmap pages are initialized. However, IIUC, the
> compaction path could access an arbitrary struct page when memory fails
> to be allocated via the buddy allocator. So we should make sure that
> those struct pages are not referenced in this routine. And I know that
> if CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, it will encounter
> the same issue, but I can't find any code that prevents this from
> happening. I need more time to confirm this; if someone already knows,
> please let me know, thanks. So I think HugeTLB should adopt a similar
> way to prevent this.
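
For a sense of scale, the formula in the quoted description can be evaluated
with a small standalone sketch. The values below are assumptions, not taken
from the patch: 4 KiB base pages, a 64-byte struct page, and a reserve size
of one base page. All SKETCH_* names and helper functions are hypothetical.

```c
#include <assert.h>

/* Assumed values for illustration only; real values depend on arch/config. */
#define SKETCH_PAGE_SIZE            4096UL
#define SKETCH_STRUCT_PAGE_SIZE     64UL
/* Assumption: the vmemmap reserve size is one base page. */
#define SKETCH_VMEMMAP_RESERVE_SIZE SKETCH_PAGE_SIZE

/* Tail struct pages initialized when HVO succeeds:
 * RESERVE_SIZE / sizeof(struct page) - 1, as in the quoted description. */
static unsigned long tails_initialized_with_hvo(void)
{
	return SKETCH_VMEMMAP_RESERVE_SIZE / SKETCH_STRUCT_PAGE_SIZE - 1;
}

/* All tail struct pages of a gigantic page of hugepage_size bytes. */
static unsigned long tails_total(unsigned long hugepage_size)
{
	assert(hugepage_size % SKETCH_PAGE_SIZE == 0);
	return hugepage_size / SKETCH_PAGE_SIZE - 1;
}

/* Tail struct pages whose initialization is skipped when HVO succeeds. */
static unsigned long tails_skipped_with_hvo(unsigned long hugepage_size)
{
	return tails_total(hugepage_size) - tails_initialized_with_hvo();
}
```

Under these assumptions, a 1 GiB page has 262143 tail struct pages, of which
only 63 are initialized after a successful HVO.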

In this patch, the call to hugetlb_vmemmap_optimize() is moved BEFORE
__prep_new_hugetlb_folio or prep_new_hugetlb_folio in all code paths.
The prep_new_hugetlb_folio routine(s) are what set the destructor (soon
to be a flag) that identifies the set of pages as a hugetlb page. So,
there is now a window where a set of pages not identified as hugetlb
will not have vmemmap pages.
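
To make the ordering concern concrete, here is a toy model of the window,
not kernel code. The struct and the sketch_* helpers are hypothetical
stand-ins for the folio, the prep step, and the vmemmap optimization; they
only track the two states that matter for this discussion.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a folio; not the kernel's struct folio. */
struct sketch_folio {
	bool marked_hugetlb;    /* set by the prep step (destructor/flag) */
	bool vmemmap_optimized; /* tail vmemmap pages freed by HVO */
};

/* The problematic state: vmemmap already freed, but the pages are
 * not (yet) identifiable as a hugetlb page. */
static bool in_window(const struct sketch_folio *f)
{
	return f->vmemmap_optimized && !f->marked_hugetlb;
}

static void sketch_prep(struct sketch_folio *f)
{
	f->marked_hugetlb = true;
}

static void sketch_optimize(struct sketch_folio *f)
{
	f->vmemmap_optimized = true;
}

/* Returns true if the problematic state is observable between or
 * after the two steps for the given ordering. */
static bool window_opens(bool optimize_first)
{
	struct sketch_folio f = { false, false };
	bool seen;

	if (optimize_first) {
		sketch_optimize(&f);
		seen = in_window(&f); /* observer runs between the steps */
		sketch_prep(&f);
	} else {
		sketch_prep(&f);
		seen = in_window(&f);
		sketch_optimize(&f);
	}
	return seen || in_window(&f);
}
```

In this model, window_opens(true) reports the window and window_opens(false)
does not, which matches the ordering issue described above.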

Recently, I closed the same window in the hugetlb freeing code paths with
commit 32c877191e02 'hugetlb: do not clear hugetlb dtor until allocating'.
This patch needs to be reworked so that this window is not opened in the
allocation paths.
--
Mike Kravetz