Re: [External] Re: [v2 1/6] mm: hugetlb: Skip prep of tail pages when HVO is enabled

From: Usama Arif
Date: Wed Aug 02 2023 - 06:07:21 EST

On 01/08/2023 03:04, Muchun Song wrote:

On 2023/7/30 23:16, Usama Arif wrote:
When vmemmap is optimizable, it will free all the duplicated tail
pages in hugetlb_vmemmap_optimize while preparing the new hugepage.
Hence, there is no need to prepare them.

For 1G x86 hugepages, it avoids preparing
262144 - 64 = 262080 struct pages per hugepage.
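
The figure above can be reproduced with a small user-space sketch. It is
not kernel code. It assumes 4 KiB base pages, a 64-byte struct page, and
that HUGETLB_VMEMMAP_RESERVE_SIZE equals one base page; the macro and
helper names below are hypothetical and exist only for this illustration.

```c
#include <stdio.h>

/* Assumed x86-64 values; none of these constants are taken from the patch. */
#define BASE_PAGE_SIZE        4096UL          /* 4 KiB base page (assumed) */
#define STRUCT_PAGE_SIZE      64UL            /* sizeof(struct page) (assumed) */
#define GIGANTIC_PAGE_SIZE    (1UL << 30)     /* 1 GiB hugepage */
#define VMEMMAP_RESERVE_SIZE  BASE_PAGE_SIZE  /* assumed: one vmemmap page kept */

/* Number of struct pages that describe one 1 GiB hugepage. */
static unsigned long total_struct_pages(void)
{
    return GIGANTIC_PAGE_SIZE / BASE_PAGE_SIZE;
}

/* Number of struct pages that fit in the vmemmap area kept after optimization. */
static unsigned long kept_struct_pages(void)
{
    return VMEMMAP_RESERVE_SIZE / STRUCT_PAGE_SIZE;
}

/* Number of struct pages whose preparation would be skipped. */
static unsigned long skipped_struct_pages(void)
{
    return total_struct_pages() - kept_struct_pages();
}

/* Print the three counts for inspection. */
static void print_counts(void)
{
    printf("total:   %lu\n", total_struct_pages());
    printf("kept:    %lu\n", kept_struct_pages());
    printf("skipped: %lu\n", skipped_struct_pages());
}
```

Under these assumptions, the kept count is also the value that the
nr_pages assignment in the hunk below would produce for a 1G hugepage.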

The indirection of using __prep_compound_gigantic_folio is also removed,
as it just creates extra functions to indicate demote which can be done
with the argument.

Signed-off-by: Usama Arif <usama.arif@xxxxxxxxxxxxx>
  mm/hugetlb.c         | 32 ++++++++++++++------------------
  mm/hugetlb_vmemmap.c |  2 +-
  mm/hugetlb_vmemmap.h | 15 +++++++++++----
  3 files changed, 26 insertions(+), 23 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 64a3239b6407..541c07b6d60f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1942,14 +1942,23 @@ static void prep_new_hugetlb_folio(struct hstate *h, struct folio *folio, int ni
-static bool __prep_compound_gigantic_folio(struct folio *folio,
-                    unsigned int order, bool demote)
+static bool prep_compound_gigantic_folio(struct folio *folio, struct hstate *h, bool demote)
      int i, j;
+    int order = huge_page_order(h);
      int nr_pages = 1 << order;
      struct page *p;
+    /*
+     * No need to prep pages that will be freed later by hugetlb_vmemmap_optimize.
+     * Hence, reduce nr_pages to the pages that will be kept.
+     */
+            vmemmap_should_optimize(h, &folio->page))
+        nr_pages = HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page);

We need to initialize the refcount of the tail pages to zero (see the big
comment below in this function). Consider the case where someone (maybe
GUP) gets a ref on the tail pages while the vmemmap is being optimized:
what prevents this from happening?


Thanks for pointing this out. To solve this, I will limit the optimization
to boot time in the next version.