Re: [PATCH RFC 07/12] mm/gup: Refactor record_subpages() to find 1st small page

From: Matthew Wilcox
Date: Thu Nov 16 2023 - 09:52:02 EST


On Wed, Nov 15, 2023 at 08:29:03PM -0500, Peter Xu wrote:
> All the fast-gup functions operate on a tail page, so they always need to
> do page mask calculations before feeding that page into record_subpages().
>
> Merge that logic into record_subpages(), so that we always take a head
> page, and leave the rest of the calculation to record_subpages().

This is a bit fragile. You're assuming that pmd_page() always returns
a head page, and that's only true today because I looked at the work
required vs the reward and decided to cap the large folio size at PMD
size. If we allowed 2*PMD_SIZE (e.g. 4MB on x86), pmd_page() would not
return a head page. There is a small amount of demand for > PMD size
large folio support, so I suspect we will want to do this eventually.
I'm not particularly trying to do these conversions, but it would be
good to not add more assumptions that pmd_page() returns a head page.
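
To make the concern concrete, here is a rough userspace model of the
arithmetic. It is a sketch, not kernel code: the MODEL_* constants and
model_* helpers are made up for illustration, the shifts are the x86-64
values, and it assumes the folio is mapped virtually contiguous starting
at a PMD-aligned address.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative constants only (x86-64 values assumed), not kernel macros. */
#define MODEL_PAGE_SHIFT	12UL
#define MODEL_PMD_SHIFT		21UL
#define MODEL_PMD_SIZE		(1UL << MODEL_PMD_SHIFT)
#define MODEL_PAGES_PER_PMD	(MODEL_PMD_SIZE >> MODEL_PAGE_SHIFT)

/*
 * Hypothetical helper: index, within its folio, of the first page mapped
 * by the PMD entry that covers addr.  folio_start is the PMD-aligned
 * virtual address at which the folio's head page is mapped.  This stands
 * in for "which page of the folio would pmd_page() hand back".
 */
static unsigned long model_pmd_page_index(unsigned long folio_start,
					   unsigned long addr)
{
	unsigned long pmd_base = addr & ~(MODEL_PMD_SIZE - 1);

	assert(pmd_base >= folio_start);
	return (pmd_base - folio_start) >> MODEL_PAGE_SHIFT;
}

/* The page is the head page only when that index is zero. */
static bool model_pmd_page_is_head(unsigned long folio_start,
				   unsigned long addr)
{
	return model_pmd_page_index(folio_start, addr) == 0;
}
```

With a PMD-sized folio every address lands in the one PMD entry, so the
index is always 0. With a 2*PMD_SIZE folio, any address in the second half
gives index MODEL_PAGES_PER_PMD, i.e. a tail page, which is the case a
"head" parameter would silently get wrong.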

> +static int record_subpages(struct page *head, unsigned long sz,
> + unsigned long addr, unsigned long end,
> + struct page **pages)

> @@ -2870,8 +2873,8 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
> pages, nr);
> }
>
> - page = nth_page(pmd_page(orig), (addr & ~PMD_MASK) >> PAGE_SHIFT);
> - refs = record_subpages(page, addr, end, pages + *nr);
> + page = pmd_page(orig);
> + refs = record_subpages(page, PMD_SIZE, addr, end, pages + *nr);
>
> folio = try_grab_folio(page, refs, flags);
> if (!folio)
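
For reference, the offset calculation being moved can be modelled in
userspace as below. The body of the new record_subpages() is not quoted
above, so the (addr & (sz - 1)) form is my assumption about what it does;
the SKETCH_* constants and sketch_* names are made up, with x86-64 shifts
assumed.

```c
#include <assert.h>

/* Illustrative constants only (x86-64 values assumed), not kernel macros. */
#define SKETCH_PAGE_SHIFT	12UL
#define SKETCH_PAGE_SIZE	(1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_PMD_SIZE		(1UL << 21)
#define SKETCH_PMD_MASK		(~(SKETCH_PMD_SIZE - 1))

/* Old caller-side form, as in the removed line of gup_huge_pmd(). */
static unsigned long sketch_old_caller_offset(unsigned long addr)
{
	return (addr & ~SKETCH_PMD_MASK) >> SKETCH_PAGE_SHIFT;
}

/*
 * Assumed new form: the same offset computed from a size argument.
 * sz must be a power of two.  The result is relative to whatever page
 * the caller passed in, so it is only a folio-relative index if that
 * page really is the head page.
 */
static unsigned long sketch_first_subpage(unsigned long sz,
					   unsigned long addr)
{
	assert(sz != 0 && (sz & (sz - 1)) == 0);
	return (addr & (sz - 1)) >> SKETCH_PAGE_SHIFT;
}

/* Number of pages recorded for [addr, end), both page-aligned. */
static unsigned long sketch_nr_subpages(unsigned long addr,
					 unsigned long end)
{
	assert(end >= addr);
	return (end - addr) >> SKETCH_PAGE_SHIFT;
}
```

For sz == PMD_SIZE the two forms agree, so the refactor itself is
arithmetic-neutral; the only new thing is the naming of the first
parameter as "head".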