RE: [PATCH v6 1/2] mm: migration: fix the FOLL_GET failure on following huge page

From: Wang, Haiyue
Date: Wed Aug 17 2022 - 20:33:03 EST


> -----Original Message-----
> From: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> Sent: Thursday, August 18, 2022 05:58
> To: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; Michael Ellerman <mpe@xxxxxxxxxxxxxx>
> Cc: Wang, Haiyue <haiyue.wang@xxxxxxxxx>; linux-mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> david@xxxxxxxxxx; apopple@xxxxxxxxxx; linmiaohe@xxxxxxxxxx; Huang, Ying <ying.huang@xxxxxxxxx>;
> songmuchun@xxxxxxxxxxxxx; naoya.horiguchi@xxxxxxxxx; alex.sierra@xxxxxxx; Heiko Carstens
> <hca@xxxxxxxxxxxxx>; Vasily Gorbik <gor@xxxxxxxxxxxxx>; Alexander Gordeev <agordeev@xxxxxxxxxxxxx>;
> Christian Borntraeger <borntraeger@xxxxxxxxxxxxx>; Sven Schnelle <svens@xxxxxxxxxxxxx>
> Subject: Re: [PATCH v6 1/2] mm: migration: fix the FOLL_GET failure on following huge page
>
> On 08/17/22 10:26, Mike Kravetz wrote:
> > On 08/16/22 22:43, Andrew Morton wrote:
> > > On Wed, 17 Aug 2022 03:31:37 +0000 "Wang, Haiyue" <haiyue.wang@xxxxxxxxx> wrote:
> > >
> > > > > > }
> > > > >
> > > > > > It would be better to fix this for real at those three client code sites?
> > > >
> > > > > Then 5.19 will stay broken for a while, waiting for the final BIG patch?
> > >
> > > If that's the proposal then your [1/2] should have had a cc:stable and
> > > changelog words describing the plan for 6.0.
> > >
> > > But before we do that I'd like to see at least a prototype of the final
> > > fixes to s390 and hugetlb, so we can assess those as preferable for
> > > backporting. I don't think they'll be terribly intrusive or risky?
> >
> > I will start on adding follow_huge_pgd() support. Although, I may need
> > some help with verification from the powerpc folks, as that is the only
> > architecture which supports hugetlb pages at that level.
> >
> > mpe any suggestions?
>
> From 4925a98a6857dbb5a23bd97063ded2648863e65e Mon Sep 17 00:00:00 2001
> From: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> Date: Wed, 17 Aug 2022 14:32:10 -0700
> Subject: [PATCH] hugetlb: make follow_huge_pgd support FOLL_GET
>
> The existing version of follow_huge_pgd was very primitive and only
> provided limited functionality. Specifically, it did not support
> FOLL_GET. Update follow_huge_pgd with modifications similar to those
> made for follow_huge_pud in commit 3a194f3f8ad0 ("mm/hugetlb: make
> pud_huge() and follow_huge_pud() aware of non-present pud entry").
>
> Note, common code should be factored out of follow_huge_p*d routines.
> This will be done in future modifications.
>

I found that "Anshuman Khandual <khandual@xxxxxxxxxxxxxxxxxx>" submitted a
similar patch, dated "Apr 2016 11:07:37 +0530":

[PATCH 03/10] mm/hugetlb: Protect follow_huge_(pud|pgd) functions from race
https://lore.kernel.org/all/1460007464-26726-4-git-send-email-khandual@xxxxxxxxxxxxxxxxxx/

> Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> ---
> mm/hugetlb.c | 32 ++++++++++++++++++++++++++++++--
> 1 file changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index ea1c7bfa1cc3..6f32d2bd1ca9 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -7055,10 +7055,38 @@ follow_huge_pud(struct mm_struct *mm, unsigned long address,
> struct page * __weak
> follow_huge_pgd(struct mm_struct *mm, unsigned long address, pgd_t *pgd, int flags)
>  {
> -	if (flags & (FOLL_GET | FOLL_PIN))
> +	struct page *page = NULL;
> +	spinlock_t *ptl;
> +	pte_t pte;
> +
> +	if (WARN_ON_ONCE(flags & FOLL_PIN))
>  		return NULL;
>
> -	return pte_page(*(pte_t *)pgd) + ((address & ~PGDIR_MASK) >> PAGE_SHIFT);
> +retry:
> +	ptl = huge_pte_lock(hstate_sizelog(PGDIR_SHIFT), mm, (pte_t *)pgd);
> +	if (!pgd_huge(*pgd))
> +		goto out;
> +	pte = huge_ptep_get((pte_t *)pgd);
> +	if (pte_present(pte)) {
> +		page = pgd_page(*pgd) + ((address & ~PGDIR_MASK) >> PAGE_SHIFT);
> +		if (WARN_ON_ONCE(!try_grab_page(page, flags))) {
> +			page = NULL;
> +			goto out;
> +		}
> +	} else {
> +		if (is_hugetlb_entry_migration(pte)) {
> +			spin_unlock(ptl);
> +			__migration_entry_wait(mm, (pte_t *)pgd, ptl);
> +			goto retry;
> +		}
> +		/*
> +		 * hwpoisoned entry is treated as no_page_table in
> +		 * follow_page_mask().
> +		 */
> +	}
> +out:
> +	spin_unlock(ptl);
> +	return page;
>  }
>
> int isolate_hugetlb(struct page *page, struct list_head *list)
> --
> 2.37.1
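
For anyone checking the subpage arithmetic in the patch above, here is a
small user-space sketch of the expression
((address & ~PGDIR_MASK) >> PAGE_SHIFT). It is only an illustration, not
kernel code: the EX_* names and ex_subpage_index() are hypothetical, and the
shift values (4 KiB base pages, a 1 GiB huge page at the PGD level) are
assumptions, since the real PGDIR_SHIFT and PAGE_SHIFT depend on the
architecture and configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; the kernel's are per-architecture. */
#define EX_PAGE_SHIFT	12u	/* assumed 4 KiB base pages */
#define EX_PGDIR_SHIFT	30u	/* assumed 1 GiB huge page at the PGD level */
#define EX_PGDIR_SIZE	(UINT64_C(1) << EX_PGDIR_SHIFT)
#define EX_PGDIR_MASK	(~(EX_PGDIR_SIZE - 1))

/* The huge page must span more than one base page. */
static_assert(EX_PGDIR_SHIFT > EX_PAGE_SHIFT, "huge page smaller than base page");

/*
 * Hypothetical helper: index of the base page that contains 'address',
 * counted from the start of the huge page mapping that address.
 * ~EX_PGDIR_MASK keeps only the offset inside the huge page, and the
 * shift converts that byte offset into a base-page index.
 */
static inline uint64_t ex_subpage_index(uint64_t address)
{
	return (address & ~EX_PGDIR_MASK) >> EX_PAGE_SHIFT;
}
```

With these assumed shifts, an address one base page past a 1 GiB boundary
gives index 1, and the last byte of the huge page gives index 262143
(1 GiB / 4 KiB - 1). In the patch, that index is what gets added to
pgd_page(*pgd) to reach the struct page for the requested address.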