RE: [PATCH v6 1/2] mm: migration: fix the FOLL_GET failure on following huge page

From: Wang, Haiyue
Date: Tue Aug 16 2022 - 23:31:51 EST


> -----Original Message-----
> From: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Sent: Wednesday, August 17, 2022 08:59
> To: Wang, Haiyue <haiyue.wang@xxxxxxxxx>
> Cc: linux-mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; david@xxxxxxxxxx; apopple@xxxxxxxxxx;
> linmiaohe@xxxxxxxxxx; Huang, Ying <ying.huang@xxxxxxxxx>; songmuchun@xxxxxxxxxxxxx;
> naoya.horiguchi@xxxxxxxxx; alex.sierra@xxxxxxx; Heiko Carstens <hca@xxxxxxxxxxxxx>; Vasily Gorbik
> <gor@xxxxxxxxxxxxx>; Alexander Gordeev <agordeev@xxxxxxxxxxxxx>; Christian Borntraeger
> <borntraeger@xxxxxxxxxxxxx>; Sven Schnelle <svens@xxxxxxxxxxxxx>; Mike Kravetz
> <mike.kravetz@xxxxxxxxxx>
> Subject: Re: [PATCH v6 1/2] mm: migration: fix the FOLL_GET failure on following huge page
>
> On Tue, 16 Aug 2022 10:21:00 +0800 Haiyue Wang <haiyue.wang@xxxxxxxxx> wrote:
>
> > Not all huge page APIs support FOLL_GET option, so move_pages() syscall
> > will fail to get the page node information for some huge pages.
> >
> > For example, on x86 with Linux 5.19, the 1GB huge page API follow_huge_pud()
> > returns a NULL page for FOLL_GET. When the move_pages() syscall is called with
> > a NULL 'nodes' parameter, the 'status' array then contains the error '-2'.
> >
> > Note: follow_huge_pud() now supports FOLL_GET in linux 6.0.
> > Link: https://lore.kernel.org/all/20220714042420.1847125-3-naoya.horiguchi@xxxxxxxxx
> >
> > But these huge page APIs don't support FOLL_GET:
> > 1. follow_huge_pud() in arch/s390/mm/hugetlbpage.c
>
> Let's tell the s390 maintainers.
>
> > 2. follow_huge_addr() in arch/ia64/mm/hugetlbpage.c
> > It will cause WARN_ON_ONCE for FOLL_GET.
>
> ia64 doesn't have maintainers :( Can we come up with something local to
> arch/ia64 for this?

follow_huge_addr() itself only cares about FOLL_WRITE:

struct page *
follow_huge_addr(struct mm_struct *mm, unsigned long address,
		 int write)

And arch/ia64 defined this function 17 years ago ...

But I found that the check behind "WARN_ON_ONCE for FOLL_GET" (a BUG_ON at
the time) was introduced on 2005-10-29 by commit:

[PATCH] mm: follow_page with inner ptlock

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=deceb6cd17e6dfafe4c4f81b1b4153bc41b2cb70

-	page = follow_huge_addr(mm, address, write);
-	if (! IS_ERR(page))
-		return page;
+	page = follow_huge_addr(mm, address, flags & FOLL_WRITE);
+	if (!IS_ERR(page)) {
+		BUG_ON(flags & FOLL_GET);
+		goto out;
+	}

>
> > 3. follow_huge_pgd() in mm/hugetlb.c
>
> Hi, Mike.
>


> > }
>
> It would be better to fix this for real at those three client code sites?

Then 5.19 would stay broken for a while, waiting for the final BIG patch?