Re: [EXTERNAL] [PATCH] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"

From: Zach O'Keefe
Date: Mon Aug 14 2023 - 14:49:19 EST


On Sat, Aug 12, 2023 at 11:19 PM Saurabh Singh Sengar
<ssengar@xxxxxxxxxxxxx> wrote:
>
>
>
> > -----Original Message-----
> > From: Zach O'Keefe <zokeefe@xxxxxxxxxx>
> > Sent: Sunday, August 13, 2023 2:31 AM
> > To: linux-mm@xxxxxxxxx; Yang Shi <shy828301@xxxxxxxxx>
> > Cc: linux-kernel@xxxxxxxxxxxxxxx; Zach O'Keefe <zokeefe@xxxxxxxxxx>;
> > Saurabh Singh Sengar <ssengar@xxxxxxxxxxxxx>
> > Subject: [EXTERNAL] [PATCH] mm/thp: fix "mm: thp: kill
> > __transhuge_page_enabled()"
> >
> > The 6.0 commits:
> >
> > commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()")
> > commit 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")
> >
> > merged "can we have THPs in this VMA?" logic that was previously done
> > separately by fault-path, khugepaged, and smaps "THPeligible".
> >
> > During the process, the check on VM_NO_KHUGEPAGED from the
> > khugepaged path was accidentally added to fault and smaps paths. Certainly
> > the previous behavior for fault should be restored, and since smaps should
> > report the union of THP eligibility for fault and khugepaged, also opt smaps
> > out of this constraint.
> >
> > Fixes: 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")
> > Reported-by: Saurabh Singh Sengar <ssengar@xxxxxxxxxxxxx>
> > Signed-off-by: Zach O'Keefe <zokeefe@xxxxxxxxxx>
> > Cc: Yang Shi <shy828301@xxxxxxxxx>
> > ---
> > mm/huge_memory.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index eb3678360b97..e098c26d5e2e 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -96,11 +96,11 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags,
> >  		return in_pf;
> >
> >  	/*
> > -	 * Special VMA and hugetlb VMA.
> > +	 * khugepaged check for special VMA and hugetlb VMA.
> >  	 * Must be checked after dax since some dax mappings may have
> >  	 * VM_MIXEDMAP set.
> >  	 */
> > -	if (vm_flags & VM_NO_KHUGEPAGED)
> > +	if (!in_pf && !smaps && (vm_flags & VM_NO_KHUGEPAGED))
> >  		return false;
> >
> >  	/*
> > --
> > 2.41.0.694.ge786442a9b-goog
>
> Thanks for the patch. I realized that commit 9fec51689ff60 also
> introduced a !vma_is_anonymous() restriction. For the fault path to
> work the same as before, this check needs to be relaxed as well. Can we
> add the below to this patch too:
>
> -	if (!vma_is_anonymous(vma))
> +	if (!in_pf && !vma_is_anonymous(vma))
>  		return false;

Hey Saurabh,

Thanks for pointing this out, and sorry for the mixup.

I'll try looping in some folks from DAX and fs worlds to be sure,
since my knowledge doesn't extend far into those realms.

My understanding was that CONFIG_READ_ONLY_THP_FOR_FS was
supposed to keep the filesystem blissfully unaware of hugepages; IOW,
that assembling file-backed hugepages was supposed to be a
pagecache-only thing, or else be DAX.

The early check:

	if (vma_is_dax(vma))
		return in_pf;

should handle the DAX case.

IIUC, the check, lower down:

	if (!in_pf && file_thp_enabled(vma))
		return true;

was supposed to be the last check for eligible file-backed memory, and
here it's clear that we don't support faulting-in hugepages over
file-backed memory.
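
To make the ordering concrete, here is a minimal userspace sketch of just the checks quoted in this thread, with the VM_NO_KHUGEPAGED change from the patch applied. The struct and function names are made up for illustration, the real hugepage_vma_check() has more checks than these, and placing the anonymous check after the file_thp_enabled() check is an assumption of the sketch:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the VMA properties that the quoted checks read. */
struct vma_model {
	bool is_dax;           /* models vma_is_dax(vma) */
	bool no_khugepaged;    /* models vm_flags & VM_NO_KHUGEPAGED */
	bool file_thp_enabled; /* models file_thp_enabled(vma) */
	bool is_anonymous;     /* models vma_is_anonymous(vma) */
};

/* Toy version of the decision order; not the kernel function itself. */
static bool thp_eligible_model(const struct vma_model *vma, bool smaps, bool in_pf)
{
	/* DAX: only the fault path may install huge pages. */
	if (vma->is_dax)
		return in_pf;
	/* With the patch, only khugepaged honours VM_NO_KHUGEPAGED. */
	if (!in_pf && !smaps && vma->no_khugepaged)
		return false;
	/* File-backed THP is eligible for collapse and smaps, not for faults. */
	if (!in_pf && vma->file_thp_enabled)
		return true;
	/* Assumed position: everything left must be anonymous. */
	if (!vma->is_anonymous)
		return false;
	return true;
}

/* Exercises the cases discussed in the thread. */
static void thp_model_selfcheck(void)
{
	const struct vma_model dax = { .is_dax = true };
	const struct vma_model file = { .file_thp_enabled = true };
	const struct vma_model special = { .no_khugepaged = true, .is_anonymous = true };

	assert(thp_eligible_model(&dax, false, true));       /* DAX, fault */
	assert(!thp_eligible_model(&dax, false, false));     /* DAX, khugepaged */
	assert(thp_eligible_model(&file, false, false));     /* file, khugepaged */
	assert(!thp_eligible_model(&file, false, true));     /* file, fault */
	assert(thp_eligible_model(&special, false, true));   /* flagged, fault */
	assert(thp_eligible_model(&special, true, false));   /* flagged, smaps */
	assert(!thp_eligible_model(&special, false, false)); /* flagged, khugepaged */
}
```

The "file, fault" case is the one in question: in this model a non-DAX, non-anonymous VMA is never eligible on the fault path, whatever handler it provides.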

Looking at current users of struct vm_operations_struct->huge_fault, I see:

drivers/dax/device.c : dev_dax_huge_fault
fs/ext4/file.c : ext4_dax_huge_fault
fs/xfs/xfs_file.c : xfs_filemap_huge_fault
fs/erofs/data.c : erofs_dax_huge_fault
fs/fuse/dax.c : fuse_dax_huge_fault

All of which *look* like they operate on DAX-backed memory (I checked
the xfs handler, it does so as well) -- so they should have been
whitelisted by the vma_is_dax() check.

All this to say, the kernel doesn't _currently_ support faulting-in
hugepages over non-DAX file-backed memory. However, it seems we don't
give that ->huge_fault handler a fair shake.

Saurabh, does your use case fall outside this?

Willy -- I'm not up-to-date on what is happening on the THP-fs front.
Should we be checking for a ->huge_fault handler here?
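
For concreteness, one shape such a check could take is sketched below as a userspace model. Everything in it is hypothetical: the struct and function names are invented, the handler signature is simplified to take no arguments, and whether the fault path should behave this way is exactly the open question:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a vm_operations_struct carrying only one hook. */
struct vm_ops_sketch {
	int (*huge_fault)(void); /* simplified signature, for illustration */
};

/* Hypothetical model of the two VMA properties this sketch looks at. */
struct vma_sketch {
	bool is_anonymous;
	const struct vm_ops_sketch *vm_ops; /* may be NULL */
};

/* Fault path only: let a non-anonymous VMA through if it has a handler. */
static bool fault_path_allows_huge(const struct vma_sketch *vma)
{
	if (vma->is_anonymous)
		return true;
	return vma->vm_ops != NULL && vma->vm_ops->huge_fault != NULL;
}

static int dummy_huge_fault(void)
{
	return 0;
}

/* Exercises the three interesting combinations. */
static void huge_fault_sketch_selfcheck(void)
{
	const struct vm_ops_sketch with_handler = { .huge_fault = dummy_huge_fault };
	const struct vm_ops_sketch without_handler = { .huge_fault = NULL };
	const struct vma_sketch anon = { .is_anonymous = true, .vm_ops = NULL };
	const struct vma_sketch drv = { .is_anonymous = false, .vm_ops = &with_handler };
	const struct vma_sketch plain = { .is_anonymous = false, .vm_ops = &without_handler };

	assert(fault_path_allows_huge(&anon));   /* anonymous: unchanged */
	assert(fault_path_allows_huge(&drv));    /* handler present: allowed */
	assert(!fault_path_allows_huge(&plain)); /* no handler: still refused */
}
```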

Thanks,
Zach

> - Saurabh
>