Re: [PATCH 1/7] mm/thp: fix __split_huge_pmd_locked() on shmem migration entry

From: Hugh Dickins
Date: Fri Jun 04 2021 - 17:29:39 EST


On Fri, 4 Jun 2021, Kirill A. Shutemov wrote:
> On Tue, Jun 01, 2021 at 02:05:45PM -0700, Hugh Dickins wrote:
> > Stressing huge tmpfs page migration racing hole punch often crashed on the
> > VM_BUG_ON(!pmd_present) in pmdp_huge_clear_flush(), with DEBUG_VM=y kernel;
> > or shortly afterwards, on a bad dereference in __split_huge_pmd_locked()
> > when DEBUG_VM=n. They forgot to allow for pmd migration entries in the
> > non-anonymous case.
> >
> > Full disclosure: those particular experiments were on a kernel with more
> > relaxed mmap_lock and i_mmap_rwsem locking, and were not repeated on the
> > vanilla kernel: it is conceivable that stricter locking happens to avoid
> > those cases, or makes them less likely; but __split_huge_pmd_locked()
> > already allowed for pmd migration entries when handling anonymous THPs,
> > so this commit brings the shmem and file THP handling into line.
> >
> > Are there more places that need to be careful about pmd migration entries?
> > None hit in practice, but several of those is_huge_zero_pmd() tests were
> > done without checking pmd_present() first: I believe a pmd migration entry
> > could end up satisfying that test. Ah, the inversion of swap offset, to
> > protect against L1TF, makes that impossible on x86; but other arches need
> > the pmd_present() check, and even x86 ought not to apply pmd_page() to a
> > swap-like pmd. Fix those instances; __split_huge_pmd_locked() was not
> > wrong to be checking with pmd_trans_huge() instead, but I think it's
> > clearer to use pmd_present() in each instance.
> >
> > And while there: make it clearer to the eye that the !vma_is_anonymous()
> > and is_huge_zero_pmd() blocks make early returns (and don't return void).
> >
> > Fixes: e71769ae5260 ("mm: enable thp migration for shmem thp")
> > Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
> > Cc: <stable@xxxxxxxxxxxxxxx>
>
> Looks like a two fixes got squashed into one patch. Zero-page fix and
> migration entries in __split_huge_pmd_locked() deserve separate patches,
> no?

Okay, I'll divide in two (and probably lose the "don't return void"
cleanup; but still keep the clearer separation of those two blocks).

>
> Maybe add VM_BUG_ON(!pmd_present()) in is_huge_zero_pmd()?

Certainly not as part of any patch I'm aiming at stable! But
I've remembered another approach, I'll say in response to Yang.

>
> Also I wounder how much code we can remove if we would not establish
> migration ptes for file pages. We can make these page table entries 'none'
> on migration.

I'm not sure how far you're wondering to go with that (just in THP
case, or file ptes generally?). But you may recall that I disagree,
especially on mlocked vmas, where we break the contract by not using
migration entries. Anyway, not something to get into here.

Thanks a lot for all your reviews, I'll get on with it.

Hugh