Re: [PATCH] mm,do_huge_pmd_numa_page: remove unnecessary TLB flushing code

From: Yang Shi
Date: Tue Jul 20 2021 - 18:22:29 EST


On Tue, Jul 20, 2021 at 1:48 PM Yang Shi <shy828301@xxxxxxxxx> wrote:
>
> On Mon, Jul 19, 2021 at 11:56 PM Huang Ying <ying.huang@xxxxxxxxx> wrote:
> >
> > Before the commit c5b5a3dd2c1f ("mm: thp: refactor NUMA fault
> > handling"), the TLB flushing is done in do_huge_pmd_numa_page() itself
> > via flush_tlb_range().
> >
> > But after commit c5b5a3dd2c1f ("mm: thp: refactor NUMA fault
> > handling"), the TLB flushing is done in migrate_pages() as in the
> > following code path anyway.
> >
> >   do_huge_pmd_numa_page
> >     migrate_misplaced_page
> >       migrate_pages
> >
> > So now, the TLB flushing code in do_huge_pmd_numa_page() becomes
> > unnecessary. So the code is deleted in this patch to simplify the
> > code. This is only code cleanup, there's no visible performance
> > difference.
>
> Yes, there is tlb flush in try_to_migrate(), but it seems mmu notifier
> invalidate is missed for the THP migration case. I'm not quite sure
> why it is not needed, maybe just missed?
>
> So, you may need the below change too:
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 2d29a57d29e8..e1c8b654563d 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1749,6 +1749,8 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
>                                        !PageTransCompound(page), page);
>
>                         set_pmd_migration_entry(&pvmw, page);
> +                       mmu_notifier_invalidate_range(mm, range.start,
> +                                                     range.end);
>                         continue;
>                 }
>  #endif

Per the discussion with Zi Yan, the mmu notifier invalidate should not
be needed. The patch looks good to me.

Reviewed-by: Yang Shi <shy828301@xxxxxxxxx>

>
> >
> > Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
> > Cc: Yang Shi <shy828301@xxxxxxxxx>
> > Cc: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
> > Cc: Mel Gorman <mgorman@xxxxxxx>
> > Cc: Christian Borntraeger <borntraeger@xxxxxxxxxx>
> > Cc: Gerald Schaefer <gerald.schaefer@xxxxxxxxxxxxx>
> > Cc: Heiko Carstens <hca@xxxxxxxxxxxxx>
> > Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> > Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> > Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> > Cc: Michal Hocko <mhocko@xxxxxxxx>
> > Cc: Vasily Gorbik <gor@xxxxxxxxxxxxx>
> > Cc: Zi Yan <ziy@xxxxxxxxxx>
> > ---
> > mm/huge_memory.c | 26 --------------------------
> > 1 file changed, 26 deletions(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index afff3ac87067..9f21e44c9030 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -1440,32 +1440,6 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
> >                 goto out;
> >         }
> >
> > -       /*
> > -        * Since we took the NUMA fault, we must have observed the !accessible
> > -        * bit. Make sure all other CPUs agree with that, to avoid them
> > -        * modifying the page we're about to migrate.
> > -        *
> > -        * Must be done under PTL such that we'll observe the relevant
> > -        * inc_tlb_flush_pending().
> > -        *
> > -        * We are not sure a pending tlb flush here is for a huge page
> > -        * mapping or not. Hence use the tlb range variant
> > -        */
> > -       if (mm_tlb_flush_pending(vma->vm_mm)) {
> > -               flush_tlb_range(vma, haddr, haddr + HPAGE_PMD_SIZE);
> > -               /*
> > -                * change_huge_pmd() released the pmd lock before
> > -                * invalidating the secondary MMUs sharing the primary
> > -                * MMU pagetables (with ->invalidate_range()). The
> > -                * mmu_notifier_invalidate_range_end() (which
> > -                * internally calls ->invalidate_range()) in
> > -                * change_pmd_range() will run after us, so we can't
> > -                * rely on it here and we need an explicit invalidate.
> > -                */
> > -               mmu_notifier_invalidate_range(vma->vm_mm, haddr,
> > -                                             haddr + HPAGE_PMD_SIZE);
> > -       }
> > -
> >         pmd = pmd_modify(oldpmd, vma->vm_page_prot);
> >         page = vm_normal_page_pmd(vma, haddr, pmd);
> >         if (!page)
> > --
> > 2.30.2
> >