Re: [PATCH v6 1/2] mm: migration: fix migration of huge PMD shared pages

From: Michal Hocko
Date: Thu Aug 30 2018 - 12:20:01 EST


On Thu 30-08-18 10:08:25, Jerome Glisse wrote:
> On Thu, Aug 30, 2018 at 12:56:16PM +0200, Michal Hocko wrote:
> > On Wed 29-08-18 17:11:07, Jerome Glisse wrote:
> > > On Wed, Aug 29, 2018 at 08:39:06PM +0200, Michal Hocko wrote:
> > > > On Wed 29-08-18 14:14:25, Jerome Glisse wrote:
> > > > > On Wed, Aug 29, 2018 at 10:24:44AM -0700, Mike Kravetz wrote:
> > > > [...]
> > > > > > What would be the best mmu notifier interface to use where there are no
> > > > > > start/end calls?
> > > > > > Or, is the best solution to add the start/end calls as is done in later
> > > > > > versions of the code? If that is the suggestion, has there been any change
> > > > > > in invalidate start/end semantics that we should take into account?
> > > > >
> > > > > > start/end would be the one to add; 4.4 seems broken with respect to THP
> > > > > > and mmu notification. Another solution is to fix the users of mmu notifier;
> > > > > > there were only a handful back then. For instance, properly adjusting the
> > > > > > address to match the first address covered by the pmd or pud and passing
> > > > > > down the correct page size to mmu_notifier_invalidate_page() would make
> > > > > > this easy to fix.
> > > > >
> > > > > This is ok because user of try_to_unmap_one() replace the pte/pmd/pud
> > > > > with an invalid one (either poison, migration or swap) inside the
> > > > > > function. So anyone racing would synchronize on those special entries,
> > > > > > which is why it is fine to delay mmu_notifier_invalidate_page() until
> > > > > > after dropping the page table lock.
> > > > >
> > > > > > Adding start/end might be the solution with the least code churn, as you
> > > > > > would only need to change try_to_unmap_one().
> > > >
> > > > What about dependencies? 369ea8242c0fb sounds like it means that all
> > > > notifiers need to be updated as well.
> > >
> > > This commit removes mmu_notifier_invalidate_page(), which is why everything
> > > needs to be updated. But in 4.4 you can get away with just adding start/end
> > > and keeping mmu_notifier_invalidate_page() around to minimize disruption.
> >
> > OK, this is really interesting. I was really worried about changing the
> > semantics of the mmu notifiers in stable kernels because this is a really
> > hard-to-review change and high risk for anybody running those old
> > kernels. If we can keep mmu_notifier_invalidate_page() and wrap it
> > into the range scope API then this sounds like the best way forward.
> >
> > So just to make sure we are on the same page. Does this sound good for
> > a stable 4.4 backport? Mike's hugetlb pmd shared fixup can be applied on
> > top. What do you think?
>
> You need to invalidate outside the page table lock, so before the call to
> page_check_address(). For instance like the patch below, which also only
> does the range invalidation for huge pages, which would avoid too much of
> a behavior change for users of mmu notifier.

Right. I would rather not special-case PageHuge, though. So the
fixed version should be: