Re: [PATCH] mm: shmem: enable thp migration (Re: [PATCH v1] mm: consider non-anonymous thp as unmovable page)

From: Michal Hocko
Date: Wed Apr 11 2018 - 15:43:47 EST


On Wed 11-04-18 12:27:39, Andrew Morton wrote:
> On Wed, 11 Apr 2018 11:26:11 +0200 Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> > On Fri 06-04-18 03:07:11, Naoya Horiguchi wrote:
> > > From e31ec037701d1cc76b26226e4b66d8c783d40889 Mon Sep 17 00:00:00 2001
> > > From: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
> > > Date: Fri, 6 Apr 2018 10:58:35 +0900
> > > Subject: [PATCH] mm: enable thp migration for shmem thp
> > >
> > > My testing of the latest kernel supporting thp migration showed an
> > > infinite loop when offlining a memory block that is filled with shmem
> > > thps. We can get out of the loop with a signal, but the kernel should
> > > return with failure in this case.
> > >
> > > What happens in the loop is that scan_movable_pages() keeps returning
> > > the same pfn without any progress. That's because page migration always
> > > fails for shmem thps.
> > >
> > > In memory offline code, memory blocks containing unmovable pages should
> > > be prevented from being offline targets by has_unmovable_pages() inside
> > > start_isolate_page_range().
> > >
> > > So it's possible to change migratability for non-anonymous thps to
> > > avoid the issue, but that introduces more complex and thp-specific
> > > handling in migration code, so it might not be good.
> > >
> > > So this patch suggests fixing the issue by enabling thp migration
> > > for shmem thps. Both anon and shmem thps are then migratable, so we
> > > don't need a precheck on the type of thp.
> > >
> > > Fixes: commit 72b39cfc4d75 ("mm, memory_hotplug: do not fail offlining too early")
> > > Signed-off-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
> > > Cc: stable@xxxxxxxxxxxxxxx # v4.15+
> >
> > I do not really feel qualified to give my ack but this is the right
> > approach for the fix. We simply do expect that LRU pages are migratable
> > as well as zone_movable pages.
> >
> > Andrew, do you plan to take it (with Kirill's ack)?
> >
>
> Sure. What happened with "Michal's fix in another email"
> (https://lkml.kernel.org/r/20180406051452.GB23467@xxxxxxxxxxxxxxxxxxxxxxxxxxxx)?

I guess you meant http://lkml.kernel.org/r/20180405190405.GS6312@xxxxxxxxxxxxxx

Well, that would be a workaround in case we didn't have a proper fix. It
is much simpler but it wouldn't make backporting to older kernels any
easier because it depends on other non-trivial changes you already have
in your tree. So having full THP pagecache migration support is
preferred, of course.
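
For illustration only, the failure mode described in the quoted changelog
can be modelled in user space roughly as below. This is a sketch under
stated assumptions, not kernel code: the toy_* names, the enum and the
max_passes cap are all hypothetical, and the scan step is simplified to
"first in-use page" rather than what scan_movable_pages() really checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page kinds for this toy model; these are not kernel types. */
enum toy_page { TOY_FREE, TOY_ANON_THP, TOY_SHMEM_THP };

/* Models the scan step: index of the first in-use page, or -1 if none. */
static long toy_scan_movable(const enum toy_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (pages[i] != TOY_FREE)
			return (long)i;
	return -1;
}

/* Models migration: shmem THPs move only if support for them is enabled. */
static bool toy_migrate(enum toy_page *page, bool shmem_thp_migration)
{
	if (*page == TOY_SHMEM_THP && !shmem_thp_migration)
		return false;
	*page = TOY_FREE;
	return true;
}

/*
 * Models the offline loop. Returns 0 once the block is empty, or -1 when
 * max_passes scan passes have been used up. The cap exists only so that
 * this sketch terminates; it is an assumption of the model.
 */
int toy_offline(enum toy_page *pages, size_t n,
		bool shmem_thp_migration, unsigned int max_passes)
{
	assert(pages != NULL);

	for (unsigned int pass = 0; pass < max_passes; pass++) {
		long idx = toy_scan_movable(pages, n);

		if (idx < 0)
			return 0;
		/* If this fails, the next scan returns the same index again. */
		(void)toy_migrate(&pages[idx], shmem_thp_migration);
	}
	return -1;
}
```

With a block holding one anon thp and one shmem thp, toy_offline() with
shmem_thp_migration == false frees the anon page and then keeps rescanning
the shmem one until the cap is hit. With shmem_thp_migration == true it
drains the block and returns 0.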

--
Michal Hocko
SUSE Labs