Re: [PATCH] mm, compaction: fix NR_ISOLATED_* stats for pfn based migration

From: Michal Hocko
Date: Wed Oct 19 2016 - 10:41:58 EST


On Wed 19-10-16 11:39:36, Vlastimil Babka wrote:
> On 10/19/2016 10:02 AM, Michal Hocko wrote:
> > From: Ming Ling <ming.ling@xxxxxxxxxxxxxx>
> >
> > Since bda807d44454 ("mm: migrate: support non-lru movable page
> > migration") isolate_migratepages_block() can isolate !PageLRU pages,
> > which acct_isolated() would account as NR_ISOLATED_*. Accounting these
> > non-lru pages as NR_ISOLATED_{ANON,FILE} doesn't make any sense and it
> > can misguide heuristics based on those counters, such as
> > pgdat_reclaimable_pages resp. too_many_isolated, which would lead to
> > unexpected stalls during direct reclaim without any good reason. Note
> > that __alloc_contig_migrate_range can isolate a lot of pages at once.
> >
> > Mobile devices such as a 512M RAM Android phone may use a big zram
> > swap. In some cases zram (zsmalloc) uses too many non-lru but migratable
> > pages, such as:
> >
> > MemTotal: 468148 kB
> > Normal free:5620kB
> > Free swap:4736kB
> > Total swap:409596kB
> > ZRAM: 164616kB(zsmalloc non-lru pages)
> > active_anon:60700kB
> > inactive_anon:60744kB
> > active_file:34420kB
> > inactive_file:37532kB
> >
> > Fix this by only accounting lru pages to NR_ISOLATED_* in
> > isolate_migratepages_block right after they were isolated and we still
> > know they were on LRU. Drop acct_isolated because it is called after the
> > fact and we've lost that information. Batching the per-cpu counter
> > updates doesn't bring much improvement anyway. Also make sure that we
> > uncharge only LRU pages when putting them back on the LRU in
> > putback_movable_pages resp. when unmap_and_move migrates the page.
>
> [mhocko@xxxxxxxx: replace acct_isolated() with direct counting]
> ?

Why not. I just considered this patch more as a rework of the original
than an incremental fix. But whatever...

> Indeed much better than before. IIRC I've personally introduced one or two
> bugs involving acct_isolated() (lack of) usage :) Thanks.

Yeah, it was subtle as hell.
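
For anyone following along, the "count at isolation time, uncount only
what was counted" idea can be sketched in userspace roughly as below.
This is a toy model, not the kernel code: toy_page, toy_stats,
toy_isolate() and toy_putback() are made-up names, and the was_lru flag
is an assumption of the sketch standing in for whatever the real code
uses to tell LRU pages from non-lru movable ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these types or helpers exist in the kernel. */
enum toy_lru { TOY_ANON = 0, TOY_FILE = 1 };

struct toy_page {
	bool on_lru;		/* stands in for PageLRU() */
	bool was_lru;		/* remembered at isolation time */
	enum toy_lru type;	/* anon or file backed */
};

struct toy_stats {
	long nr_isolated[2];	/* stands in for NR_ISOLATED_{ANON,FILE} */
};

/* Account the page while we still know whether it came from the LRU. */
static void toy_isolate(struct toy_stats *s, struct toy_page *p)
{
	if (p->on_lru) {
		p->on_lru = false;
		p->was_lru = true;
		s->nr_isolated[p->type]++;
	} else {
		/* non-lru movable page: isolated, but never counted */
		p->was_lru = false;
	}
}

/* Undo the accounting only for pages that were counted above. */
static void toy_putback(struct toy_stats *s, struct toy_page *p)
{
	if (!p->was_lru)
		return;

	assert(s->nr_isolated[p->type] > 0);
	s->nr_isolated[p->type]--;
	p->was_lru = false;
	p->on_lru = true;
}
```

The point of the sketch is only that the increment and the decrement are
both guarded by the same "was this an LRU page" information, so a
non-lru page can never skew the counters in either direction.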

> > Fixes: bda807d44454 ("mm: migrate: support non-lru movable page migration")
> > Acked-by: Minchan Kim <minchan@xxxxxxxxxx>
> > Signed-off-by: Ming Ling <ming.ling@xxxxxxxxxxxxxx>
> > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
>
> Acked-by: Vlastimil Babka <vbabka@xxxxxxx>

Thanks!
--
Michal Hocko
SUSE Labs