Re: [PATCHv2] mm: skip CMA pages when they are not available

From: Huang, Ying
Date: Fri Apr 21 2023 - 02:47:08 EST


"zhaoyang.huang" <zhaoyang.huang@xxxxxxxxxx> writes:

> From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
>
> This patch fixes unproductive reclaiming of CMA pages by skipping them when they
> are not available for the current context. It arises from the OOM issue below, which
> is caused by a large proportion of MIGRATE_CMA pages among free pages. There has been
> a commit (168676649) to fix it by trying CMA pages first instead of as a fallback in
> rmqueue. I would like to propose another fix from the reclaim perspective.
>
> 04166 < 4> [ 36.172486] [03-19 10:05:52.172] ActivityManager: page allocation failure: order:0, mode:0xc00(GFP_NOIO), nodemask=(null),cpuset=foreground,mems_allowed=0
> 0419C < 4> [ 36.189447] [03-19 10:05:52.189] DMA32: 0*4kB 447*8kB (C) 217*16kB (C) 124*32kB (C) 136*64kB (C) 70*128kB (C) 22*256kB (C) 3*512kB (C) 0*1024kB 0*2048kB 0*4096kB = 35848kB
> 0419D < 4> [ 36.193125] [03-19 10:05:52.193] Normal: 231*4kB (UMEH) 49*8kB (MEH) 14*16kB (H) 13*32kB (H) 8*64kB (H) 2*128kB (H) 0*256kB 1*512kB (H) 0*1024kB 0*2048kB 0*4096kB = 3236kB
> ......
> 041EA < 4> [ 36.234447] [03-19 10:05:52.234] SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
> 041EB < 4> [ 36.234455] [03-19 10:05:52.234] cache: ext4_io_end, object size: 64, buffer size: 64, default order: 0, min order: 0
> 041EC < 4> [ 36.234459] [03-19 10:05:52.234] node 0: slabs: 53,objs: 3392, free: 0

From the above description, you are trying to resolve an issue that has
already been resolved. If so, why do we need your patch? What issue does
it resolve in the current upstream kernel?

At first glance, I don't think your patch is unreasonable. But
you really need to show the value of the patch.
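
To make the discussion concrete, the skip condition the patch adds to
isolate_lru_folios() can be modeled in userspace C roughly as follows.
This is only a sketch for reasoning about the logic: every sketch_* name
below is a hypothetical stand-in, not a kernel API, and it assumes
CONFIG_CMA is enabled.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's migrate types. */
enum sketch_migratetype {
	SKETCH_UNMOVABLE,
	SKETCH_MOVABLE,
	SKETCH_RECLAIMABLE,
	SKETCH_CMA,
};

/* Hypothetical summary of the reclaim context (not struct scan_control). */
struct sketch_reclaim_ctx {
	bool is_kswapd;                      /* models current_is_kswapd() */
	enum sketch_migratetype alloc_type;  /* models gfp_migratetype(sc->gfp_mask) */
	int reclaim_idx;                     /* models sc->reclaim_idx */
};

/* Mirrors how the patch computes cma_cap once, before the scan loop. */
static bool sketch_cma_cap(const struct sketch_reclaim_ctx *ctx)
{
	if (!ctx->is_kswapd && ctx->alloc_type != SKETCH_MOVABLE)
		return false;
	return true;
}

/* True if the folio would be moved to the skipped list instead of isolated. */
static bool sketch_should_skip(const struct sketch_reclaim_ctx *ctx,
			       int folio_zone,
			       enum sketch_migratetype block_type)
{
	/* Existing check: folio lives in a zone above the reclaim index. */
	if (folio_zone > ctx->reclaim_idx)
		return true;
	/* New check: CMA pageblock that this context cannot allocate from. */
	return block_type == SKETCH_CMA && !sketch_cma_cap(ctx);
}
```

Under this model, a direct reclaimer whose allocation is not movable
(which appears to be the case for the GFP_NOIO failure in the log above)
skips folios in CMA pageblocks, while kswapd and movable allocations
still reclaim them.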

Best Regards,
Huang, Ying

> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
> ---
> v2: update commit message and fix build error when CONFIG_CMA is not set
> ---
> mm/vmscan.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index bd6637f..19fb445 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2225,10 +2225,16 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
>  	unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
>  	unsigned long skipped = 0;
>  	unsigned long scan, total_scan, nr_pages;
> +	bool cma_cap = true;
> +	struct page *page;
>  	LIST_HEAD(folios_skipped);
>
>  	total_scan = 0;
>  	scan = 0;
> +	if ((IS_ENABLED(CONFIG_CMA)) && !current_is_kswapd()
> +		&& (gfp_migratetype(sc->gfp_mask) != MIGRATE_MOVABLE))
> +		cma_cap = false;
> +
>  	while (scan < nr_to_scan && !list_empty(src)) {
>  		struct list_head *move_to = src;
>  		struct folio *folio;
> @@ -2239,12 +2245,17 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
>  		nr_pages = folio_nr_pages(folio);
>  		total_scan += nr_pages;
>
> -		if (folio_zonenum(folio) > sc->reclaim_idx) {
> +		page = &folio->page;
> +
> +		if ((folio_zonenum(folio) > sc->reclaim_idx)
> +#ifdef CONFIG_CMA
> +			|| (get_pageblock_migratetype(page) == MIGRATE_CMA && !cma_cap)
> +#endif
> +		) {
>  			nr_skipped[folio_zonenum(folio)] += nr_pages;
>  			move_to = &folios_skipped;
>  			goto move;
>  		}
> -
>  		/*
>  		 * Do not count skipped folios because that makes the function
>  		 * return with no isolated folios if the LRU mostly contains