[RFC PATCH 1/3] mm, reclaim: make should_continue_reclaim perform dryrun detection

From: Mike Kravetz
Date: Wed Jul 24 2019 - 13:51:44 EST


From: Hillf Danton <hdanton@xxxxxxxx>

Address the issue of should_continue_reclaim continuing true too often
for __GFP_RETRY_MAYFAIL attempts when !nr_reclaimed and nr_scanned.
This could happen during hugetlb page allocation causing stalls for
minutes or hours.

Restructure code so that false will be returned in this case even if
there are plenty of inactive pages.

Need better description from Hillf Danton
Need SOB from Hillf Danton
---
mm/vmscan.c | 28 +++++++++++++++-------------
1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index f4fd02ae233e..484b6b1a954e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2673,18 +2673,6 @@ static inline bool should_continue_reclaim(struct pglist_data *pgdat,
return false;
}

- /*
- * If we have not reclaimed enough pages for compaction and the
- * inactive lists are large enough, continue reclaiming
- */
- pages_for_compaction = compact_gap(sc->order);
- inactive_lru_pages = node_page_state(pgdat, NR_INACTIVE_FILE);
- if (get_nr_swap_pages() > 0)
- inactive_lru_pages += node_page_state(pgdat, NR_INACTIVE_ANON);
- if (sc->nr_reclaimed < pages_for_compaction &&
- inactive_lru_pages > pages_for_compaction)
- return true;
-
/* If compaction would go ahead or the allocation would succeed, stop */
for (z = 0; z <= sc->reclaim_idx; z++) {
struct zone *zone = &pgdat->node_zones[z];
@@ -2700,7 +2688,21 @@ static inline bool should_continue_reclaim(struct pglist_data *pgdat,
;
}
}
- return true;
+
+ /*
+ * If we have not reclaimed enough pages for compaction and the
+ * inactive lists are large enough, continue reclaiming
+ */
+ pages_for_compaction = compact_gap(sc->order);
+ inactive_lru_pages = node_page_state(pgdat, NR_INACTIVE_FILE);
+ if (get_nr_swap_pages() > 0)
+ inactive_lru_pages += node_page_state(pgdat, NR_INACTIVE_ANON);
+
+ return inactive_lru_pages > pages_for_compaction &&
+ /*
+ * avoid dryrun with plenty of inactive pages
+ */
+ nr_scanned && nr_reclaimed;
}

static bool pgdat_memcg_congested(pg_data_t *pgdat, struct mem_cgroup *memcg)
--
2.20.1