Re: [RFC PATCH 20/26] mm: vmscan: use compaction_suitable() check in kswapd

From: Johannes Weiner
Date: Wed Apr 26 2023 - 11:22:14 EST


On Wed, Apr 26, 2023 at 09:30:23AM +0800, Huang, Ying wrote:
> Johannes Weiner <hannes@xxxxxxxxxxx> writes:
>
> > On Tue, Apr 25, 2023 at 11:12:28AM +0800, Huang, Ying wrote:
> >> Johannes Weiner <hannes@xxxxxxxxxxx> writes:
> >>
> >> > Kswapd currently bails on higher-order allocations with an open-coded
> >> > check for whether it's reclaimed the compaction gap.
> >> >
> >> > compaction_suitable() is the customary interface to coordinate reclaim
> >> > with compaction.
> >> >
> >> > Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> >> > ---
> >> > mm/vmscan.c | 67 ++++++++++++++++++-----------------------------------
> >> > 1 file changed, 23 insertions(+), 44 deletions(-)
> >> >
> >> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> >> > index ee8c8ca2e7b5..723705b9e4d9 100644
> >> > --- a/mm/vmscan.c
> >> > +++ b/mm/vmscan.c
> >> > @@ -6872,12 +6872,18 @@ static bool pgdat_balanced(pg_data_t *pgdat, int order, int highest_zoneidx)
> >> > if (!managed_zone(zone))
> >> > continue;
> >> >
> >> > + /* Allocation can succeed in any zone, done */
> >> > if (sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING)
> >> > mark = wmark_pages(zone, WMARK_PROMO);
> >> > else
> >> > mark = high_wmark_pages(zone);
> >> > if (zone_watermark_ok_safe(zone, order, mark, highest_zoneidx))
> >> > return true;
> >> > +
> >> > + /* Allocation can't succeed, but enough order-0 to compact */
> >> > + if (compaction_suitable(zone, order,
> >> > + highest_zoneidx) == COMPACT_CONTINUE)
> >> > + return true;
> >>
> >> Should we check the following first?
> >>
> >> order > 0 && zone_watermark_ok_safe(zone, 0, mark, highest_zoneidx)
> >
> > That's what compaction_suitable() does. It checks whether there are
> > enough migration targets for compaction (COMPACT_CONTINUE) or whether
> > reclaim needs to do some more work (COMPACT_SKIPPED).
>
> Yes. And I found that the watermark used in compaction_suitable() is
> low_wmark_pages() or min_wmark_pages(), which doesn't match the
> watermark here. Or did I miss something?

Ahh, you're right, kswapd will bail prematurely. Compaction cannot
reliably meet the high watermark with a low watermark scratch space.

I'll add the order check before the suitable test, for clarity, and so
that order-0 requests don't check the same thing twice.

For the watermark, I'd make it an arg to compaction_suitable() and use
whatever the reclaimer targets (high for kswapd, min for direct).

However, there is a minor snag. compaction_suitable() currently has
its own smarts regarding the watermark:

/*
* Watermarks for order-0 must be met for compaction to be able to
* isolate free pages for migration targets. This means that the
* watermark and alloc_flags have to match, or be more pessimistic than
* the check in __isolate_free_page(). We don't use the direct
* compactor's alloc_flags, as they are not relevant for freepage
* isolation. We however do use the direct compactor's highest_zoneidx
* to skip over zones where lowmem reserves would prevent allocation
* even if compaction succeeds.
* For costly orders, we require low watermark instead of min for
* compaction to proceed to increase its chances.
* ALLOC_CMA is used, as pages in CMA pageblocks are considered
* suitable migration targets
*/
watermark = (order > PAGE_ALLOC_COSTLY_ORDER) ?
low_wmark_pages(zone) : min_wmark_pages(zone);

Historically it has always checked low instead of min. Then Vlastimil
changed it to min for non-costly orders here:

commit 8348faf91f56371d4bada6fc5915e19580a15ffe
Author: Vlastimil Babka <vbabka@xxxxxxx>
Date: Fri Oct 7 16:58:00 2016 -0700

mm, compaction: require only min watermarks for non-costly orders

The __compaction_suitable() function checks the low watermark plus a
compact_gap() gap to decide if there's enough free memory to perform
compaction. Then __isolate_free_page uses low watermark check to decide
if particular free page can be isolated. In the latter case, using low
watermark is needlessly pessimistic, as the free page isolations are
only temporary. For __compaction_suitable() the higher watermark makes
sense for high-order allocations where more freepages increase the
chance of success, and we can typically fail with some order-0 fallback
when the system is struggling to reach that watermark. But for
low-order allocation, forming the page should not be that hard. So
using low watermark here might just prevent compaction from even trying,
and eventually lead to OOM killer even if we are above min watermarks.

So after this patch, we use min watermark for non-costly orders in
__compaction_suitable(), and for all orders in __isolate_free_page().

Lowering to min wasn't an issue for non-costly, but AFAICS there was
no explicit testing for whether min would work for costly orders too.

I'd propose trying it with min even for costly and see what happens.

If it does regress, a better place to boost scratch space for costly
orders might be compact_gap(), so I'd move it there.

Does that sound reasonable?