Re: [PATCH -V3 8/9] mm, pcp: decrease PCP high if free pages < high watermark

From: Mel Gorman
Date: Mon Oct 23 2023 - 05:27:00 EST


On Fri, Oct 20, 2023 at 11:30:53AM +0800, Huang, Ying wrote:
> Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> writes:
>
> > On Mon, Oct 16, 2023 at 01:30:01PM +0800, Huang Ying wrote:
> >> One target of PCP is to minimize pages in PCP if the system free pages
> >> is too few. To reach that target, when page reclaiming is active for
> >> the zone (ZONE_RECLAIM_ACTIVE), we will stop increasing PCP high in
> >> allocating path, decrease PCP high and free some pages in freeing
> >> path. But this may be too late because the background page reclaiming
> >> may introduce latency for some workloads. So, in this patch, during
> >> page allocation we will detect whether the number of free pages of the
> >> zone is below high watermark. If so, we will stop increasing PCP high
> >> in allocating path, decrease PCP high and free some pages in freeing
> >> path. With this, we can reduce the possibility of the premature
> >> background page reclaiming caused by too large PCP.
> >>
> >> The high watermark checking is done in allocating path to reduce the
> >> overhead in hotter freeing path.
> >>
> >> Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
> >> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> >> Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> >> Cc: Vlastimil Babka <vbabka@xxxxxxx>
> >> Cc: David Hildenbrand <david@xxxxxxxxxx>
> >> Cc: Johannes Weiner <jweiner@xxxxxxxxxx>
> >> Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> >> Cc: Michal Hocko <mhocko@xxxxxxxx>
> >> Cc: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>
> >> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> >> Cc: Christoph Lameter <cl@xxxxxxxxx>
> >> ---
> >> include/linux/mmzone.h | 1 +
> >> mm/page_alloc.c | 33 +++++++++++++++++++++++++++++++--
> >> 2 files changed, 32 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >> index ec3f7daedcc7..c88770381aaf 100644
> >> --- a/include/linux/mmzone.h
> >> +++ b/include/linux/mmzone.h
> >> @@ -1018,6 +1018,7 @@ enum zone_flags {
> >> * Cleared when kswapd is woken.
> >> */
> >> ZONE_RECLAIM_ACTIVE, /* kswapd may be scanning the zone. */
> >> + ZONE_BELOW_HIGH, /* zone is below high watermark. */
> >> };
> >>
> >> static inline unsigned long zone_managed_pages(struct zone *zone)
> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >> index 8382ad2cdfd4..253fc7d0498e 100644
> >> --- a/mm/page_alloc.c
> >> +++ b/mm/page_alloc.c
> >> @@ -2407,7 +2407,13 @@ static int nr_pcp_high(struct per_cpu_pages *pcp, struct zone *zone,
> >> return min(batch << 2, pcp->high);
> >> }
> >>
> >> - if (pcp->count >= high && high_min != high_max) {
> >> + if (high_min == high_max)
> >> + return high;
> >> +
> >> + if (test_bit(ZONE_BELOW_HIGH, &zone->flags)) {
> >> + pcp->high = max(high - (batch << pcp->free_factor), high_min);
> >> + high = max(pcp->count, high_min);
> >> + } else if (pcp->count >= high) {
> >> int need_high = (batch << pcp->free_factor) + batch;
> >>
> >> /* pcp->high should be large enough to hold batch freed pages */
> >> @@ -2457,6 +2463,10 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
> >> if (pcp->count >= high) {
> >> free_pcppages_bulk(zone, nr_pcp_free(pcp, batch, high, free_high),
> >> pcp, pindex);
> >> + if (test_bit(ZONE_BELOW_HIGH, &zone->flags) &&
> >> + zone_watermark_ok(zone, 0, high_wmark_pages(zone),
> >> + ZONE_MOVABLE, 0))
> >> + clear_bit(ZONE_BELOW_HIGH, &zone->flags);
> >> }
> >> }
> >>
> >
> > This is a relatively fast path and freeing pages should not need to check
> > watermarks.
>
> Another stuff that mitigate the overhead is that the watermarks checking
> only occurs when we free pages from PCP to buddy. That is, in most
> cases, every 63 page freeing.
>

True

> > While the overhead is mitigated because it applies only when
> > the watermark is below high, that is also potentially an unbounded condition
> > if a workload is sized precisely enough. Why not clear this bit when kswapd
> > is going to sleep after reclaiming enough pages in a zone?
>
> IIUC, if the number of free pages is kept larger than the low watermark,
> then kswapd will have no opportunity to be waken up even if the number
> of free pages was ever smaller than the high watermark.
>

Also true and I did think of that later. I guess it's ok, the chances
are that the series overall offsets any micro-costs like this so I'm
happy. If, for some reason, this overhead is noticable (doubtful), then
it can be revisted.

Thanks.

--
Mel Gorman
SUSE Labs