Re: [PATCH 32/34] mm: vmstat: account per-zone stalls and pages skipped during reclaim

From: Johannes Weiner
Date: Tue Jul 12 2016 - 15:06:34 EST


On Fri, Jul 08, 2016 at 10:35:08AM +0100, Mel Gorman wrote:
> The vmstat allocstall was fairly useful in the general sense but
> node-based LRUs change that. It's important to know if a stall was for an
> address-limited allocation request as this will require skipping pages
> from other zones. This patch adds pgstall_* counters to replace
> allocstall. The sum of the counters will equal the old allocstall so it
> can be trivially recalculated. A high number of address-limited
> allocation requests may result in a lot of useless LRU scanning for
> suitable pages.
>
> As address-limited allocations require pages to be skipped, it's important
> to know how much useless LRU scanning took place so this patch adds
> pgskip* counters. This yields the following model
>
> 1. The number of address-space limited stalls can be accounted for (pgstall)
> 2. The amount of useless work required to reclaim the data is accounted (pgskip)
> 3. The total number of scans is available from pgscan_kswapd and pgscan_direct
> so from that the ratio of useful to useless scans can be calculated.
>
> Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> Acked-by: Vlastimil Babka <vbabka@xxxxxxx>

These statistics should be quite helpful, so:

Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
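
For illustration only, the ratio from point 3 of the changelog could be
computed in userspace from /proc/vmstat-style text roughly as below. This is
a sketch under assumptions: the counter names (pgskip_<zone>, pgscan_kswapd,
pgscan_direct) are taken from the changelog wording, and sum_matching() and
skip_ratio() are made-up helper names, not anything in the patch. The changelog
does not say whether skipped pages are also counted in pgscan_*, so the sketch
just reports the raw ratio of the two sums.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Sum the counters in vmstat-style text ("name value\n" per line).
 * With prefix != 0, every counter whose name starts with 'name' is summed;
 * otherwise only an exact name match counts (so "pgscan_direct" does not
 * pick up "pgscan_direct_throttle").
 */
static unsigned long long sum_matching(const char *text, const char *name,
				       int prefix)
{
	size_t want = strlen(name);
	unsigned long long sum = 0;
	const char *line = text;

	while (line && *line) {
		const char *space = strchr(line, ' ');
		const char *eol = strchr(line, '\n');

		/* Only consider well-formed "name value" lines. */
		if (space && (!eol || space < eol)) {
			size_t nlen = (size_t)(space - line);
			int len_ok = prefix ? nlen >= want : nlen == want;

			if (len_ok && strncmp(line, name, want) == 0)
				sum += strtoull(space + 1, NULL, 10);
		}
		line = eol ? eol + 1 : NULL;
	}
	return sum;
}

/* Skipped pages relative to scanned pages; 0.0 if nothing was scanned. */
static double skip_ratio(const char *vmstat)
{
	unsigned long long skipped = sum_matching(vmstat, "pgskip_", 1);
	unsigned long long scanned = sum_matching(vmstat, "pgscan_kswapd", 0) +
				     sum_matching(vmstat, "pgscan_direct", 0);

	return scanned ? (double)skipped / (double)scanned : 0.0;
}
```

A caller would read /proc/vmstat into a NUL-terminated buffer and pass it to
skip_ratio(); a value close to zero would suggest little scanning effort went
into pages from ineligible zones.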

But I have one nitpick:

> @@ -23,6 +23,8 @@
>
> enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
> 		FOR_ALL_ZONES(PGALLOC),
> +		FOR_ALL_ZONES(PGSTALL),
> +		FOR_ALL_ZONES(PGSCAN_SKIP),
> 		PGFREE, PGACTIVATE, PGDEACTIVATE,
> 		PGFAULT, PGMAJFAULT,
> 		PGLAZYFREED,

The PG prefix seems to stand for page, and all stat names that contain
it represent some per-page event. PGSTALL is not a page event, though.
Would you mind sticking with allocstall, i.e. allocstall_dma32 etc.?
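
To make the suggestion concrete, here is a rough standalone sketch of how the
per-zone items and their names might look with the allocstall prefix. It is
not the actual patch: FOR_ALL_ZONES() below is a simplified stand-in (the
kernel's macro includes or omits zones depending on the configuration), and
the name table and total_allocstall() are hypothetical helpers. The pgskip_*
strings follow the changelog's wording. Summing the per-zone items recovers
the old global allocstall value, as the changelog describes.

```c
#include <string.h>

/* Simplified stand-in: always emit all four zones. */
#define FOR_ALL_ZONES(xx) xx##_DMA, xx##_DMA32, xx##_NORMAL, xx##_MOVABLE

enum vm_event_item {
	FOR_ALL_ZONES(PGALLOC),
	FOR_ALL_ZONES(ALLOCSTALL),	/* a stall is not a per-page event */
	FOR_ALL_ZONES(PGSCAN_SKIP),
	NR_VM_EVENT_ITEMS
};

/* Names as they would appear in a vmstat-style listing (assumed). */
static const char *const vm_event_names[NR_VM_EVENT_ITEMS] = {
	[PGALLOC_DMA]		= "pgalloc_dma",
	[PGALLOC_DMA32]		= "pgalloc_dma32",
	[PGALLOC_NORMAL]	= "pgalloc_normal",
	[PGALLOC_MOVABLE]	= "pgalloc_movable",
	[ALLOCSTALL_DMA]	= "allocstall_dma",
	[ALLOCSTALL_DMA32]	= "allocstall_dma32",
	[ALLOCSTALL_NORMAL]	= "allocstall_normal",
	[ALLOCSTALL_MOVABLE]	= "allocstall_movable",
	[PGSCAN_SKIP_DMA]	= "pgskip_dma",
	[PGSCAN_SKIP_DMA32]	= "pgskip_dma32",
	[PGSCAN_SKIP_NORMAL]	= "pgskip_normal",
	[PGSCAN_SKIP_MOVABLE]	= "pgskip_movable",
};

/* The old global allocstall is the sum over the per-zone counters. */
static unsigned long total_allocstall(const unsigned long *events)
{
	unsigned long sum = 0;
	int i;

	for (i = ALLOCSTALL_DMA; i <= ALLOCSTALL_MOVABLE; i++)
		sum += events[i];
	return sum;
}
```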