Re: [PATCH v2 4/5] mm: use node_page_state_snapshot to avoid deviation

From: Michal Hocko
Date: Tue Dec 19 2017 - 07:43:24 EST


On Tue 19-12-17 14:39:25, Kemi Wang wrote:
> To avoid deviation, this patch uses node_page_state_snapshot instead of
> node_page_state for node page stats query.
> e.g. cat /proc/zoneinfo
> cat /sys/devices/system/node/node*/vmstat
> cat /sys/devices/system/node/node*/numastat
>
> As it is a slow path and would not be read frequently, I would not worry
> about it.

The changelog doesn't explain why these counters need any special
treatment. The _snapshot variants were used only for internal handling
where the precision really mattered. We do not have any in-tree user and
Jack has removed this by http://lkml.kernel.org/r/20171122094416.26019-1-jack@xxxxxxx
which is already sitting in the mmotm tree. We can re-add it but that
would really require a _very good_ reason.

> Signed-off-by: Kemi Wang <kemi.wang@xxxxxxxxx>
> ---
> drivers/base/node.c | 17 ++++++++++-------
> mm/vmstat.c | 2 +-
> 2 files changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index a045ea1..cf303f8 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -169,12 +169,15 @@ static ssize_t node_read_numastat(struct device *dev,
> "interleave_hit %lu\n"
> "local_node %lu\n"
> "other_node %lu\n",
> - node_page_state(NODE_DATA(dev->id), NUMA_HIT),
> - node_page_state(NODE_DATA(dev->id), NUMA_MISS),
> - node_page_state(NODE_DATA(dev->id), NUMA_FOREIGN),
> - node_page_state(NODE_DATA(dev->id), NUMA_INTERLEAVE_HIT),
> - node_page_state(NODE_DATA(dev->id), NUMA_LOCAL),
> - node_page_state(NODE_DATA(dev->id), NUMA_OTHER));
> + node_page_state_snapshot(NODE_DATA(dev->id), NUMA_HIT),
> + node_page_state_snapshot(NODE_DATA(dev->id), NUMA_MISS),
> + node_page_state_snapshot(NODE_DATA(dev->id),
> + NUMA_FOREIGN),
> + node_page_state_snapshot(NODE_DATA(dev->id),
> + NUMA_INTERLEAVE_HIT),
> + node_page_state_snapshot(NODE_DATA(dev->id), NUMA_LOCAL),
> + node_page_state_snapshot(NODE_DATA(dev->id),
> + NUMA_OTHER));
> }
>
> static DEVICE_ATTR(numastat, S_IRUGO, node_read_numastat, NULL);
> @@ -194,7 +197,7 @@ static ssize_t node_read_vmstat(struct device *dev,
> for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++)
> n += sprintf(buf+n, "%s %lu\n",
> vmstat_text[i + NR_VM_ZONE_STAT_ITEMS],
> - node_page_state(pgdat, i));
> + node_page_state_snapshot(pgdat, i));
>
> return n;
> }
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 64e08ae..d65f28d 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1466,7 +1466,7 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
> for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
> seq_printf(m, "\n %-12s %lu",
> vmstat_text[i + NR_VM_ZONE_STAT_ITEMS],
> - node_page_state(pgdat, i));
> + node_page_state_snapshot(pgdat, i));
> }
> }
> seq_printf(m,
> --
> 2.7.4
>

--
Michal Hocko
SUSE Labs