Re: [PATCH v12 2/6] mm/vmstat: Use vmstat_dirty to track CPU-specific vmstat discrepancies

From: Marcelo Tosatti
Date: Wed Jan 04 2023 - 07:37:03 EST


On Fri, Dec 30, 2022 at 02:21:32PM +0100, Frederic Weisbecker wrote:
> On Tue, Dec 27, 2022 at 09:11:39AM -0300, Marcelo Tosatti wrote:
> > @@ -606,6 +608,7 @@ static inline void mod_zone_state(struct
> >
> >  	if (z)
> >  		zone_page_state_add(z, zone, item);
> > +	vmstat_mark_dirty();
> >  }
> >
> > void mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
> > @@ -674,6 +677,7 @@ static inline void mod_node_state(struct
> >
> >  	if (z)
> >  		node_page_state_add(z, pgdat, item);
> > +	vmstat_mark_dirty();
>
> Looking at this further, about the two above chunks, there is a risk to
> mark the wrong CPU dirty because those functions are preemptible and rely
> on this_cpu_cmpxchg() to deal with preemption.
>
> Thanks.

Hi Frederic,

Yes, good catch: if the task is preempted and migrated to another CPU
after this_cpu_cmpxchg() but before vmstat_mark_dirty(), the original
CPU ends up with its per-CPU vm counters dirty and its per-CPU vmstat
dirty bit unset.

This could cause a CPU to remain with the per-CPU vm counters dirty
for longer than sysctl_stat_interval.
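
To make the window concrete, here is a small userspace model of the race,
not kernel code. The arrays, cur_cpu and the migrate_to parameter are
hypothetical stand-ins for per-CPU data and for the scheduler moving the
task; the plain "+=" stands in for the this_cpu_cmpxchg() loop.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

/* Hypothetical userspace stand-ins for per-CPU state. */
static int  vm_stat_diff[NR_CPUS];	/* per-CPU counter deltas */
static bool vmstat_dirty[NR_CPUS];	/* per-CPU dirty bit */
static int  cur_cpu;			/* CPU the task currently runs on */

/*
 * Models the patched update path with no preemption protection.
 * migrate_to >= 0 simulates the task being preempted and moved to
 * that CPU between the counter update and the dirty marking.
 */
static void mod_state_racy(int delta, int migrate_to)
{
	assert(migrate_to < NR_CPUS);

	/* Stands in for the this_cpu_cmpxchg() loop. */
	vm_stat_diff[cur_cpu] += delta;

	/* Preemption + migration window. */
	if (migrate_to >= 0)
		cur_cpu = migrate_to;

	/* Stands in for vmstat_mark_dirty(): marks whatever CPU we are on now. */
	vmstat_dirty[cur_cpu] = true;
}
```

Starting on CPU 0 and calling mod_state_racy(1, 1) leaves vm_stat_diff[0]
non-zero with vmstat_dirty[0] still false, while CPU 1 is marked dirty
with nothing to flush.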

Should move vmstat_mark_dirty() above "if (z)", then do
preempt_disable() on function entry and preempt_enable()
after vmstat_mark_dirty(). Luckily preempt_disable()/preempt_enable()
are much cheaper than local_irq_disable()/local_irq_enable().
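
A rough userspace sketch of that ordering, only to illustrate the idea
and not the actual next version of the patch. Everything here is a
hypothetical model: preempt_count, maybe_migrate(), THRESHOLD and
global_stat are simplified stand-ins, and the threshold fold is much
simpler than the real overstep logic in mm/vmstat.c.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS   2
#define THRESHOLD 4	/* assumed per-CPU fold threshold for the model */

/* Hypothetical userspace stand-ins for per-CPU and global state. */
static int  vm_stat_diff[NR_CPUS];
static bool vmstat_dirty[NR_CPUS];
static int  global_stat;
static int  cur_cpu;
static int  preempt_count;

static void preempt_disable(void) { preempt_count++; }
static void preempt_enable(void)  { preempt_count--; }

/* Simulated migration: only takes effect while preemption is enabled. */
static void maybe_migrate(int to)
{
	if (to >= 0 && preempt_count == 0)
		cur_cpu = to;
}

static void vmstat_mark_dirty(void)
{
	/* In this model the caller must have preemption disabled. */
	assert(preempt_count > 0);
	vmstat_dirty[cur_cpu] = true;
}

static void mod_state_fixed(int delta, int migrate_to)
{
	int n, z = 0;

	preempt_disable();		/* on function entry */

	/* Stands in for the this_cpu_cmpxchg() loop. */
	n = vm_stat_diff[cur_cpu] + delta;
	if (n > THRESHOLD || n < -THRESHOLD) {
		z = n;			/* amount to fold into the global counter */
		n = 0;
	}
	vm_stat_diff[cur_cpu] = n;

	maybe_migrate(migrate_to);	/* no effect: preemption is disabled */

	vmstat_mark_dirty();		/* now above "if (z)", same CPU as the update */
	preempt_enable();

	if (z)
		global_stat += z;
}
```

With this ordering, mod_state_fixed(1, 1) started on CPU 0 updates and
marks CPU 0, because the simulated migration cannot happen inside the
preempt-disabled section.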