Re: [PATCH 2/2] mm/vmstat: Protect per cpu variables with preempt disable on RT

From: Thomas Gleixner
Date: Tue Aug 03 2021 - 19:54:52 EST


Mel!

On Fri, Jul 23 2021 at 11:00, Mel Gorman wrote:
> From: Ingo Molnar <mingo@xxxxxxx>
>
> Disable preemption on -RT for the vmstat code. On vanilla the code runs
> in IRQ-off regions while on -RT it may not when stats are updated under
> a local_lock. "preempt_disable" ensures that the same resource is not
> updated in parallel due to preemption.
>
> This patch differs from the preempt-rt version where __count_vm_event and
> __count_vm_events are also protected. The counters are explicitly "allowed
> to be racy" so there is no need to protect them from preemption. Only
> the accurate page stats that are updated by a read-modify-write need
> protection.
>
> Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> ---
> mm/vmstat.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index b0534e068166..d06332c221b1 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -319,6 +319,7 @@ void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
> long x;
> long t;
>
> + preempt_disable_rt();

Yes, this is smart to some extent. But in reality it's a bandaid simply
because nobody can tell which item of vmstat requires which protection.

If you go back in RT history then you will figure out that we were able
to eliminate _all_ occurrences of preempt_disable_rt() except for this
one.

Even mm developers are wary about this:

<tglx> so in vmstat.c there is this magic comment:
<tglx> * For use when we know that interrupts are disabled
<tglx> * or when we know that preemption is disabled and that
<tglx> * particular counter cannot be updated from interrupt context.
<tglx> how can I know which counters need what?
<mm_expert> I don't think there's a list, one would have to check on counter to counter basis :/
<tglx> and of course there is nothing which validates that, right?
<mm_expert> exactly

Brilliant stuff which prevents you from doing any validation on this. Over
the years there have been several issues where callers had to be fixed
by analysing bug reports instead of having a proper instrumentation in
that code which would have told the developer that he got it wrong.
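
As a rough illustration of what such instrumentation could look like,
here is a userspace model, not kernel code. Every name in it
(stat_ctx_req, exec_state, stat_item, stat_requirement, stat_ctx_ok,
stat_mod) is made up for this sketch, and the assignment of
requirements to counters is an assumption. The point is only that each
counter carries a machine-checkable requirement instead of a comment,
and the update path asserts it:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: the context a counter's non-atomic update requires. */
enum stat_ctx_req {
	STAT_NEEDS_IRQS_OFF,	/* also updated from interrupt context */
	STAT_NEEDS_PREEMPT_OFF,	/* only updated from task context */
};

/* Hypothetical stand-in for the execution state of the current CPU. */
struct exec_state {
	bool irqs_disabled;
	bool preempt_disabled;
};

/* Hypothetical counter IDs, unrelated to the real zone_stat_item. */
enum stat_item { STAT_A, STAT_B, NR_STAT_ITEMS };

/* One explicit requirement per counter, replacing the magic comment. */
static const enum stat_ctx_req stat_requirement[NR_STAT_ITEMS] = {
	[STAT_A] = STAT_NEEDS_IRQS_OFF,
	[STAT_B] = STAT_NEEDS_PREEMPT_OFF,
};

/* Return true if the caller's context satisfies the counter's needs. */
static bool stat_ctx_ok(enum stat_item item, const struct exec_state *s)
{
	switch (stat_requirement[item]) {
	case STAT_NEEDS_IRQS_OFF:
		return s->irqs_disabled;
	case STAT_NEEDS_PREEMPT_OFF:
		/* In this model, disabled interrupts also rule out preemption. */
		return s->irqs_disabled || s->preempt_disabled;
	}
	return false;
}

/* Read-modify-write update which complains when the caller got it wrong. */
static void stat_mod(long *counter, enum stat_item item, long delta,
		     const struct exec_state *s)
{
	assert(stat_ctx_ok(item, s));
	*counter += delta;
}
```

In the kernel the assert() would presumably be a lockdep-style
assertion, so a wrong caller shows up at development time instead of in
a bug report.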

Of course on RT kernels the preempt_disable_rt() will serialize
everything correctly, but as we have learned over the years just
slapping _if_rt() or if_not_rt() variants of things around is most of
the time papering over the underlying problem of badly defined
protection scopes. Let's not proliferate that. As I said in the above
IRC conversation:

<tglx> I fundamentally hate this preempt_disable_rt() muck

Thanks,

tglx