Re: [PATCH] net: stmmac: protect statistics updates with a spinlock

From: Petr Tesařík
Date: Fri Jan 05 2024 - 05:36:12 EST


On Fri, 5 Jan 2024 10:58:42 +0100
Eric Dumazet <edumazet@xxxxxxxxxx> wrote:

> On Fri, Jan 5, 2024 at 10:16 AM Petr Tesarik <petr@xxxxxxxxxxx> wrote:
> >
> > Add a spinlock to fix race conditions while updating Tx/Rx statistics.
> >
> > As explained by a comment in <linux/u64_stats_sync.h>, write side of struct
> > u64_stats_sync must ensure mutual exclusion, or one seqcount update could
> > be lost on 32-bit platforms, thus blocking readers forever.
> >
> > Such lockups have been actually observed on 32-bit Arm after stmmac_xmit()
> > on one core raced with stmmac_napi_poll_tx() on another core.
> >
> > Signed-off-by: Petr Tesarik <petr@xxxxxxxxxxx>
>
> This is going to add more costs to 64bit platforms ?

Yes, it adds a (hopefully not too contended) spinlock and in most
places an interrupt disable/enable pair.

FWIW the race condition is also present on 64-bit platforms, resulting
in inaccurate statistic counters. I can understand if you consider it a
mild annoyance, not worth fixing.

> It seems to me that the same syncp can be used from two different
> threads : hard irq and napi poller...

Yes, that's exactly the scenario that locks up my system.
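
For illustration only, here is a minimal userspace model of how one
seqcount increment gets lost. It is not code from the patch: the
model_syncp type and both helper names are made up, and the
interleaving of the two writers is written out by hand instead of
using real threads, so the outcome is deterministic.

```c
#include <stdio.h>

/* Userspace model of the sequence counter only; not kernel code. */
struct model_syncp {
	unsigned int seq;	/* even: stable, odd: update in progress */
};

/* Returns the sequence value left behind by two racing writers. */
static unsigned int racing_writers_final_seq(void)
{
	struct model_syncp s = { .seq = 0 };
	unsigned int a, b;

	/* Both writers start "update begin" and load seq before
	 * either of them stores the incremented value. */
	a = s.seq;		/* writer A reads 0 */
	b = s.seq;		/* writer B reads 0 */
	s.seq = a + 1;		/* A stores 1 */
	s.seq = b + 1;		/* B stores 1: A's increment is lost */

	/* Both writers then run "update end" without overlapping. */
	s.seq++;		/* 2 */
	s.seq++;		/* 3 */

	return s.seq;
}

/* A reader may only use its snapshot when the sequence is even. */
static int reader_can_proceed(unsigned int seq)
{
	return (seq & 1) == 0;
}
```

In this model the counter ends up at 3. It stays odd although no
update is in progress, so a reader that waits for an even value never
gets past its retry loop.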

> At this point, I do not see why you keep linux/u64_stats_sync.h if you
> decide to go for a spinlock...

The spinlock does not have to be taken on the reader side, so the
seqcounter still adds some value.
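
To show what I mean, here is a rough userspace sketch of that split,
assuming a 64-bit counter stored as two 32-bit halves as a stand-in
for a 32-bit platform. Everything named model_* is hypothetical, the
lock is a plain C11 atomic_flag rather than a kernel spinlock, and
all accesses use the default sequentially consistent ordering to keep
the model simple. Writers serialize on the lock; the reader never
touches it and only retries on the sequence counter.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace model; not the stmmac or u64_stats code. */
struct model_stats {
	atomic_flag lock;		/* taken by writers only */
	atomic_uint seq;		/* odd while an update is in progress */
	_Atomic uint32_t lo, hi;	/* 64-bit counter in two halves */
};

static void model_stats_init(struct model_stats *s)
{
	atomic_flag_clear(&s->lock);
	atomic_init(&s->seq, 0);
	atomic_init(&s->lo, 0);
	atomic_init(&s->hi, 0);
}

/* Writer: the lock guarantees that begin/end pairs never interleave. */
static void model_stats_add(struct model_stats *s, uint64_t n)
{
	uint64_t v;

	while (atomic_flag_test_and_set(&s->lock))
		;			/* spin until the lock is free */

	atomic_fetch_add(&s->seq, 1);	/* begin: seq becomes odd */
	v = ((uint64_t)atomic_load(&s->hi) << 32) | atomic_load(&s->lo);
	v += n;
	atomic_store(&s->lo, (uint32_t)v);
	atomic_store(&s->hi, (uint32_t)(v >> 32));
	atomic_fetch_add(&s->seq, 1);	/* end: seq becomes even */

	atomic_flag_clear(&s->lock);
}

/* Reader: no lock, retry if a writer was active or intervened. */
static uint64_t model_stats_read(struct model_stats *s)
{
	unsigned int start;
	uint32_t lo, hi;

	do {
		start = atomic_load(&s->seq);
		lo = atomic_load(&s->lo);
		hi = atomic_load(&s->hi);
	} while ((start & 1) || atomic_load(&s->seq) != start);

	return ((uint64_t)hi << 32) | lo;
}
```

The reader cost stays at a few loads plus an occasional retry, which
is the part of the seqcounter scheme I would like to keep.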

> Alternative would use atomic64_t fields for the ones where there is no
> mutual exclusion.
>
> RX : napi poll is definitely safe (protected by an atomic bit)
> TX : each TX queue is also safe (protected by an atomic exclusion for
> non LLTX drivers)
>
> This leaves the fields updated from hardware interrupt context ?

I'm afraid I don't have enough network-stack-fu to follow here.

My issue on 32-bit is that stmmac_xmit() may be called directly from
process context while another core runs the TX NAPI on the same channel
(in interrupt context). I didn't observe any race on the RX path, but I
believe it's possible with NAPI busy polling.

In any case, I don't see the connection with LLTX. Maybe you want to
say that the TX queue is safe for stmmac (because it is a non-LLTX
driver), but might not be safe for LLTX drivers?

Petr T