Re: [PATCH] net: stmmac: protect statistics updates with a spinlock

From: Petr Tesařík
Date: Fri Jan 05 2024 - 08:27:52 EST


Hi Eric,

yeah, it's me again...

On Fri, 5 Jan 2024 12:14:47 +0100
Petr Tesařík <petr@xxxxxxxxxxx> wrote:

> On Fri, 5 Jan 2024 11:48:19 +0100
> Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
>
> > On Fri, Jan 5, 2024 at 11:34 AM Petr Tesařík <petr@xxxxxxxxxxx> wrote:
> > >
> > > On Fri, 5 Jan 2024 10:58:42 +0100
> > > Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> > >
> > > > On Fri, Jan 5, 2024 at 10:16 AM Petr Tesarik <petr@xxxxxxxxxxx> wrote:
> > > > >
> > > > > Add a spinlock to fix race conditions while updating Tx/Rx statistics.
> > > > >
> > > > > As explained by a comment in <linux/u64_stats_sync.h>, write side of struct
> > > > > u64_stats_sync must ensure mutual exclusion, or one seqcount update could
> > > > > be lost on 32-bit platforms, thus blocking readers forever.
> > > > >
> > > > > Such lockups have been actually observed on 32-bit Arm after stmmac_xmit()
> > > > > on one core raced with stmmac_napi_poll_tx() on another core.
> > > > >
> > > > > Signed-off-by: Petr Tesarik <petr@xxxxxxxxxxx>
> > > >
> > > > This is going to add more costs to 64bit platforms ?
> > >
> > > Yes, it adds a (hopefully not too contended) spinlock and in most
> > > places an interrupt disable/enable pair.
> > >
> > > FWIW the race condition is also present on 64-bit platforms, resulting
> > > in inaccurate statistic counters. I can understand if you consider it a
> > > mild annoyance, not worth fixing.
> > >
> > > > It seems to me that the same syncp can be used from two different
> > > > threads : hard irq and napi poller...
> > >
> > > Yes, that's exactly the scenario that locks up my system.
> > >
> > > > At this point, I do not see why you keep linux/u64_stats_sync.h if you
> > > > decide to go for a spinlock...
> > >
> > > The spinlock does not havce to be taken on the reader side, so the
> > > seqcounter still adds some value.
> > >
> > > > Alternative would use atomic64_t fields for the ones where there is no
> > > > mutual exclusion.
> > > >
> > > > RX : napi poll is definitely safe (protected by an atomic bit)
> > > > TX : each TX queue is also safe (protected by an atomic exclusion for
> > > > non LLTX drivers)
> > > >
> > > > This leaves the fields updated from hardware interrupt context ?
> > >
> > > I'm afraid I don't have enough network-stack-foo to follow here.
> > >
> > > My issue on 32 bit is that stmmac_xmit() may be called directly from
> > > process context while another core runs the TX napi on the same channel
> > > (in interrupt context). I didn't observe any race on the RX path, but I
> > > believe it's possible with NAPI busy polling.
> > >
> > > In any case, I don't see the connection with LLTX. Maybe you want to
> > > say that the TX queue is safe for stmmac (because it is a non-LLTX
> > > driver), but might not be safe for LLTX drivers?
> >
> > LLTX drivers (mostly virtual drivers like tunnels...) can have multiple cpus
> > running ndo_start_xmit() concurrently. So any use of a 'shared syncp'
> > would be a bug.
> > These drivers usually use per-cpu stats, to avoid races and false
> > sharing anyway.
> >
> > I think you should split the structures into two separate groups, each
> > guarded with its own syncp.
> >
> > No extra spinlocks, no extra costs on 64bit arches...
> >
> > If TX completion can run in parallel with ndo_start_xmit(), then
> > clearly we have to split stmmac_txq_stats in two halves:
>
> Oh, now I get it. Yes, that's much better, indeed.
>
> I mean, the counters have never been consistent (due to the race on the
> writer side), and nobody is concerned. So, there is no value in taking
> a consistent snapshot in stmmac_get_ethtool_stats().
>
> I'm going to rework and retest my patch. Thank you for pointing me in
> the right direction!
>
> Petr T
>
> > Also please note the conversion from u64 to u64_stats_t
>
> Noted. IIUC this will in turn close the update race on 64-bit by using
> an atomic type and on 32-bit by using a seqlock. Clever.
>
> Petr T
>
> > Very partial patch, only to show the split and new structure :
> >
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h
> > b/drivers/net/ethernet/stmicro/stmmac/common.h
> > index e3f650e88f82f927f0dcf95748fbd10c14c30cbe..702bceea5dc8c875a80f5e3a92b7bb058f373eda
> > 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/common.h
> > +++ b/drivers/net/ethernet/stmicro/stmmac/common.h
> > @@ -60,16 +60,22 @@
> > /* #define FRAME_FILTER_DEBUG */
> >
> > struct stmmac_txq_stats {
> > - u64 tx_bytes;
> > - u64 tx_packets;
> > - u64 tx_pkt_n;
> > - u64 tx_normal_irq_n;
> > - u64 napi_poll;
> > - u64 tx_clean;
> > - u64 tx_set_ic_bit;
> > - u64 tx_tso_frames;
> > - u64 tx_tso_nfrags;
> > - struct u64_stats_sync syncp;
> > +/* First part, updated from ndo_start_xmit(), protected by tx queue lock */
> > + struct u64_stats_sync syncp_tx;
> > + u64_stats_t tx_bytes;
> > + u64_stats_t tx_packets;
> > + u64_stats_t tx_pkt_n;
> > + u64_stats_t tx_tso_frames;
> > + u64_stats_t tx_tso_nfrags;
> > +
> > +/* Second part, updated from TX completion (protected by NAPI poll logic) */
> > + struct u64_stats_sync syncp_tx_completion;
> > + u64_stats_t napi_poll;
> > + u64_stats_t tx_clean;
> > + u64_stats_t tx_set_ic_bit;

Unfortunately, this field is also updated from ndo_start_xmit():

4572) if (set_ic)
4573) txq_stats->tx_set_ic_bit++;

I feel it would be a shame to introduce a spinlock just for this one
update. But I think the field could be converted to an atomic64_t.

Which raises a question: Why aren't all stat counters simply atomic64_t? There
is no guarantee that the reader side takes a consistent snapshot
(except on 32-bit). So, why do we even bother with u64_stats_sync?

Is it merely because u64_stats_add() should be cheaper than
atomic64_add()? Or is there anything else I'm missing? If yes, does it
invalidate my proposal to convert tx_set_ic_bit to an atomic64_t?

Petr T