Re: [PATCH] net/mlx5e: fix high stack usage

From: Saeed Mahameed
Date: Fri Nov 02 2018 - 20:52:32 EST


On Fri, 2018-11-02 at 23:15 +0100, Arnd Bergmann wrote:
> On 11/2/18, Saeed Mahameed <saeedm@xxxxxxxxxxxx> wrote:
> > On Fri, 2018-11-02 at 14:39 -0700, Eric Dumazet wrote:
> > >
> > > On 11/02/2018 02:05 PM, Saeed Mahameed wrote:
> > >
> > > > temp will be mem copied to priv->stats.sw at the end,
> > > > memcpy(&priv->stats.sw, &s, sizeof(s));
> > > >
> > > > one other way to solve this as suggested by Andrew, is to get rid of
> > > > the temp var and make it point directly to priv->stats.sw
> > > >
> > >
> > > What about concurrency ?
> > >
> > > This temp variable is there to make sure concurrent readers of stats
> > > might not see mangled data (because another 'reader' just did a
> > > memset() and is doing the folding)
> > >
> > >
> > > mlx5e_get_stats() can definitely be run at the same time by multiple
> > > threads.
> > >
> >
> > hmm, you are right, I was thinking that mlx5e_get_stats will trigger a
> > work to update stats and grab the state_lock, but for sw stats this is
> > not the case, it is done in place.
> >
> > BTW memcpy itself is not thread safe.
>
> Before commit 6c63efe4cfab ("net/mlx5e: Remove redundant active_channels
> indication"), there was a read_lock() in the function apparently intended
> to make it thread safe. This got removed with the comment
>
> commit 6c63efe4cfabf230a8ed4b1d880249875ffdac13
> Author: Eran Ben Elisha <eranbe@xxxxxxxxxxxx>
> Date: Tue May 29 11:06:31 2018 +0300
>
> net/mlx5e: Remove redundant active_channels indication
>
> Now, when all channels stats are saved regardless of the channel's state
> {open, closed}, we can safely remove this indication and the stats spin
> lock which protects it.
>
> Fixes: 76c3810bade3 ("net/mlx5e: Avoid reset netdev stats on configuration changes")
>
> I don't really understand the reasoning, but maybe we can remove
> the memcpy() if the code is thread safe, or we need the lock back if
> it's not.
>

This lock was needed for a whole different purpose. It wasn't meant to
synchronize between two reader threads, it was meant to synchronize
between driver restarts and the reader's for loop, which ran over the
open channels while they could be going through a destruction process.

I think all we need is to maintain two priv->stats.sw copies and use
them as temp for each reader thread, since we can only have two
concurrent readers (mlx5e_ethtool_get_ethtool_stats and ndo_get_stats64).
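
A rough userspace sketch of the idea, not driver code. All names here
(struct sw_stats, struct priv, fold_sw_stats, the reader enum and
NUM_CHANNELS) are made up for illustration. It assumes each of the two
paths is serialized against itself, so every reader path folds into its
own scratch copy and never sees the other path's memset():

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_CHANNELS 4          /* hypothetical channel count */

/* Hypothetical, trimmed-down stand-in for the sw stats struct. */
struct sw_stats {
	uint64_t rx_packets;
	uint64_t tx_packets;
};

/* One scratch slot per reader path (assumption: only these two exist). */
enum stats_reader {
	READER_ETHTOOL,
	READER_NDO,
	NUM_READERS
};

struct priv {
	struct sw_stats channel[NUM_CHANNELS]; /* per-channel counters */
	struct sw_stats scratch[NUM_READERS];  /* per-reader folded copy */
};

/*
 * Fold the per-channel counters into the caller's own scratch copy.
 * No large on-stack temp and no shared buffer between the two paths.
 * This does NOT protect two threads running in the same path.
 */
static const struct sw_stats *fold_sw_stats(struct priv *p,
					    enum stats_reader who)
{
	struct sw_stats *s;
	int i;

	assert((unsigned int)who < NUM_READERS);
	s = &p->scratch[who];

	memset(s, 0, sizeof(*s));
	for (i = 0; i < NUM_CHANNELS; i++) {
		s->rx_packets += p->channel[i].rx_packets;
		s->tx_packets += p->channel[i].tx_packets;
	}
	return s;
}
```

If ndo_get_stats64 can itself be entered by several threads at once,
this is not enough on its own and that path would still need its own
protection.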

> Arnd