Re: [PATCH net 03/13] mptcp: fix lockless access in subflow ULP diag

From: Paolo Abeni
Date: Mon Feb 19 2024 - 13:04:44 EST


On Mon, 2024-02-19 at 18:35 +0100, Eric Dumazet wrote:
> On Mon, Feb 19, 2024 at 6:21 PM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> >
> > On Thu, Feb 15, 2024 at 7:25 PM Matthieu Baerts (NGI0)
> > <matttbe@xxxxxxxxxx> wrote:
> > >
> > > From: Paolo Abeni <pabeni@xxxxxxxxxx>
> > >
> > > Since the introduction of the subflow ULP diag interface, the
> > > dump callback has accessed all the subflow data locklessly.
> > >
> > > We need either to annotate all the read and write operations accordingly,
> > > or to acquire the subflow socket lock. Let's do the latter, even if slower,
> > > to avoid diffstat havoc.
> > >
> > > Fixes: 5147dfb50832 ("mptcp: allow dumping subflow context to userspace")
> > > Cc: stable@xxxxxxxxxxxxxxx
> > > Signed-off-by: Paolo Abeni <pabeni@xxxxxxxxxx>
> > > Reviewed-by: Mat Martineau <martineau@xxxxxxxxxx>
> > > Signed-off-by: Matthieu Baerts (NGI0) <matttbe@xxxxxxxxxx>
> > > ---
> > > Notes:
> > > - This patch modifies the existing ULP API. No better solutions have
> > > been found for -net, and there is some similar prior art, see
> > > commit 0df48c26d841 ("tcp: add tcpi_bytes_acked to tcp_info").
> > >
> > > Please also note that the TLS ULP diag likely has the same issue.
> > > To: Boris Pismenny <borisp@xxxxxxxxxx>
> > > To: John Fastabend <john.fastabend@xxxxxxxxx>
> > > ---
> > > include/net/tcp.h | 2 +-
> > > net/mptcp/diag.c | 6 +++++-
> > > net/tls/tls_main.c | 2 +-
> > > 3 files changed, 7 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/include/net/tcp.h b/include/net/tcp.h
> > > index dd78a1181031..f6eba9652d01 100644
> > > --- a/include/net/tcp.h
> > > +++ b/include/net/tcp.h
> > > @@ -2506,7 +2506,7 @@ struct tcp_ulp_ops {
> > >  	/* cleanup ulp */
> > >  	void (*release)(struct sock *sk);
> > >  	/* diagnostic */
> > > -	int (*get_info)(const struct sock *sk, struct sk_buff *skb);
> > > +	int (*get_info)(struct sock *sk, struct sk_buff *skb);
> > >  	size_t (*get_info_size)(const struct sock *sk);
> > >  	/* clone ulp */
> > >  	void (*clone)(const struct request_sock *req, struct sock *newsk,
> > > diff --git a/net/mptcp/diag.c b/net/mptcp/diag.c
> > > index a536586742f2..e57c5f47f035 100644
> > > --- a/net/mptcp/diag.c
> > > +++ b/net/mptcp/diag.c
> > > @@ -13,17 +13,19 @@
> > >  #include <uapi/linux/mptcp.h>
> > >  #include "protocol.h"
> > >
> > > -static int subflow_get_info(const struct sock *sk, struct sk_buff *skb)
> > > +static int subflow_get_info(struct sock *sk, struct sk_buff *skb)
> > >  {
> > >  	struct mptcp_subflow_context *sf;
> > >  	struct nlattr *start;
> > >  	u32 flags = 0;
> > > +	bool slow;
> > >  	int err;
> > >
> > >  	start = nla_nest_start_noflag(skb, INET_ULP_INFO_MPTCP);
> > >  	if (!start)
> > >  		return -EMSGSIZE;
> > >
> > > +	slow = lock_sock_fast(sk);
> > >  	rcu_read_lock();
> >
> > I am afraid lockdep is not happy with this change.
> >
> > Paolo, we probably need the READ_ONCE() annotations after all.
>
> Or perhaps something like the following would be enough.
>
> diff --git a/net/mptcp/diag.c b/net/mptcp/diag.c
> index 6ff6f14674aa2941bc04c680bacd9f79fc65060d..7017dd60659dc7133318c1c82e3f429bea3a5d57 100644
> --- a/net/mptcp/diag.c
> +++ b/net/mptcp/diag.c
> @@ -21,6 +21,9 @@ static int subflow_get_info(struct sock *sk, struct sk_buff *skb)
>  	bool slow;
>  	int err;
>
> +	if (inet_sk_state_load(sk) == TCP_LISTEN)
> +		return 0;
> +
>  	start = nla_nest_start_noflag(skb, INET_ULP_INFO_MPTCP);
>  	if (!start)
>  		return -EMSGSIZE;

Thanks for the heads-up. This latter option looks preferable, as it avoids
quite a bit of noise from _ONCE annotations. Is there a syzkaller splat I
could look at? If it landed on the ML, I missed it.
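
For illustration only, here is a minimal userspace sketch of the early-return
idea. It is not kernel code, and it assumes the locking problem is confined to
listener sockets, as the proposed check implies. Every model_* name is
hypothetical. The lockless state read stands in for inet_sk_state_load(), and
the lock counter stands in for lock_sock_fast().

```c
/* Userspace model only: none of these types or helpers are kernel APIs. */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Values mirror TCP_ESTABLISHED (1) and TCP_LISTEN (10). */
enum model_state { MODEL_ESTABLISHED = 1, MODEL_LISTEN = 10 };

struct model_sock {
	_Atomic int state;	/* stands in for sk->sk_state */
	int lock_count;		/* number of times the socket lock was taken */
	unsigned int flags;	/* stands in for subflow data read under the lock */
};

/* Lockless state read, modelled on an acquire load of the socket state. */
static int model_state_load(struct model_sock *sk)
{
	return atomic_load_explicit(&sk->state, memory_order_acquire);
}

/* Stand-ins for taking and releasing the socket lock. */
static void model_lock(struct model_sock *sk)
{
	sk->lock_count++;
}

static void model_unlock(struct model_sock *sk)
{
	(void)sk;
}

/*
 * Dump callback model: listeners are skipped before any lock is taken,
 * so the locked path only ever runs for non-listener sockets.
 * Returns 0 in both cases; *out is written only for non-listeners.
 */
static int model_get_info(struct model_sock *sk, unsigned int *out)
{
	assert(sk != NULL && out != NULL);

	if (model_state_load(sk) == MODEL_LISTEN)
		return 0;

	model_lock(sk);
	*out = sk->flags;
	model_unlock(sk);
	return 0;
}
```

In this model, calling model_get_info() on a socket in MODEL_LISTEN leaves
lock_count at zero and the output untouched, while any other state takes the
lock exactly once before copying the flags.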

Thanks!

Paolo