Re: [PATCH net 03/13] mptcp: fix lockless access in subflow ULP diag

From: Eric Dumazet
Date: Mon Feb 19 2024 - 12:35:57 EST


On Mon, Feb 19, 2024 at 6:21 PM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
>
> On Thu, Feb 15, 2024 at 7:25 PM Matthieu Baerts (NGI0)
> <matttbe@xxxxxxxxxx> wrote:
> >
> > From: Paolo Abeni <pabeni@xxxxxxxxxx>
> >
> > Since the introduction of the subflow ULP diag interface, the
> > dump callback has accessed all the subflow data locklessly.
> >
> > We need either to annotate all the read and write operations accordingly,
> > or to acquire the subflow socket lock. Let's do the latter, even if slower,
> > to avoid a diffstat havoc.
> >
> > Fixes: 5147dfb50832 ("mptcp: allow dumping subflow context to userspace")
> > Cc: stable@xxxxxxxxxxxxxxx
> > Signed-off-by: Paolo Abeni <pabeni@xxxxxxxxxx>
> > Reviewed-by: Mat Martineau <martineau@xxxxxxxxxx>
> > Signed-off-by: Matthieu Baerts (NGI0) <matttbe@xxxxxxxxxx>
> > ---
> > Notes:
> > - This patch modifies the existing ULP API. No better solutions have
> > been found for -net, and there is some similar prior art, see
> > commit 0df48c26d841 ("tcp: add tcpi_bytes_acked to tcp_info").
> >
> > Please also note that TLS ULP Diag has likely the same issue.
> > To: Boris Pismenny <borisp@xxxxxxxxxx>
> > To: John Fastabend <john.fastabend@xxxxxxxxx>
> > ---
> > include/net/tcp.h | 2 +-
> > net/mptcp/diag.c | 6 +++++-
> > net/tls/tls_main.c | 2 +-
> > 3 files changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/net/tcp.h b/include/net/tcp.h
> > index dd78a1181031..f6eba9652d01 100644
> > --- a/include/net/tcp.h
> > +++ b/include/net/tcp.h
> > @@ -2506,7 +2506,7 @@ struct tcp_ulp_ops {
> > /* cleanup ulp */
> > void (*release)(struct sock *sk);
> > /* diagnostic */
> > - int (*get_info)(const struct sock *sk, struct sk_buff *skb);
> > + int (*get_info)(struct sock *sk, struct sk_buff *skb);
> > size_t (*get_info_size)(const struct sock *sk);
> > /* clone ulp */
> > void (*clone)(const struct request_sock *req, struct sock *newsk,
> > diff --git a/net/mptcp/diag.c b/net/mptcp/diag.c
> > index a536586742f2..e57c5f47f035 100644
> > --- a/net/mptcp/diag.c
> > +++ b/net/mptcp/diag.c
> > @@ -13,17 +13,19 @@
> > #include <uapi/linux/mptcp.h>
> > #include "protocol.h"
> >
> > -static int subflow_get_info(const struct sock *sk, struct sk_buff *skb)
> > +static int subflow_get_info(struct sock *sk, struct sk_buff *skb)
> > {
> > struct mptcp_subflow_context *sf;
> > struct nlattr *start;
> > u32 flags = 0;
> > + bool slow;
> > int err;
> >
> > start = nla_nest_start_noflag(skb, INET_ULP_INFO_MPTCP);
> > if (!start)
> > return -EMSGSIZE;
> >
> > + slow = lock_sock_fast(sk);
> > rcu_read_lock();
>
> I am afraid lockdep is not happy with this change.
>
> Paolo, we probably need the READ_ONCE() annotations after all.

Or perhaps something like the following would be enough.

diff --git a/net/mptcp/diag.c b/net/mptcp/diag.c
index 6ff6f14674aa2941bc04c680bacd9f79fc65060d..7017dd60659dc7133318c1c82e3f429bea3a5d57 100644
--- a/net/mptcp/diag.c
+++ b/net/mptcp/diag.c
@@ -21,6 +21,9 @@ static int subflow_get_info(struct sock *sk, struct sk_buff *skb)
 	bool slow;
 	int err;
 
+	if (inet_sk_state_load(sk) == TCP_LISTEN)
+		return 0;
+
 	start = nla_nest_start_noflag(skb, INET_ULP_INFO_MPTCP);
 	if (!start)
 		return -EMSGSIZE;
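
For readers who want to see the control flow of the proposed check outside the kernel, here is a minimal userspace sketch. Everything prefixed with mock_ or MOCK_ is a hypothetical stand-in, not kernel API: the acquire load only approximates what inet_sk_state_load() does, the byte counter stands in for the netlink nest header, and the lock counter stands in for the lock_sock_fast() section. It only illustrates that a listening socket returns before anything is written to the buffer and before the lock is taken.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Values chosen to mirror the kernel's TCP_ESTABLISHED and TCP_LISTEN. */
enum { MOCK_TCP_ESTABLISHED = 1, MOCK_TCP_LISTEN = 10 };

/* Hypothetical stand-in for struct sock. */
struct mock_sock {
	_Atomic int state;       /* socket state, read without the lock */
	unsigned int lock_count; /* how many times the "lock" was taken */
};

/* Hypothetical stand-in for the sk_buff being filled by the dump. */
struct mock_skb {
	size_t len; /* bytes already used */
	size_t cap; /* total capacity */
};

#define MOCK_NEST_HDR 4 /* assumed size of the nest attribute header */

/* Lockless state read with acquire ordering (an approximation only). */
static int mock_sk_state_load(struct mock_sock *sk)
{
	return atomic_load_explicit(&sk->state, memory_order_acquire);
}

static int mock_subflow_get_info(struct mock_sock *sk, struct mock_skb *skb)
{
	assert(sk && skb);

	/* Early exit: listeners dump nothing and never reach the lock. */
	if (mock_sk_state_load(sk) == MOCK_TCP_LISTEN)
		return 0;

	/* Models nla_nest_start_noflag() failing when the buffer is full. */
	if (skb->cap - skb->len < MOCK_NEST_HDR)
		return -EMSGSIZE;
	skb->len += MOCK_NEST_HDR;

	/* Stands in for the lock_sock_fast()/unlock_sock_fast() section. */
	sk->lock_count++;
	return 0;
}
```

Because the check sits before the nest is opened, the listener case leaves the buffer untouched rather than emitting an empty attribute.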