Re: [PATCH bpf-next] xsk: proper socket state check in xsk_poll

From: Björn Töpel
Date: Tue Aug 20 2019 - 11:29:26 EST


On Tue, 20 Aug 2019 at 16:30, Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>
> On 8/20/19 12:04 PM, Björn Töpel wrote:
> > From: Björn Töpel <bjorn.topel@xxxxxxxxx>
> >
> > The poll() implementation for AF_XDP sockets did not perform the
> > proper state checks prior to accessing the socket umem. This patch fixes
> > that by performing an xsk_is_bound() check.
> >
> > Suggested-by: Hillf Danton <hdanton@xxxxxxxx>
> > Reported-by: syzbot+c82697e3043781e08802@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Fixes: 77cd0d7b3f25 ("xsk: add support for need_wakeup flag in AF_XDP rings")
> > Signed-off-by: Björn Töpel <bjorn.topel@xxxxxxxxx>
> > ---
> > net/xdp/xsk.c | 14 ++++++++++++--
> > 1 file changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> > index ee4428a892fa..08bed5e92af4 100644
> > --- a/net/xdp/xsk.c
> > +++ b/net/xdp/xsk.c
> > @@ -356,13 +356,20 @@ static int xsk_generic_xmit(struct sock *sk, struct msghdr *m,
> > 	return err;
> > }
> >
> > +static bool xsk_is_bound(struct xdp_sock *xs)
> > +{
> > +	struct net_device *dev = READ_ONCE(xs->dev);
> > +
> > +	return dev && xs->state == XSK_BOUND;
> > +}
> > +
> > static int xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
> > {
> > 	bool need_wait = !(m->msg_flags & MSG_DONTWAIT);
> > 	struct sock *sk = sock->sk;
> > 	struct xdp_sock *xs = xdp_sk(sk);
> >
> > -	if (unlikely(!xs->dev))
> > +	if (unlikely(!xsk_is_bound(xs)))
> > 		return -ENXIO;
> > 	if (unlikely(!(xs->dev->flags & IFF_UP)))
> > 		return -ENETDOWN;
> > @@ -383,6 +390,9 @@ static unsigned int xsk_poll(struct file *file, struct socket *sock,
> > 	struct net_device *dev = xs->dev;
> > 	struct xdp_umem *umem = xs->umem;
> >
> > +	if (unlikely(!xsk_is_bound(xs)))
> > +		return mask;
> > +
> > 	if (umem->need_wakeup)
> > 		dev->netdev_ops->ndo_xsk_wakeup(dev, xs->queue_id,
> > 						umem->need_wakeup);
> > @@ -417,7 +427,7 @@ static void xsk_unbind_dev(struct xdp_sock *xs)
> > {
> > 	struct net_device *dev = xs->dev;
> >
> > -	if (!dev || xs->state != XSK_BOUND)
> > +	if (!xsk_is_bound(xs))
> > 		return;
>
> I think I'm a bit confused by your READ_ONCE() usage. ;-/ I can see why you're
> using it in xsk_is_bound() above, but then at the same time all the other callbacks
> like xsk_poll() or xsk_unbind_dev() above have a struct net_device *dev = xs->dev
> right before the test. Could you elaborate?
>

Yes, now I'm confused as well! Digging deeper... I believe there are a
couple of places in xsk.c that are not READ_ONCE()/WRITE_ONCE()-correct.
Various xdp_sock members are read locklessly, outside the control plane
mutex (the mutex member of struct xdp_sock). This needs some rework. I'll
look into using the newly introduced state member (with corresponding
read/write barriers) for this.
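
As a rough illustration of that idea, here is a userspace sketch using
C11 atomics rather than the kernel primitives. It is not the xsk.c code
and not a proposed patch; every name ending in _sketch (and the SK_*
constants) is made up for this example. The writer fills in the plain
members first and then publishes the state with release semantics; a
lockless reader loads the state with acquire semantics and only touches
the other members once it has seen the bound state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the socket states; not the kernel enum. */
enum { SK_READY = 0, SK_BOUND = 1, SK_UNBOUND = 2 };

/* Hypothetical stand-in for struct net_device. */
struct dev_sketch {
	int ifindex;
};

/* Hypothetical stand-in for struct xdp_sock. */
struct sock_sketch {
	struct dev_sketch *dev;	/* plain member, published via state */
	_Atomic int state;	/* the only member readers test locklessly */
};

/* Control path: set up the members, then publish the state last. */
static void bind_sketch(struct sock_sketch *xs, struct dev_sketch *dev)
{
	assert(dev != NULL);
	xs->dev = dev;
	/* Release: the store to dev is visible before state reads SK_BOUND. */
	atomic_store_explicit(&xs->state, SK_BOUND, memory_order_release);
}

/* Lockless reader side: a single acquire load of the state. */
static bool is_bound_sketch(struct sock_sketch *xs)
{
	return atomic_load_explicit(&xs->state, memory_order_acquire) ==
	       SK_BOUND;
}

/* Data path: dereference dev only after the state check has passed. */
static int poll_sketch(struct sock_sketch *xs)
{
	if (!is_bound_sketch(xs))
		return -1;
	return xs->dev->ifindex;
}
```

The sketch covers only the publish side. It says nothing about unbind,
where the lifetime of the device pointer would also have to be handled.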

I'll cook up some patch(es) that address this, but first it sounds like I
need to reread [1] two or three times. At least. ;-)


Thanks,
Björn


[1] https://lwn.net/Articles/793253/


> Thanks,
> Daniel