Re: [PATCHv7 3/3] vhost_net: a kernel-level virtio server

From: Paul E. McKenney
Date: Tue Nov 03 2009 - 18:57:51 EST


On Tue, Nov 03, 2009 at 01:14:06PM -0500, Gregory Haskins wrote:
> Gregory Haskins wrote:
> > Eric Dumazet wrote:
> >> Michael S. Tsirkin wrote:
> >>> +static void handle_tx(struct vhost_net *net)
> >>> +{
> >>> + struct vhost_virtqueue *vq = &net->dev.vqs[VHOST_NET_VQ_TX];
> >>> + unsigned head, out, in, s;
> >>> + struct msghdr msg = {
> >>> + .msg_name = NULL,
> >>> + .msg_namelen = 0,
> >>> + .msg_control = NULL,
> >>> + .msg_controllen = 0,
> >>> + .msg_iov = vq->iov,
> >>> + .msg_flags = MSG_DONTWAIT,
> >>> + };
> >>> + size_t len, total_len = 0;
> >>> + int err, wmem;
> >>> + size_t hdr_size;
> >>> + struct socket *sock = rcu_dereference(vq->private_data);
> >>> + if (!sock)
> >>> + return;
> >>> +
> >>> + wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> >>> + if (wmem >= sock->sk->sk_sndbuf)
> >>> + return;
> >>> +
> >>> + use_mm(net->dev.mm);
> >>> + mutex_lock(&vq->mutex);
> >>> + vhost_no_notify(vq);
> >>> +
> >> using rcu_dereference() and mutex_lock() at the same time seems wrong, I suspect
> >> that your use of RCU is not correct.
> >>
> >> 1) rcu_dereference() should be done inside a rcu_read_lock() section, and
> >> we are not allowed to sleep in such a section.
> >> (Quoting Documentation/RCU/whatisRCU.txt :
> >> It is illegal to block while in an RCU read-side critical section, )
> >>
> >> 2) mutex_lock() can sleep (ie block)
> >>
> >
> >
> > Michael,
> > I warned you that this needed better documentation ;)
> >
> > Eric,
> > I think I flagged this once before, but Michael convinced me that it
> > was indeed "ok", if perhaps a bit unconventional. I will try to
> > find the thread.
> >
> > Kind Regards,
> > -Greg
> >
>
> Here it is:
>
> http://lkml.org/lkml/2009/8/12/173

What was happening in that case was that the rcu_dereference()
was being used in a workqueue item. The role of rcu_read_lock()
was taken on by the start of execution of the workqueue item, that of
rcu_read_unlock() by the end of execution of the workqueue item, and
that of synchronize_rcu() by flush_workqueue(). This does work, at
least assuming that flush_workqueue() operates as advertised, which it
appears to at first glance.
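
To make that mapping concrete, here is a rough userspace sketch of the
pattern. It uses pthreads rather than the kernel workqueue API, so it
is only an analogy: struct backend, queue_item(), flush_items() and
demo() are hypothetical names invented for this illustration, and none
of it is taken from the vhost patch. Build with -pthread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the object hung off vq->private_data. */
struct backend { int id; };

static _Atomic(struct backend *) private_data;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int queued, completed, stop;
static int seen_sum;	/* written only by the worker thread */

/* One work item.  Its start stands in for rcu_read_lock() and its
 * return for rcu_read_unlock(); it would be free to block in between. */
static void handle_work(void)
{
	struct backend *b = atomic_load(&private_data);	/* rcu_dereference() analog */

	if (!b)
		return;
	assert(b->id == 7);	/* sanity check: object must still be live */
	seen_sum += b->id;
}

static void *worker(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	for (;;) {
		while (completed == queued && !stop)
			pthread_cond_wait(&cond, &lock);
		if (completed == queued)
			break;		/* stop requested and nothing pending */
		pthread_mutex_unlock(&lock);
		handle_work();
		pthread_mutex_lock(&lock);
		completed++;
		pthread_cond_broadcast(&cond);
	}
	pthread_mutex_unlock(&lock);
	return NULL;
}

static void queue_item(void)
{
	pthread_mutex_lock(&lock);
	queued++;
	pthread_cond_broadcast(&cond);
	pthread_mutex_unlock(&lock);
}

/* flush_workqueue() analog: wait for every item queued before the call. */
static void flush_items(void)
{
	pthread_mutex_lock(&lock);
	int target = queued;
	while (completed < target)
		pthread_cond_wait(&cond, &lock);
	pthread_mutex_unlock(&lock);
}

int demo(void)
{
	pthread_t t;
	struct backend *b = malloc(sizeof(*b));

	if (!b)
		return -1;
	b->id = 7;
	atomic_store(&private_data, b);
	if (pthread_create(&t, NULL, worker, NULL)) {
		free(b);
		return -1;
	}
	queue_item();
	queue_item();

	atomic_store(&private_data, NULL);	/* unpublish the pointer */
	flush_items();				/* plays synchronize_rcu() */
	free(b);				/* no work item can still hold b */

	queue_item();				/* later items see NULL and return */
	flush_items();

	pthread_mutex_lock(&lock);
	stop = 1;
	pthread_cond_broadcast(&cond);
	pthread_mutex_unlock(&lock);
	pthread_join(t, NULL);
	return seen_sum;
}
```

The free() is safe only because every reader runs as a work item, so
the flush cannot return while one of them still holds the old pointer.
A reader running in any other context would not be covered by the
flush.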

The above code looks somewhat different, however -- I don't see
handle_tx() being executed in the context of a work queue. Instead
it appears to be in an interrupt handler.

So what is the story? Using synchronize_irq() or some such?

Thanx, Paul