Re: [PATCH v5] virtio/vsock: replace virtio_vsock_pkt with sk_buff

From: Paolo Abeni
Date: Tue Dec 06 2022 - 05:21:27 EST


Hello,

On Fri, 2022-12-02 at 09:35 -0800, Bobby Eshleman wrote:
[...]
> diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
> index 35d7eedb5e8e..6c0b2d4da3fe 100644
> --- a/include/linux/virtio_vsock.h
> +++ b/include/linux/virtio_vsock.h
> @@ -3,10 +3,129 @@
> #define _LINUX_VIRTIO_VSOCK_H
>
> #include <uapi/linux/virtio_vsock.h>
> +#include <linux/bits.h>
> #include <linux/socket.h>
> #include <net/sock.h>
> #include <net/af_vsock.h>
>
> +#define VIRTIO_VSOCK_SKB_HEADROOM (sizeof(struct virtio_vsock_hdr))
> +
> +enum virtio_vsock_skb_flags {
> + VIRTIO_VSOCK_SKB_FLAGS_REPLY = BIT(0),
> + VIRTIO_VSOCK_SKB_FLAGS_TAP_DELIVERED = BIT(1),
> +};
> +
> +static inline struct virtio_vsock_hdr *virtio_vsock_hdr(struct sk_buff *skb)
> +{
> + return (struct virtio_vsock_hdr *)skb->head;
> +}
> +
> +static inline bool virtio_vsock_skb_reply(struct sk_buff *skb)
> +{
> + return skb->_skb_refdst & VIRTIO_VSOCK_SKB_FLAGS_REPLY;
> +}

I'm sorry for the late feedback. The above is extremelly risky: if the
skb will land later into the networking stack, we could experience the
most difficult to track bugs.

You should use the skb control buffer instead (skb->cb), with the
additional benefit you could use e.g. bool - the compiler could emit
better code to manipulate such fields - and you will not need to clear
the field before release nor enqueue.

[...]

> @@ -352,37 +360,38 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
> size_t len)
> {
> struct virtio_vsock_sock *vvs = vsk->trans;
> - struct virtio_vsock_pkt *pkt;
> size_t bytes, total = 0;
> - u32 free_space;
> + struct sk_buff *skb;
> int err = -EFAULT;
> + u32 free_space;
>
> spin_lock_bh(&vvs->rx_lock);
> - while (total < len && !list_empty(&vvs->rx_queue)) {
> - pkt = list_first_entry(&vvs->rx_queue,
> - struct virtio_vsock_pkt, list);
> + while (total < len && !skb_queue_empty_lockless(&vvs->rx_queue)) {
> + skb = __skb_dequeue(&vvs->rx_queue);

Here the locking schema is confusing. It looks like vvs->rx_queue is
under vvs->rx_lock protection, so the above should be skb_queue_empty()
instead of the lockless variant.

[...]

> @@ -858,16 +873,11 @@ static int virtio_transport_reset_no_sock(const struct virtio_transport *t,
> static void virtio_transport_remove_sock(struct vsock_sock *vsk)
> {
> struct virtio_vsock_sock *vvs = vsk->trans;
> - struct virtio_vsock_pkt *pkt, *tmp;
>
> /* We don't need to take rx_lock, as the socket is closing and we are
> * removing it.
> */
> - list_for_each_entry_safe(pkt, tmp, &vvs->rx_queue, list) {
> - list_del(&pkt->list);
> - virtio_transport_free_pkt(pkt);
> - }
> -
> + virtio_vsock_skb_queue_purge(&vvs->rx_queue);

Still assuming rx_queue is under the rx_lock, given you don't need the
locking here as per the above comment, you should use the lockless
purge variant.

Thanks!

Paolo