Re: [RFC v1 0/6] virtio/vsock: introduce SOCK_DGRAM support

From: Jason Wang
Date: Thu Jun 10 2021 - 03:47:17 EST



On 2021/6/10 3:23 PM, Stefano Garzarella wrote:
On Thu, Jun 10, 2021 at 12:02:35PM +0800, Jason Wang wrote:

On 2021/6/10 11:43 AM, Jiang Wang wrote:
On Wed, Jun 9, 2021 at 6:51 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:

On 2021/6/10 7:24 AM, Jiang Wang wrote:
This patchset implements support of SOCK_DGRAM for virtio
transport.

Datagram sockets are connectionless and unreliable. To avoid unfair contention
with stream and other sockets, add two more virtqueues and
a new feature bit to indicate if those two new queues exist or not.
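A minimal sketch of how a feature bit could gate the two extra queues. The bit value and the dgram queue names are assumptions made for illustration only; the actual values are whatever the virtio spec patch assigns.

```python
# Hypothetical bit position, chosen only for illustration.
VIRTIO_VSOCK_F_DGRAM = 1 << 2

# virtio-vsock already has rx, tx and event queues.
BASE_QUEUES = ["rx", "tx", "event"]
# Assumed names for the two additional datagram queues.
DGRAM_QUEUES = ["dgram_rx", "dgram_tx"]


def negotiate(device_features, driver_features):
    """Return the feature bits that both sides offer."""
    return device_features & driver_features


def queue_layout(negotiated):
    """List the virtqueues to set up for the negotiated features."""
    queues = list(BASE_QUEUES)
    if negotiated & VIRTIO_VSOCK_F_DGRAM:
        queues += DGRAM_QUEUES
    return queues
```

If either side does not offer the bit, only the three existing queues are set up, so stream sockets keep working unchanged.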

Dgram does not use the existing credit update mechanism for
stream sockets. When sending from the guest/driver, packets are sent
synchronously, so the sender gets an error when the virtqueue is full.
When sending from the host/device, packets are sent asynchronously
because the descriptor memory belongs to the corresponding QEMU
process.
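The guest-side behaviour described above can be modelled with a toy bounded ring. This is only a sketch: `TxRing` is a hypothetical stand-in for a virtqueue, and the `ENOBUFS` error code is an assumption, not necessarily what the patches return.

```python
import errno


class TxRing:
    """Toy stand-in for a virtqueue with a fixed number of slots."""

    def __init__(self, size):
        self.size = size
        self.in_flight = []

    def send_dgram(self, payload):
        """Queue a datagram now, or fail at once if the ring is full.

        There is no credit accounting: the sender learns about
        pressure only through the error return.
        """
        if len(self.in_flight) >= self.size:
            return -errno.ENOBUFS  # assumed error code
        self.in_flight.append(payload)
        return len(payload)

    def complete_one(self):
        """Model the other side consuming one buffer, freeing a slot."""
        return self.in_flight.pop(0) if self.in_flight else None
```

A sender that gets the error can drop the datagram or retry later; either choice fits an unreliable, connectionless socket.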

What's the use case for the datagram vsock?

One use case is logging non-critical information from the guest
to the host, such as the performance data of some applications.
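As a rough sketch of that logging use case from a guest application's point of view: the port number and the JSON record format are hypothetical, and creating the socket only works where the kernel and transport support SOCK_DGRAM on AF_VSOCK.

```python
import json
import socket

LOG_PORT = 9999  # hypothetical port the host-side collector listens on


def open_dgram_vsock():
    """Create a datagram vsock socket (needs kernel/transport support)."""
    return socket.socket(socket.AF_VSOCK, socket.SOCK_DGRAM)


def send_metric(sock, name, value, cid=2, port=LOG_PORT):
    """Send one metric to the host; report failure instead of raising.

    CID 2 is VMADDR_CID_HOST. Losing a record is acceptable here
    because the data is non-critical.
    """
    payload = json.dumps({"metric": name, "value": value}).encode()
    try:
        sock.sendto(payload, (cid, port))
        return True
    except OSError:
        return False
```

A guest agent would call `send_metric(open_dgram_vsock(), "latency_ms", 12)` and carry on regardless of the result, with no IP addresses or network configuration involved.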


Anything that prevents you from using the stream socket?



It can also be used to replace UDP communications between
the guest and the host.


Any advantage for VSOCK in this case? Is it for performance? (I guess not, since I don't expect vsock to be faster.)

I think the general advantage of using vsock is for guest agents, which potentially don't need any configuration.


Right, I wonder if we really need datagrams, considering that host-to-guest communication is reliable.

(Note that I don't object to it, since vsock already supports that; I just wonder about its use cases.)




An obvious drawback is that it breaks migration. With UDP you get very rich feature support from the kernel, which vsock can't offer.


Thanks for bringing this up!
Which features does UDP support that datagram on vsock could not?


E.g. sendpage() and busy polling. And using UDP means qdiscs and eBPF can work.





The virtio spec patch is here:
https://www.spinics.net/lists/linux-virtualization/msg50027.html

From a quick glance, I suggest splitting the mergeable rx buffer support into a
separate patch.
Sure.

But I think it's time to revisit the idea of unifying the virtio-net and
virtio-vsock. Otherwise we're duplicating features and bugs.
For mergeable rxbuf related code, I think a set of common helper
functions can be used by both virtio-net and virtio-vsock. For other
parts, that may not be very beneficial. I will think about it more.

If there is a previous email discussion about this topic, could you send me
some links? I did a quick web search but did not find any related
info. Thanks.


We had a lot:

[1] https://patchwork.kernel.org/project/kvm/patch/5BDFF537.3050806@xxxxxxxxxx/
[2] https://lists.linuxfoundation.org/pipermail/virtualization/2018-November/039798.html
[3] https://www.lkml.org/lkml/2020/1/16/2043


When I tried it, the biggest problem that blocked me was all the features strictly related to the TCP/IP stack and ethernet devices that the vsock device doesn't know how to handle: TSO, GSO, checksums, MAC, NAPI, XDP, min ethernet frame size, MTU, etc.


It depends on which level we want to share:

1) sharing code
2) sharing devices
3) making vsock a protocol that is understood by the network core

We can start from 1): the low-level tx/rx logic can be shared with both virtio-net and vhost-net. For 2) we probably need some work on the spec, probably with a new feature bit to indicate that it's a vsock device, not an ethernet device. Then, if it is probed as a vsock device, we won't let packets be delivered to the TCP/IP stack. For 3), it would be even harder, and I'm not sure it's worth doing.



So in my opinion unifying them is not so simple, because vsock is not really an ethernet device, but simply a socket.


We can start from sharing code.



But I fully agree that we shouldn't duplicate functionality and code, so maybe we could find those common parts and create helpers to be used by both.


Yes.

Thanks



Thanks,
Stefano