[RFC PATCH v1 0/2] virtio/vsock: fix mutual rx/tx hungup

From: Arseniy Krasnov
Date: Sat Dec 17 2022 - 14:45:49 EST


Hello,

seems I found strange thing(may be a bug) where sender('tx' later) and
receiver('rx' later) could stuck forever. Potential fix is in the first
patch, second patch contains reproducer, based on vsock test suite.
Reproducer is simple: tx just sends data to rx by 'write() syscall, rx
dequeues it using 'read()' syscall and uses 'poll()' for waiting. I run
server in host and client in guest.

rx side params:
1) SO_VM_SOCKETS_BUFFER_SIZE is 256Kb(e.g. default).
2) SO_RCVLOWAT is 128Kb.

What happens in the reproducer step by step:

1) tx tries to send 256Kb + 1 byte (in a single 'write()')
2) tx sends 256Kb, data reaches rx (rx_bytes == 256Kb)
3) tx waits for space in 'write()' to send last 1 byte
4) rx does poll(), (rx_bytes >= rcvlowat) 256Kb >= 128Kb, POLLIN is set
5) rx reads 64Kb, credit update is not sent due to *
6) rx does poll(), (rx_bytes >= rcvlowat) 192Kb >= 128Kb, POLLIN is set
7) rx reads 64Kb, credit update is not sent due to *
8) rx does poll(), (rx_bytes >= rcvlowat) 128Kb >= 128Kb, POLLIN is set
9) rx reads 64Kb, credit update is not sent due to *
10) rx does poll(), (rx_bytes < rcvlowat) 64Kb < 128Kb, rx waits in poll()

* is optimization in 'virtio_transport_stream_do_dequeue()' which
sends OP_CREDIT_UPDATE only when we have not too much space -
less than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE.

Now tx side waits for space inside write() and rx waits in poll() for
'rx_bytes' to reach SO_RCVLOWAT value. Both sides will wait forever. I
think, possible fix is to send credit update not only when we have too
small space, but also when number of bytes in receive queue is smaller
than SO_RCVLOWAT thus not enough to wake up sleeping reader. I'm not
sure about correctness of this idea, but anyway - I think that problem
above exists. What do You think?

Patchset was rebased and tested on skbuff v7 patch from Bobby Eshleman:
https://lore.kernel.org/netdev/20221213192843.421032-1-bobby.eshleman@xxxxxxxxxxxxx/

Arseniy Krasnov(2):
virtio/vsock: send credit update depending on SO_RCVLOWAT
vsock_test: mutual hungup reproducer

net/vmw_vsock/virtio_transport_common.c | 9 +++-
tools/testing/vsock/vsock_test.c | 78 +++++++++++++++++++++++++++++++++
2 files changed, 85 insertions(+), 2 deletions(-)
--
2.25.1