Re: [PATCH V5 1/4] virtio_ring: validate used buffer length

From: Stefano Garzarella
Date: Mon Nov 22 2021 - 02:42:54 EST


On Mon, Nov 22, 2021 at 11:51:09AM +0800, Jason Wang wrote:
On Fri, Nov 19, 2021 at 11:10 PM Halil Pasic <pasic@xxxxxxxxxxxxx> wrote:

On Wed, 27 Oct 2021 10:21:04 +0800
Jason Wang <jasowang@xxxxxxxxxx> wrote:

> This patch validates the used buffer length provided by the device
> before trying to use it. This is done by recording the in buffer length
> in a new field in the desc_state structure during virtqueue_add(), so
> that virtqueue_get_buf() can fail when the device tries to give us a
> used buffer length which is greater than the in buffer length.
>
> Since some drivers have already done the validation by themselves,
> this patch tries to make the core validation optional. A driver
> that doesn't want the validation can set
> suppress_used_validation to true (which can be overridden by the
> force_used_validation module parameter). To be more efficient, a
> dedicated array is used for storing the validated used length; this
> helps to eliminate the cache stress if validation is done by the
> driver.
>
> Signed-off-by: Jason Wang <jasowang@xxxxxxxxxx>

Hi Jason!

Our CI has detected that virtio-vsock became unusable with this
patch on s390x. I didn't test on x86 yet. The guest kernel says
something like:
vmw_vsock_virtio_transport virtio1: tx: used len 44 is larger than in buflen 0
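
For readers following along, the check that leads to a message of this shape can be modelled roughly as below. This is only a userspace sketch of the idea in the patch description, not the kernel code: struct model_vq, model_add(), model_get() and QUEUE_SIZE are made-up names, and the real implementation lives in virtqueue_add() and virtqueue_get_buf().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define QUEUE_SIZE 8 /* hypothetical ring size for this model */

/* Userspace model only; not the kernel's virtqueue structure. */
struct model_vq {
    /* Per descriptor head: total length of the device-writable ("in")
     * buffers, recorded when the buffer is added. */
    uint32_t buflen[QUEUE_SIZE];
    bool validate; /* false models a driver that suppresses the check */
};

/* Model of the add path: remember how many bytes the device may write. */
static void model_add(struct model_vq *vq, unsigned int head, uint32_t in_len)
{
    assert(vq != NULL && head < QUEUE_SIZE);
    vq->buflen[head] = in_len;
}

/* Model of the get path: returns true if the used length is acceptable. */
static bool model_get(const struct model_vq *vq, unsigned int head,
                      uint32_t used_len)
{
    assert(vq != NULL && head < QUEUE_SIZE);
    if (vq->validate && used_len > vq->buflen[head]) {
        fprintf(stderr, "used len %u is larger than in buflen %u\n",
                (unsigned int)used_len, (unsigned int)vq->buflen[head]);
        return false;
    }
    return true;
}
```

In this model a TX buffer is added with in_len 0, because it has no device-writable part, so any non-zero used length reported for it is rejected. That matches the "in buflen 0" in the message above.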

Did you, or anybody else, see something like this on platforms other than
s390x?

Adding Stefan and Stefano.

I think it should be a common issue, looking at

Yep, I confirm the same behaviour on x86_64. On Friday evening I had the same failure while testing latest QEMU and Linux kernel.

vhost_vsock_handle_tx_kick(), it did:

len += sizeof(pkt->hdr);
vhost_add_used(vq, head, len);

which looks like a violation of the spec since it's TX.
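
To make the mismatch concrete, here is a small sketch contrasting the two ways of computing the used length. It assumes that len holds the payload length before the header size is added, which the quoted lines do not show, and that the used length is meant to count bytes the device wrote into device-writable buffers. struct model_chain and both helpers are hypothetical names for illustration, not vhost API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one descriptor chain, for illustration. */
struct model_chain {
    size_t out_len; /* bytes in device-readable buffers (driver to device) */
    size_t in_len;  /* capacity of device-writable buffers (device to driver) */
};

/* What the quoted lines effectively report: bytes consumed from the
 * chain (header plus payload), even though nothing was written. */
static uint32_t used_len_as_quoted(size_t hdr_len, size_t payload_len)
{
    return (uint32_t)(hdr_len + payload_len);
}

/* Alternative under the assumption above: report only the bytes the
 * device wrote, which can never exceed the device-writable capacity. */
static uint32_t used_len_written(const struct model_chain *c, size_t written)
{
    assert(c != NULL && written <= c->in_len);
    return (uint32_t)written;
}
```

For a TX chain with a 44-byte header and no payload, the first helper yields 44 while the second yields 0, since a TX chain has in_len 0. The 44 is consistent with the "used len 44" in the log above.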


I had a quick look at this code, and I speculate that it probably
uncovers a pre-existing bug, rather than introducing a new one.

I agree.


If somebody is already working on this please reach out to me.


My plan was to debug and test it today, so let me know if you need some help.

AFAIK, no. I think the plan is to fix both the device and driver side
(but I'm not sure we need a new feature for this if we stick to the
validation).


Yes, maybe we need a new feature, since I believe this problem has been there since the beginning.

Thanks,
Stefano