Re: [PATCH v3] vhost: cache avail index in vhost_enable_notify()

From: Stefano Garzarella
Date: Wed Feb 02 2022 - 08:53:10 EST


On Wed, Feb 02, 2022 at 06:24:05AM -0500, Michael S. Tsirkin wrote:
On Wed, Feb 02, 2022 at 11:14:30AM +0000, Stefan Hajnoczi wrote:
On Fri, Jan 28, 2022 at 10:41:29AM +0100, Stefano Garzarella wrote:
> In vhost_enable_notify() we enable the notifications and we read
> the avail index to check if new buffers have become available in
> the meantime.
>
> We do not update the cached avail index value, so when the device
> will call vhost_get_vq_desc(), it will find the old value in the
> cache and it will read the avail index again.
>
> It would be better to refresh the cache every time we read avail
> index, so let's change vhost_enable_notify() caching the value in
> `avail_idx` and compare it with `last_avail_idx` to check if there
> are new buffers available.
>
> We don't expect a significant performance boost because
> the above path is not very common, indeed vhost_enable_notify()
> is often called with unlikely(), expecting that avail index has
> not been updated.
>
> We ran virtio-test/vhost-test and noticed minimal improvement as
> expected. To stress the patch more, we modified vhost_test.ko to
> call vhost_enable_notify()/vhost_disable_notify() on every cycle
> when calling vhost_get_vq_desc(); in this case we observed a more
> evident improvement, with a reduction of the test execution time
> of about 3.7%.
>
> Signed-off-by: Stefano Garzarella <sgarzare@xxxxxxxxxx>
> ---
> v3
> - reworded commit description [Stefan]
> ---
> drivers/vhost/vhost.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 59edb5a1ffe2..07363dff559e 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -2543,8 +2543,9 @@ bool vhost_enable_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
> &vq->avail->idx, r);
> return false;
> }
> + vq->avail_idx = vhost16_to_cpu(vq, avail_idx);
>
> - return vhost16_to_cpu(vq, avail_idx) != vq->avail_idx;
> + return vq->avail_idx != vq->last_avail_idx;
> }
> EXPORT_SYMBOL_GPL(vhost_enable_notify);

This changes behavior (fixes a bug?): previously the function returned
false when called with avail buffers still pending (vq->last_avail_idx <
vq->avail_idx). Now it returns true because we compare against
vq->last_avail_idx and I think that's reasonable.

Good catch!


Reviewed-by: Stefan Hajnoczi <stefanha@xxxxxxxxxx>

I don't see the behaviour change... could you explain the
scenario in more detail pls?

IIUC the behavior is different only when the device calls vhost_enable_notify() with pending buffers (vq->avail_idx != vq->last_avail_idx).

Let's suppose that the driver has not added new available buffers, so the value in the cache (vq->avail_idx) is equal to the one we read back from the guest, but the device has not yet consumed all the available buffers (vq->avail_idx != vq->last_avail_idx).

Now if the device calls vhost_enable_notify(), before this patch it returned false, because no new buffers had been added (even though some were still pending). With this patch it returns true, because there are still some pending buffers (vq->avail_idx != vq->last_avail_idx).
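
To make the scenario concrete, here is a minimal user-space sketch of the two return conditions. It is only a model, not the kernel code: `struct vq_model`, `guest_avail_idx`, `enable_notify_old()` and `enable_notify_new()` are hypothetical names introduced for illustration, and the guest memory read is replaced by a plain field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the three indices involved (not the kernel struct). */
struct vq_model {
	uint16_t guest_avail_idx; /* value the driver published in avail->idx */
	uint16_t avail_idx;       /* device-side cache of avail->idx */
	uint16_t last_avail_idx;  /* next available entry the device will consume */
};

/* Before the patch: compare the fresh read against the cached value. */
static bool enable_notify_old(const struct vq_model *vq)
{
	return vq->guest_avail_idx != vq->avail_idx;
}

/* With the patch: refresh the cache, then compare against last_avail_idx. */
static bool enable_notify_new(struct vq_model *vq)
{
	vq->avail_idx = vq->guest_avail_idx;
	return vq->avail_idx != vq->last_avail_idx;
}
```

With `guest_avail_idx == avail_idx` and `last_avail_idx` behind them (pending buffers, nothing new from the driver), the old condition evaluates to false and the new one to true. When all three indices are equal, both evaluate to false.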

IIUC the right behavior should be the one with the patch applied.
However, this difference would be visible only if we called vhost_enable_notify() when vq->avail_idx != vq->last_avail_idx. Checking vhost-net, vhost-scsi and vhost-vsock, we use the return value of vhost_enable_notify() only when there are no available buffers, so vq->avail_idx == vq->last_avail_idx.
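
A simplified model of that calling pattern, as I understand it, is sketched below. It is not copied from any of the drivers: `struct queue_model`, `model_get_desc()`, `model_enable_notify()` and `model_drain()` are hypothetical helpers, and the guest index is assumed not to change while the loop runs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the indices (not the kernel struct). */
struct queue_model {
	uint16_t guest_avail_idx; /* value the driver published */
	uint16_t avail_idx;       /* device-side cache */
	uint16_t last_avail_idx;  /* next entry to consume */
};

/* Stand-in for fetching a descriptor: re-read the index only when the
 * cache says the queue is empty, then consume one entry if any. */
static bool model_get_desc(struct queue_model *q)
{
	if (q->avail_idx == q->last_avail_idx) {
		q->avail_idx = q->guest_avail_idx;
		if (q->avail_idx == q->last_avail_idx)
			return false; /* nothing available */
	}
	q->last_avail_idx++;
	return true;
}

/* Old and new return conditions, selected by use_new. */
static bool model_enable_notify(struct queue_model *q, bool use_new)
{
	if (!use_new)
		return q->guest_avail_idx != q->avail_idx;
	q->avail_idx = q->guest_avail_idx;
	return q->avail_idx != q->last_avail_idx;
}

/* Consume buffers until empty; enable-notify is consulted only when the
 * queue is empty, i.e. when avail_idx == last_avail_idx. */
static unsigned int model_drain(struct queue_model *q, bool use_new)
{
	unsigned int consumed = 0;

	for (;;) {
		if (model_get_desc(q)) {
			consumed++;
			continue;
		}
		if (!model_enable_notify(q, use_new))
			break; /* really empty: wait for a kick */
	}
	return consumed;
}
```

In this model the enable-notify check is reached only with avail_idx == last_avail_idx, so the old and new conditions give the same answer and the loop consumes the same number of buffers either way.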

So I think Stefan is right, but we should never experience the buggy scenario.

It seems that we used to check vq->last_avail_idx, but that changed with commit 8dd014adfea6 ("vhost-net: mergeable buffers support"). Honestly, I can't tell whether the change was intended or not.

Do you see any reason?

Thanks,
Stefano