Re: [PATCH 1/2] vdpa/mlx5: Avoid unnecessary query virtqueue

From: Eli Cohen
Date: Tue Feb 02 2021 - 04:25:20 EST


On Tue, Feb 02, 2021 at 12:38:51AM -0800, Si-Wei Liu wrote:
> Thanks Eli and Jason for clarifications. See inline.
>
> On Mon, Feb 1, 2021 at 11:06 PM Eli Cohen <elic@xxxxxxxxxx> wrote:
> >
> > On Tue, Feb 02, 2021 at 02:02:25PM +0800, Jason Wang wrote:
> > >
> > > On 2021/2/2 下午12:15, Si-Wei Liu wrote:
> > > > On Mon, Feb 1, 2021 at 7:13 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> > > > >
> > > > > On 2021/2/2 上午3:17, Si-Wei Liu wrote:
> > > > > > On Mon, Feb 1, 2021 at 10:51 AM Si-Wei Liu <siwliu.kernel@xxxxxxxxx> wrote:
> > > > > > > On Thu, Jan 28, 2021 at 5:46 AM Eli Cohen <elic@xxxxxxxxxx> wrote:
> > > > > > > > suspend_vq should only suspend the VQ on not save the current available
> > > > > > > > index. This is done when a change of map occurs when the driver calls
> > > > > > > > save_channel_info().
> > > > > > > Hmmm, suspend_vq() is also called by teardown_vq(), the latter of
> > > > > > > which doesn't save the available index as save_channel_info() doesn't
> > > > > > > get called in that path at all. How does it handle the case that
> > > > > > > aget_vq_state() is called from userspace (e.g. QEMU) while the
> > > > > > > hardware VQ object was torn down, but userspace still wants to access
> > > > > > > the queue index?
> > > > > > >
> > > > > > > Refer to https://lore.kernel.org/netdev/1601583511-15138-1-git-send-email-si-wei.liu@xxxxxxxxxx/
> > > > > > >
> > > > > > > vhost VQ 0 ring restore failed: -1: Resource temporarily unavailable (11)
> > > > > > > vhost VQ 1 ring restore failed: -1: Resource temporarily unavailable (11)
> > > > > > >
> > > > > > > QEMU will complain with the above warning while VM is being rebooted
> > > > > > > or shut down.
> > > > > > >
> > > > > > > Looks to me either the kernel driver should cover this requirement, or
> > > > > > > the userspace has to bear the burden in saving the index and not call
> > > > > > > into kernel if VQ is destroyed.
> > > > > > Actually, the userspace doesn't have the insights whether virt queue
> > > > > > will be destroyed if just changing the device status via set_status().
> > > > > > Looking at other vdpa driver in tree i.e. ifcvf it doesn't behave like
> > > > > > so. Hence this still looks to me to be Mellanox specifics and
> > > > > > mlx5_vdpa implementation detail that shouldn't expose to userspace.
> > > > >
> > > > > So I think we can simply drop this patch?
> > > > Yep, I think so. To be honest I don't know why it has anything to do
> > > > with the memory hotplug issue.
> > >
> > >
> > > Eli may know more, my understanding is that, during memory hotplut, qemu
> > > need to propagate new memory mappings via set_map(). For mellanox, it means
> > > it needs to rebuild memory keys, so the virtqueue needs to be suspended.
> > >
> >
> > I think Siwei was asking why the first patch was related to the hotplug
> > issue.
>
> I was thinking how consistency is assured when saving/restoring this
> h/w avail_index against the one in the virtq memory, particularly in
> the region_add/.region_del() context (e.g. the hotplug case). Problem
> is I don't see explicit memory barrier when guest thread updates the
> avail_index, how does the device make sure the h/w avail_index is
> uptodate while guest may race with updating the virtq's avail_index in
> the mean while? Maybe I completely miss something obvious?

If you're asking about syncronization upon hot plug of memory, the
hardware always goes to read the available index from memory when a new
hardware object is associted with a virtqueue. You can argue then that
you don't need to restore the available index and you may be right but
this is the currect inteface to the firmware.


If you're asking on generally how sync is assured when the guest updates
the available index, can you please send a pointer to the code where you
see the update without a memory barrier?

>
> Thanks,
> -Siwei
>
> >
> > But you're correct. When memory is added, I get a new memory map. This
> > requires me to build a new memory key object which covers the new memory
> > map. Since the virtqueue objects are referencing this memory key, I need
> > to destroy them and build new ones referncing the new memory key.
> >
> > > Thanks
> > >
> > >
> > > >
> > > > -Siwei
> > > >
> > > > > Thanks
> > > > >
> > > > >
> > > > > > > -Siwei
> > > > > > >
> > > > > > >
> > > > > > > > Signed-off-by: Eli Cohen <elic@xxxxxxxxxx>
> > > > > > > > ---
> > > > > > > > drivers/vdpa/mlx5/net/mlx5_vnet.c | 8 --------
> > > > > > > > 1 file changed, 8 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > index 88dde3455bfd..549ded074ff3 100644
> > > > > > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > > > > > > @@ -1148,8 +1148,6 @@ static int setup_vq(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq)
> > > > > > > >
> > > > > > > > static void suspend_vq(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *mvq)
> > > > > > > > {
> > > > > > > > - struct mlx5_virtq_attr attr;
> > > > > > > > -
> > > > > > > > if (!mvq->initialized)
> > > > > > > > return;
> > > > > > > >
> > > > > > > > @@ -1158,12 +1156,6 @@ static void suspend_vq(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *m
> > > > > > > >
> > > > > > > > if (modify_virtqueue(ndev, mvq, MLX5_VIRTIO_NET_Q_OBJECT_STATE_SUSPEND))
> > > > > > > > mlx5_vdpa_warn(&ndev->mvdev, "modify to suspend failed\n");
> > > > > > > > -
> > > > > > > > - if (query_virtqueue(ndev, mvq, &attr)) {
> > > > > > > > - mlx5_vdpa_warn(&ndev->mvdev, "failed to query virtqueue\n");
> > > > > > > > - return;
> > > > > > > > - }
> > > > > > > > - mvq->avail_idx = attr.available_index;
> > > > > > > > }
> > > > > > > >
> > > > > > > > static void suspend_vqs(struct mlx5_vdpa_net *ndev)
> > > > > > > > --
> > > > > > > > 2.29.2
> > > > > > > >
> > >