Re: [RFC 0/2] vduse: add support for networking devices

From: Jason Wang
Date: Sun Apr 23 2023 - 23:44:35 EST


On Sun, Apr 23, 2023 at 4:22 PM Yongji Xie <xieyongji@xxxxxxxxxxxxx> wrote:
>
> On Sun, Apr 23, 2023 at 2:31 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >
> > On Fri, Apr 21, 2023 at 10:28 PM Maxime Coquelin
> > <maxime.coquelin@xxxxxxxxxx> wrote:
> > >
> > >
> > >
> > > On 4/21/23 07:51, Jason Wang wrote:
> > > > On Thu, Apr 20, 2023 at 10:16 PM Maxime Coquelin
> > > > <maxime.coquelin@xxxxxxxxxx> wrote:
> > > >>
> > > >>
> > > >>
> > > >> On 4/20/23 06:34, Jason Wang wrote:
> > > >>> On Wed, Apr 19, 2023 at 9:43 PM Maxime Coquelin
> > > >>> <maxime.coquelin@xxxxxxxxxx> wrote:
> > > >>>>
> > > >>>> This small series enables virtio-net device type in VDUSE.
> > > >>>> With it, basic operation have been tested, both with
> > > >>>> virtio-vdpa and vhost-vdpa using DPDK Vhost library series
> > > >>>> adding VDUSE support [0] using split rings layout.
> > > >>>>
> > > >>>> Control queue support (and so multiqueue) has also been
> > > >>>> tested, but require a Kernel series from Jason Wang
> > > >>>> relaxing control queue polling [1] to function reliably.
> > > >>>>
> > > >>>> Other than that, we have identified a few gaps:
> > > >>>>
> > > >>>> 1. Reconnection:
> > > >>>> a. VDUSE_VQ_GET_INFO ioctl() returns always 0 for avail
> > > >>>> index, even after the virtqueue has already been
> > > >>>> processed. Is that expected? I have tried instead to
> > > >>>> get the driver's avail index directly from the avail
> > > >>>> ring, but it does not seem reliable as I sometimes get
> > > >>>> "id %u is not a head!\n" warnings. Also such solution
> > > >>>> would not be possible with packed ring, as we need to
> > > >>>> know the wrap counters values.
> > > >>>
> > > >>> Looking at the codes, it only returns the value that is set via
> > > >>> set_vq_state(). I think it is expected to be called before the
> > > >>> datapath runs.
> > > >>>
> > > >>> So when bound to virtio-vdpa, it is expected to return 0. But we need
> > > >>> to fix the packed virtqueue case, I wonder if we need to call
> > > >>> set_vq_state() explicitly in virtio-vdpa before starting the device.
> > > >>>
> > > >>> When bound to vhost-vdpa, Qemu will call VHOST_SET_VRING_BASE which
> > > >>> will end up a call to set_vq_state(). Unfortunately, it doesn't
> > > >>> support packed ring which needs some extension.
> > > >>>
> > > >>>>
> > > >>>> b. Missing IOCTLs: it would be handy to have new IOCTLs to
> > > >>>> query Virtio device status,
> > > >>>
> > > >>> What's the use case of this ioctl? It looks to me userspace is
> > > >>> notified on each status change now:
> > > >>>
> > > >>> static int vduse_dev_set_status(struct vduse_dev *dev, u8 status)
> > > >>> {
> > > >>> struct vduse_dev_msg msg = { 0 };
> > > >>>
> > > >>> msg.req.type = VDUSE_SET_STATUS;
> > > >>> msg.req.s.status = status;
> > > >>>
> > > >>> return vduse_dev_msg_sync(dev, &msg);
> > > >>> }
> > > >>
> > > >> The idea was to be able to query the status at reconnect time, and
> > > >> neither having to assume its value nor having to store its value in a
> > > >> file (the status could change while the VDUSE application is stopped,
> > > >> but maybe it would receive the notification at reconnect).
> > > >
> > > > I see.
> > > >
> > > >>
> > > >> I will prototype using a tmpfs file to save needed information, and see
> > > >> if it works.
> > > >
> > > > It might work but then the API is not self contained. Maybe it's
> > > > better to have a dedicated ioctl.
> > > >
> > > >>
> > > >>>> and retrieve the config
> > > >>>> space set at VDUSE_CREATE_DEV time.
> > > >>>
> > > >>> In order to be safe, VDUSE avoids writable config space. Otherwise
> > > >>> drivers could block on config writing forever. That's why we don't do
> > > >>> it now.
> > > >>
> > > >> The idea was not to make the config space writable, but just to be able
> > > >> to fetch what was filled at VDUSE_CREATE_DEV time.
> > > >>
> > > >> With the tmpfs file, we can avoid doing that and just save the config
> > > >> space there.
> > > >
> > > > Same as the case for status.
> > >
> > > I have cooked a DPDK patch to support reconnect with a tmpfs file as
> > > suggested by Yongji:
> > >
> > > https://gitlab.com/mcoquelin/dpdk-next-virtio/-/commit/53913f2b1155b02c44d5d3d298aafd357e7a8c48
> >
> > This seems tricky, for example for status:
> >
> > dev->log->status = dev->status;
> >
> > What if we crash here?
> >
>
> The message will be re-sent by the kernel if it's not replied. But I
> think it would be better if we can restore it via some ioctl.

Yes, the point is, without a get ioctl, it's very hard to audit the code.

Thanks

>
> Thanks,
> Yongji
>