Re: [PATCH] vdpa: reject F_ENABLE_AFTER_DRIVER_OK if backend does not support it

From: Si-Wei Liu
Date: Wed Jul 19 2023 - 20:03:56 EST




On 7/19/2023 3:26 PM, Michael S. Tsirkin wrote:
On Wed, Jul 19, 2023 at 03:20:03PM -0700, Si-Wei Liu wrote:

On 7/5/2023 11:07 PM, Michael S. Tsirkin wrote:
On Wed, Jul 05, 2023 at 05:07:11PM -0700, Shannon Nelson wrote:
On 7/5/23 11:27 AM, Eugenio Perez Martin wrote:
On Wed, Jul 5, 2023 at 9:50 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
With the current code it is accepted as long as userland send it.

Although userland should not set a feature flag that has not been
offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
complain for it.

Since there is no specific reason for any parent to reject that backend
feature bit when it has been proposed, let's control it at vdpa frontend
level. Future patches may move this control to the parent driver.

Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
Signed-off-by: Eugenio Pérez <eperezma@xxxxxxxxxx>
Please do send v3. And again, I don't want to send "after driver ok" hack
upstream at all, I merged it in next just to give it some testing.
We want RING_ACCESS_AFTER_KICK or some such.

Current devices do not support that semantic.
Which devices specifically access the ring after DRIVER_OK but before
a kick?
The PDS vdpa device can deal with a call to .set_vq_ready after DRIVER_OK is
set. And I'm told that our VQ activity should start without a kick.

Our vdpa device FW doesn't currently have support for VIRTIO_F_RING_RESET,
but I believe it could be added without too much trouble.

sln

OK it seems clear at least in the current version pds needs
VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK.
However can we also code up the RING_RESET path as the default?
What's the rationale of making RING_RESET path the default? Noted this is on
a performance critical path (for live migration downtime), did we ever get
consensus from every or most hardware vendors that RING_RESET has lower cost
in terms of latency overall than ENABLE_AFTER_DRIVER_OK? I think (RING)RESET
in general falls on the slow path for hardware, while I assume either
RING_RESET or ENABLE_AFTER_DRIVER_OK doesn't matters much on software backed
vdpa e.g. vp_vdpa. Maybe should make ENABLE_AFTER_DRIVER_OK as the default?

-Siwei
Coming from the spec RING_RESET has clearer semantics.
Spec doesn't have clearer semantics on vdpa specifics - such as how does RING_RESET interoperate with ASID?
As long as we support it it is not critical which one
is the default though.
The point is vdpa vendor drivers may implement RING_RESET for a different purpose than live migration. In case they support both I don't see a reason why it has to fallback to a slower path given there's a faster path. May we should leave this to vendor driver to decide, but I am not sure.

-Siwei



Then down the road vendors can choose what to do.





Previous versions of the QEMU LM series did a spurious kick to start
traffic at the LM destination [1]. When it was proposed, that spurious
kick was removed from the series because to check for descriptors
after driver_ok, even without a kick, was considered work of the
parent driver.

I'm ok to go back to this spurious kick, but I'm not sure if the hw
will read the ring before the kick actually. I can ask.

Thanks!

[1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html
Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?
My understanding is [1] assuming ACCESS_AFTER_KICK. This seems
sub-optimal than assuming ENABLE_AFTER_DRIVER_OK.

But this reminds me one thing, as the thread is going too long, I
wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is
supported?

The problem with that is that the device needs to support all
RING_RESET, like to be able to change vq address etc after DRIVER_OK.
Not all HW support it.

We just need the subset of having the dataplane freezed until all CVQ
commands have been consumed. I'm sure current vDPA code already
supports it in some devices, like MLX and PSD.

Thanks!

Thanks


My plan was to convert
it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
was not explicit enough.

The only solution I can see to that is to trap & emulate in the vdpa
(parent?) driver, as talked in virtio-comment. But that complicates
the architecture:
* Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
* Store vq enable state separately, at
vdpa->config->set_vq_ready(true), but not transmit that enable to hw
* Store the doorbell state separately, but do not configure it to the
device directly.

But how to recover if the device cannot configure them at kick time,
for example?

Maybe we can just fail if the parent driver does not support enabling
the vq after DRIVER_OK? That way no new feature flag is needed.

Thanks!

---
Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
commit. Please let me know if I should send a v3 of [1] instead.

[1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@xxxxxxxxxx/T/
---
drivers/vhost/vdpa.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index e1abf29fed5b..a7e554352351 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
{
struct vhost_vdpa *v = filep->private_data;
struct vhost_dev *d = &v->vdev;
+ const struct vdpa_config_ops *ops = v->vdpa->config;
void __user *argp = (void __user *)arg;
u64 __user *featurep = argp;
- u64 features;
+ u64 features, parent_features = 0;
long r = 0;

if (cmd == VHOST_SET_BACKEND_FEATURES) {
if (copy_from_user(&features, featurep, sizeof(features)))
return -EFAULT;
+ if (ops->get_backend_features)
+ parent_features = ops->get_backend_features(v->vdpa);
if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
BIT_ULL(VHOST_BACKEND_F_RESUME) |
- BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
+ parent_features))
return -EOPNOTSUPP;
if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
!vhost_vdpa_can_suspend(v))
--
2.39.3
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization