Re: [RFC PATCH 1/5] nvme-pci: add function nvme_submit_vf_cmd to issue admin commands for VF driver.

From: Keith Busch
Date: Tue Dec 06 2022 - 08:51:47 EST


On Tue, Dec 06, 2022 at 09:44:08AM -0400, Jason Gunthorpe wrote:
> On Tue, Dec 06, 2022 at 07:19:40AM +0100, Christoph Hellwig wrote:
> > On Tue, Dec 06, 2022 at 01:58:12PM +0800, Lei Rao wrote:
> > > The new function nvme_submit_vf_cmd() helps the host VF driver to issue
> > > VF admin commands. This helps in cases where the host NVMe driver
> > > does not control the VF's admin queue. For example, in the device
> > > pass-through virtualization case, the VF controller's admin queue is
> > > governed by the guest NVMe driver. The host VF driver then relies on
> > > the PF device's admin queue to control VF devices, for example to
> > > issue vendor-specific live migration commands.
> >
> > WTF are you even smoking when you think this would be acceptable?
>
> Not speaking to NVMe - but this driver is clearly copying mlx5's live
> migration driver, almost completely - including this basic function.
>
> So, to explain why mlx5 works this way:
>
> The VFIO approach is to fully assign an entire VF to the guest OS. The
> entire VF assignment means every MMIO register *and all the DMA* of
> the VF is owned by the guest operating system.
>
> mlx5 needs to transfer hundreds of megabytes to gigabytes of in-device
> state to perform a migration.

For storage, though, you can't just transfer the controller state. You have to
transfer all the namespace user data, too. So potentially many terabytes?