Re: [RFC PATCH 1/5] nvme-pci: add function nvme_submit_vf_cmd to issue admin commands for VF driver.

From: Christoph Hellwig
Date: Tue Dec 06 2022 - 08:58:26 EST


On Tue, Dec 06, 2022 at 09:44:08AM -0400, Jason Gunthorpe wrote:
> Not speaking to NVMe - but this driver is clearly copying mlx5's live
> migration driver, almost completely - including this basic function.

Maybe that's not a good idea in an NVMe environment, and maybe they
should have talked to the standards committee before spending their
time on cargo cult engineering.

Most importantly NVMe is very quiet on the relationship between
VFs and PFs, and there is no way to guarantee that a PF is, at the
NVMe level, much in control of a VF at all. In other words this
concept really badly breaks NVMe abstractions.

> Thus, mlx5 has the same sort of design where the VF VFIO driver
> reaches into the PF kernel driver and asks the PF driver to perform
> some commands targeting the PF's own VFs. The DMA is then done using
> the RID of the PF, and reaches the kernel owned iommu_domain of the
> PF. This way the entire operation is secure against meddling by the
> guest.

And that works for you because you have a clearly defined relationship.
In NVMe we do not have this. We'd either need to define a way
to query that relationship or find another way to deal with the
problem. But when in doubt the best design would be to drive VF
live migration entirely from the PF and turn the lookup from controlled
function to controlling function upside down, that is, keep a list of
controlled functions (which could very well be, and in some designs
are, additional PFs and not VFs) per controlling function. In fact
NVMe already has that list in its architecture with the
"Secondary Controller List".
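
A rough sketch of that inverted lookup, purely for illustration: the
struct and function names below are hypothetical and are not existing
nvme driver code, and the fields are only loosely modelled on the idea
of a per-controller list of controlled functions, not on the actual
Secondary Controller List layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one function controlled by a controlling function. */
struct ctrld_fn {
	uint16_t cntlid;	/* controller ID of the controlled function */
	uint16_t fn_num;	/* PF or VF number, illustrative only */
	int is_vf;		/* 0: the controlled function is itself a PF */
};

/* Hypothetical: the controlling function owns the list. */
struct ctrlg_fn {
	uint16_t cntlid;		/* controller ID of the controlling function */
	size_t nr_ctrld;		/* number of entries in ctrld[] */
	const struct ctrld_fn *ctrld;	/* functions it is allowed to control */
};

/*
 * The lookup starts at the controlling function and walks its own list.
 * A controller ID that is not on the list yields NULL, so the caller
 * can refuse to operate on a function it does not control.
 */
static const struct ctrld_fn *
ctrlg_find_ctrld(const struct ctrlg_fn *cf, uint16_t cntlid)
{
	size_t i;

	assert(cf != NULL);
	for (i = 0; i < cf->nr_ctrld; i++)
		if (cf->ctrld[i].cntlid == cntlid)
			return &cf->ctrld[i];
	return NULL;
}
```

With this shape nothing assumes that a controlled function is a VF of
the controlling PF; the list is the only source of the relationship.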