Re: [PATCH 8/9] vfio/pci: use x86 naming instead of igd

From: Max Gurtovoy
Date: Wed Feb 03 2021 - 11:11:14 EST



On 2/3/2021 5:24 AM, Alexey Kardashevskiy wrote:


On 03/02/2021 04:41, Max Gurtovoy wrote:

On 2/2/2021 6:06 PM, Cornelia Huck wrote:
On Mon, 1 Feb 2021 11:42:30 -0700
Alex Williamson <alex.williamson@xxxxxxxxxx> wrote:

On Mon, 1 Feb 2021 12:49:12 -0500
Matthew Rosato <mjrosato@xxxxxxxxxxxxx> wrote:

On 2/1/21 12:14 PM, Cornelia Huck wrote:
On Mon, 1 Feb 2021 16:28:27 +0000
Max Gurtovoy <mgurtovoy@xxxxxxxxxx> wrote:
This patch doesn't change any logic but only align to the concept of
vfio_pci_core extensions. Extensions that are related to a platform
and not to a specific vendor of PCI devices should be part of the core
driver. Extensions that are specific for PCI device vendor should go
to a dedicated vendor vfio-pci driver.
My understanding is that igd means support for Intel graphics, i.e. a
strict subset of x86. If there are other future extensions that e.g.
only make sense for some devices found only on AMD systems, I don't
think they should all be included under the same x86 umbrella.

Similar reasoning for nvlink, that only seems to cover support for some
GPUs under Power, and is not a general platform-specific extension IIUC.

We can arguably do the zdev -> s390 rename (as zpci appears only on
s390, and all PCI devices will be zpci on that platform), although I'm
not sure about the benefit.
As far as I can tell, there isn't any benefit for s390 it's just
"re-branding" to match the platform name rather than the zdev moniker,
which admittedly perhaps makes it more clear to someone outside of s390
that any PCI device on s390 is a zdev/zpci type, and thus will use this
extension to vfio_pci(_core).  This would still be true even if we added
something later that builds atop it (e.g. a platform-specific device
like ism-vfio-pci).  Or for that matter, mlx5 via vfio-pci on s390x uses
these zdev extensions today and would need to continue using them in a
world where mlx5-vfio-pci.ko exists.

I guess all that to say: if such a rename matches the 'grand scheme' of
this design where we treat arch-level extensions to vfio_pci(_core) as
"vfio_pci_(arch)" then I'm not particularly opposed to the rename.  But
by itself it's not very exciting :)
This all seems like the wrong direction to me.  The goal here is to
modularize vfio-pci into a core library and derived vendor modules that
make use of that core library.  If existing device specific extensions
within vfio-pci cannot be turned into vendor modules through this
support and are instead redefined as platform specific features of the
new core library, that feels like we're already admitting failure of
this core library to support known devices, let alone future devices.

IGD is a specific set of devices.  They happen to rely on some platform
specific support, whose availability should be determined via the
vendor module probe callback.  Packing that support into an "x86"
component as part of the core feels not only short sighted, but also
avoids addressing the issues around how userspace determines an optimal
module to use for a device.
Hm, it seems that not all current extensions to the vfio-pci code are
created equal.

IIUC, we have igd and nvlink, which are sets of devices that only show
up on x86 or ppc, respectively, and may rely on some special features
of those architectures/platforms. The important point is that you have
a device identifier that you can match a driver against.

maybe you can supply the ids ?

Alexey K, I saw you've been working on the NVLINK2 for P9. can you supply the exact ids for that should be bounded to this driver ?

I'll add it to V3.

Sorry no, I never really had the exact ids, they keep popping up after years.

The nvlink2 support can only work if the platform supports it as there is nothing to do to the GPUs themselves, it is about setting up those nvlink2 links. If the platform advertises certain features in the device tree - then any GPU in the SXM2 form factor (not sure about the exact abbrev, not an usual PCIe device) should just work.

If the nvlink2 "subdriver" fails to find nvlinks (the platform does not advertise them or some parameters are new to this subdriver, such as link-speed), we still fall back to generic vfio-pci and try passing through this GPU as a plain PCI device with no extra links. Semi-optimal but if the user is mining cryptocoins, then highspeed links are not really that important :) And the nvidia driver is capable of running such GPUs without nvlinks. Thanks,

From the above I can conclude that nvlink2_vfio_pci will need to find nvlinks during the probe and fail in case it doesn't.

which platform driver is responsible to advertise it ? can we use aux_device to represent these nvlink/nvlinks ?

In case of a failure, the user can fallback to vfio-pci.ko and run without the nvlinks as you said.

This is deterministic behavior and the user will know exactly what he's getting from vfio subsystem.





On the other side, we have the zdev support, which both requires s390
and applies to any pci device on s390. If we added special handling for
ISM on s390, ISM would be in a category similar to igd/nvlink.

Now, if somebody plugs a mlx5 device into an s390, we would want both
the zdev support and the specialized mlx5 driver. Maybe zdev should be
some kind of library that can be linked into normal vfio-pci, into
vfio-pci-mlx5, and a hypothetical vfio-pci-ism? You always want zdev on
s390 (if configured into the kernel).