Re: [PATCH v7 1/1] vfio/nvgpu: Add vfio pci variant module for grace hopper

From: Alex Williamson
Date: Thu Aug 31 2023 - 14:22:43 EST


On Thu, 31 Aug 2023 07:04:10 -0700
Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:

> On Thu, Aug 31, 2023 at 01:51:11PM +0000, Ankit Agrawal wrote:
> > Hi Christoph,
> >
> > >What's the actual consumer running in a qemu VM here?
> > The primary use case in the VM is to run the open source Nvidia
> > driver (https://github.com/NVIDIA/open-gpu-kernel-modules)
> > and workloads.
>
> So this is infrastructure to run things in a VM that we don't even support
> in mainline? I think we need nouveau support for this hardware in the
> drm driver first, before adding magic vfio support.

There's really never a guarantee that the thing we're exposing via the
vfio uAPI has mainline drivers. For example, we don't consult the
nouveau device table before we expose an NVIDIA GPU to a Windows guest
running proprietary device drivers.

We've also never previously made a requirement that any new code in
vfio must directly contribute to supporting a mainline driver. In fact,
I think you'll find examples where we do have such code.

This driver is proposing to expose a coherent memory region associated
with the device, composed as a PCI BAR, largely to bring it into the
vfio device model. Access to that memory region is still pass-through.
This is essentially behavior that we also enable through mdev drivers
like kvmgt (modulo the coherent aspect).

I assume the above driver understands how to access and make use of
this coherent memory whether running bare-metal or virtualized, so
potentially we have some understanding of how it's used by the driver,
which can't be said for all devices used with vfio. I'm therefore not
sure how we can suddenly decide to impose a mainline driver requirement
for exposing a device to userspace. Thanks,

Alex