Re: [PATCH v17 3/3] vfio/nvgrace-gpu: Add vfio pci variant module for grace hopper

From: Alex Williamson
Date: Fri Feb 09 2024 - 10:55:57 EST


On Fri, 9 Feb 2024 09:20:22 +0000
Ankit Agrawal <ankita@xxxxxxxxxx> wrote:

> Thanks Kevin for the review. Comments inline.
>
> >>
> >> Note that the usemem memory is added by the VM Nvidia device driver [5]
> >> to the VM kernel as memblocks. Hence make the usable memory size
> >> memblock aligned.
> >
> > Is memblock size defined in spec or purely a guest implementation choice?
>
> The MEMBLOCK value is a hardwired, constant ABI value between the GPU
> FW and the VFIO driver.
>
> >>
> >> If the bare metal properties are not present, the driver registers the
> >> vfio-pci-core function pointers.
> >
> > so if qemu doesn't generate such property the variant driver running
> > inside guest will always go to use core functions and guest vfio userspace
> > will observe both resmem and usemem bars. But then there is nothing
> > in field to prohibit mapping resmem bar as cacheable.
> >
> > should this driver check the presence of either ACPI property or
> > resmem/usemem bars to enable variant function pointers?
>
> Maybe I am missing something here; but if the ACPI property is absent,
> the real physical BARs present on the device will be exposed by the
> vfio-pci-core functions to the VM. So I think if the variant driver is run
> within the VM, it should not see the fake usemem and resmem BARs.

There are two possibilities here: either we're assigning the pure
physical device from a host that does not have the ACPI properties, or
we're performing a nested assignment. In the former case we're simply
passing along the unmodified physical BARs. In the latter case we're
actually passing through the fake BARs, since the virtualization of the
device has already happened in the level 1 assignment.
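
To make the two cases concrete, here is a rough userspace toy model, not
driver code. Every name in it (fake_dev, select_ops, usemem_mapping, ...)
is made up for illustration, and it assumes the core ops map BARs
uncached while only the variant ops map usemem cacheable:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real vfio symbols. */
enum ops_kind { OPS_CORE, OPS_VARIANT };
enum mapping  { MAP_UNCACHED, MAP_CACHEABLE };

struct fake_dev {
	bool has_acpi_props;  /* firmware describes usemem/resmem at this level */
	bool bars_are_fake;   /* BARs were already synthesized one level down */
};

/* No properties: fall back to the vfio-pci-core function pointers. */
static enum ops_kind select_ops(const struct fake_dev *dev)
{
	assert(dev != NULL);
	return dev->has_acpi_props ? OPS_VARIANT : OPS_CORE;
}

/* Assumption of this model: only the variant ops map usemem cacheable. */
static enum mapping usemem_mapping(enum ops_kind ops)
{
	return ops == OPS_VARIANT ? MAP_CACHEABLE : MAP_UNCACHED;
}

/*
 * True when this level maps an already-virtualized (fake) usemem BAR
 * uncached, i.e. with a different attribute than the variant driver
 * one level down used for the same memory.
 */
static bool attribute_mismatch(const struct fake_dev *dev)
{
	return dev->bars_are_fake &&
	       usemem_mapping(select_ops(dev)) == MAP_UNCACHED;
}
```

With has_acpi_props = false, bars_are_fake = false models the plain
physical assignment (core ops, physical BARs, no mismatch), while
bars_are_fake = true models the nested case, where the core ops sit on
top of BARs the L1 driver already virtualized.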

I think Kevin's point is also relevant to this latter scenario: in the
L1 instance of the nvgrace-gpu driver the mmap of the usemem BAR is
cacheable, but in the L2 instance of the driver, where we only use the
vfio-pci-core ops, nothing maintains that cacheable mapping. Is that a
problem? An uncached mapping on top of a cacheable mapping is often
prone to problems. Thanks,

Alex