Re: [PATCH V4 05/18] iommu/ioasid: Redefine IOASID set and allocation APIs

From: Jason Gunthorpe
Date: Tue Jun 08 2021 - 14:36:38 EST


On Tue, Jun 08, 2021 at 10:44:31AM +1000, David Gibson wrote:

> When you say "not using a drivers/iommu IOMMU interface" do you
> basically mean the device doesn't do DMA?

No, I mean the device doesn't use iommu_map() to manage the DMA
mappings.

vfio_iommu_type1 has a special code path that mdev triggers that
doesn't allocate an IOMMU domain and doesn't call iommu_map() or
anything related to that.

Instead an mdev driver calls vfio_pin_pages(), which "reads" a fake page
table and returns the CPU pages for the mdev to DMA map however
it likes.
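
As a rough illustration of that lookup-and-pin idea, here is a toy
userspace sketch. It is not kernel code and not the vfio_iommu_type1
implementation: struct toy_mapping, struct toy_container, toy_lookup()
and toy_pin_pages() are hypothetical names, and the "fake page table" is
assumed to be a flat array of IOVA-page to CPU-page entries.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one entry of the software-only "fake page table". */
struct toy_mapping {
	uint64_t iova_pfn;   /* page frame number in the device's IOVA space */
	uint64_t cpu_pfn;    /* CPU page backing that IOVA page */
	unsigned int pins;   /* how many times this page has been pinned */
};

/* Hypothetical container: just the table, no hardware IOMMU domain. */
struct toy_container {
	struct toy_mapping *maps;
	size_t nr_maps;
};

/* Linear search of the table; returns NULL if the IOVA page is unmapped. */
static struct toy_mapping *toy_lookup(struct toy_container *c, uint64_t iova_pfn)
{
	for (size_t i = 0; i < c->nr_maps; i++)
		if (c->maps[i].iova_pfn == iova_pfn)
			return &c->maps[i];
	return NULL;
}

/*
 * Translate and pin up to npage IOVA pages. Returns the number of pages
 * pinned and stops at the first IOVA page that has no mapping.
 */
static int toy_pin_pages(struct toy_container *c, const uint64_t *iova_pfns,
			 int npage, uint64_t *cpu_pfns)
{
	int done;

	for (done = 0; done < npage; done++) {
		struct toy_mapping *m = toy_lookup(c, iova_pfns[done]);

		if (!m)
			break;
		m->pins++;
		cpu_pfns[done] = m->cpu_pfn;
	}
	return done;
}
```

The caller gets CPU page numbers back and decides for itself how to DMA
map them; nothing in this path touches a hardware page table.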

> Now, we could represent those different sorts of isolation separately,
> but at the time our thinking was that we should group together devices
> that can't be safely isolated for *any* reason, since the practical
> upshot is the same: you can't safely split those devices between
> different owners.

It is fine, but the direction is going the other way: devices have
perfect isolation and rely on special interactions with the iommu to
get it.

> > What I don't like is forcing certain things depending on how the
> > vfio_device was created - for instance forcing a IOMMU group as part
> > and forcing an ugly "SW IOMMU" mode in the container only as part of
> > mdev_device.
>
> I don't really see how this is depending on how the device is
> created.

static int vfio_iommu_type1_attach_group(void *iommu_data,
					 struct iommu_group *iommu_group)
{
	if (vfio_bus_is_mdev(bus)) {

What the iommu code does depends on how the device was created. This
is really ugly.

This is happening because the three objects in the model,
driver/group/domain, are not being linked together in a way that
reflects the modern world.

The group has no idea what the driver wants but is in charge of
creating the domain on behalf of the device.

And so people have created complicated hackery to pass
information from the driver to the group, through the device, so that
the group can create the right domain.

I want to see the driver simply create the right domain directly. It
is much simpler and scales to more domain complexity.
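
To make the contrast concrete, here is a toy userspace sketch of the two
shapes. It is only a model under stated assumptions, not a proposed
kernel API: every toy_* name is hypothetical, and the two domain types
are an assumed simplification of "hardware page table" versus "software
pinning only".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical domain kinds for this sketch. */
enum toy_domain_type {
	TOY_DOMAIN_HW_PAGING,  /* backed by an IOMMU hardware page table */
	TOY_DOMAIN_SW_PINNING, /* no hardware domain, software pinning only */
};

struct toy_domain {
	enum toy_domain_type type;
};

struct toy_device {
	bool created_as_mdev;      /* how the device came into being */
	struct toy_domain *domain; /* domain currently attached, if any */
};

/*
 * Group-driven shape: the attach path has to infer the domain type from
 * how the device was created, because it has no idea what the driver wants.
 */
static enum toy_domain_type toy_group_guess_domain(const struct toy_device *dev)
{
	return dev->created_as_mdev ? TOY_DOMAIN_SW_PINNING
				    : TOY_DOMAIN_HW_PAGING;
}

/* Driver-driven shape: the driver builds exactly the domain it wants ... */
static struct toy_domain toy_domain_create(enum toy_domain_type type)
{
	struct toy_domain d = { .type = type };

	return d;
}

/* ... and attaches it. This path never looks at created_as_mdev. */
static void toy_driver_attach(struct toy_device *dev, struct toy_domain *domain)
{
	dev->domain = domain;
}
```

In the second shape, adding a new kind of domain means adding a
constructor for it; the attach path does not grow another special case
keyed on device origin.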

Jason