Re: [RFC v2] /dev/iommu uAPI proposal

From: Raj, Ashok
Date: Thu Jul 15 2021 - 12:24:04 EST


On Thu, Jul 15, 2021 at 12:23:25PM -0300, Jason Gunthorpe wrote:
> On Thu, Jul 15, 2021 at 06:57:57AM -0700, Raj, Ashok wrote:
> > On Thu, Jul 15, 2021 at 09:48:13AM -0300, Jason Gunthorpe wrote:
> > > On Thu, Jul 15, 2021 at 06:49:54AM +0000, Tian, Kevin wrote:
> > >
> > > > No. You are right on this case. I don't think there is a way to
> > > > differentiate one mdev from the other if they come from the
> > > > same parent and attached by the same guest process. In this
> > > > case the fault could be reported on either mdev (e.g. the first
> > > > matching one) to get it fixed in the guest.
> > >
> > > If the IOMMU can't distinguish the two mdevs they are not isolated
> > > and would have to share a group. Since group sharing is not supported
> > > today this seems like a non-issue
> >
> > Does this mean we have to prevent 2 mdev's from same pdev being assigned to
> > the same guest?
>
> No, it means that the IOMMU layer has to be able to distinguish them.

OK, the guest has no control over it, as it sees 2 separate PCI devices and
thinks they are all different.

The only time it can fail is during the bind operation. From the guest's
perspective, a bind in the vIOMMU just turns into a write to a local table,
and an invalidate will cause the host to update the real copy from the shadow.
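
To make sure we are talking about the same flow, here is a rough sketch of
what I mean. It is only an illustration: struct viommu, guest_bind(),
host_handle_invalidate() and the host_can_bind flag are all made-up names,
not anything from the proposal or from existing code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_PASIDS 8	/* hypothetical, tiny table for illustration */

/* Hypothetical entry: which guest page table a PASID is bound to. */
struct pasid_entry {
	bool     present;
	uint64_t pgtbl;		/* guest page-table root, opaque here */
};

/* Hypothetical vIOMMU state: guest-visible table plus the host's copy. */
struct viommu {
	struct pasid_entry guest_tbl[NR_PASIDS];	/* written by the guest */
	struct pasid_entry host_tbl[NR_PASIDS];		/* real copy used by HW */
};

/* Guest-side bind: just a write to the local table, returns nothing. */
static void guest_bind(struct viommu *v, uint32_t pasid, uint64_t pgtbl)
{
	assert(pasid < NR_PASIDS);
	v->guest_tbl[pasid] = (struct pasid_entry){ .present = true,
						    .pgtbl = pgtbl };
}

/*
 * Host-side handling of the guest's invalidate: sync shadow -> real.
 * The return value is visible only to the host in this sketch; the
 * guest has no channel here through which it could observe a failure.
 */
static bool host_handle_invalidate(struct viommu *v, uint32_t pasid,
				   bool host_can_bind)
{
	assert(pasid < NR_PASIDS);
	if (!host_can_bind)
		return false;	/* guest table and real table now disagree */
	v->host_tbl[pasid] = v->guest_tbl[pasid];
	return true;
}
```

In this sketch guest_bind() has no return value at all, so any failure can
only show up later, inside the host, when the invalidate is handled.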

There is no way to fail the bind? And allocation of the PASID is also a
separate operation, with no clue how it's going to be used in the guest.

>
> This either means they are "SW mdevs" which does not involve the IOMMU
> layer and puts both the responsibility for isolation and identification
> on the mdev driver.

When you say SW mdev, do you mean the GPU-like case where the mdev is purely
a SW construct, or the SIOV type, i.e. the RID+PASID case?

>
> Or they are some "PASID mdev" which does allow the IOMMU to isolate
> them.
>
> What can't happen is to comingle /dev/iommu control over the pdev
> between two mdevs.
>
> ie we can't talk about faults for IOMMU on SW mdevs - faults do not
> come from the IOMMU layer, they have to come from inside the mdev
> itself, somehow.

Recoverable faults for the guest need to be sent to the guest? A page request
from mdev1 and one from mdev2 will look alike when the process is sharing it.

Do we have any isolation requirements here? It's the same process. So if the
page request is sent to the guest, even if you report it for mdev1, after
the PRQ is resolved by the guest, the request from mdev2 from the same guest
should simply work?
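
As a rough illustration of the "first matching one" reporting Kevin
described, assuming a page request carries only the parent RID and the
PASID (struct mdev, struct page_req and route_prq() are hypothetical names
for this sketch, not existing kernel interfaces):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mdev record: parent device RID plus the PASID it uses. */
struct mdev {
	int      id;
	uint16_t parent_rid;
	uint32_t pasid;
};

/* Hypothetical page request: all the routing code gets to look at. */
struct page_req {
	uint16_t rid;
	uint32_t pasid;
	uint64_t addr;
};

/*
 * Return the first mdev whose (RID, PASID) matches the request, or NULL.
 * Two mdevs with the same parent and the same PASID cannot be told apart
 * here, so the fault is always reported on the earlier one.
 */
static const struct mdev *route_prq(const struct mdev *devs, size_t n,
				    const struct page_req *req)
{
	assert(devs && req);
	for (size_t i = 0; i < n; i++) {
		if (devs[i].parent_rid == req->rid &&
		    devs[i].pasid == req->pasid)
			return &devs[i];
	}
	return NULL;
}
```

With mdev1 and mdev2 both on the same parent RID and PASID, a request that
really came from mdev2 is still reported on mdev1 in this sketch, which is
the case I am asking about above.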