Re: [RFC] Device isolation infrastructure v2

From: David Gibson
Date: Mon Dec 19 2011 - 19:25:25 EST


On Mon, Dec 19, 2011 at 10:56:40PM +0000, David Woodhouse wrote:
> On Tue, 2011-12-20 at 09:31 +1100, David Gibson wrote:
> > When we're running paravirtualized under pHyp, it's impossible to
> > merge multiple PEs into one domain per se. We could fake it rather
> > nastily by replicating all map/unmaps across multiple PEs. When
> > running bare metal, we could do so a bit more nicely by assigning
> > multiple PEs the same TCE pointer, but we have no mechanism to do so
> > at present.
>
> VT-d does share the page tables, as you could on bare metal. But it's an
> implementation detail - there's nothing *fundamentally* wrong with
> having to do the map/unmap for each PE, is there? It's only at VM setup
> time, so it doesn't really matter if it's slow.
>
> Surely that's the only way you're going to present the guest with the
> illusion of having no IOMMU; so that DMA to any given guest physical
> address "just works".
>
> On the other hand, perhaps you don't want to do that at all. Perhaps
> you're better off presenting a virtualised IOMMU to the guest and
> *insisting* that it fully uses it in order to do any DMA at all?

Not only do we want to, we more or less *have* to. Existing kernels,
which are used to running paravirtualized under pHyp, expect and need
a paravirt IOMMU. DMA without IOMMU setup just doesn't happen. And
the map/unmap hypercalls are frequently a hot path, so slow does
matter.
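
To make the cost concrete, here is a minimal sketch of the "replicate
across PEs" approach. Everything in it is hypothetical: fake_tce_put()
stands in for a per-page map hypercall and only counts calls, and the
names, the 4K page size and the MAX_PES limit are assumptions for
illustration, not the pHyp or kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PES        8       /* assumed limit, for illustration only */
#define FAKE_PAGE_SIZE 4096ULL /* assumed IOMMU page size */

/* Counts how many (hypothetical) map hypercalls have been issued. */
static unsigned long hcall_count;

/* Hypothetical stand-in for one per-page map hypercall on one PE. */
static int fake_tce_put(uint32_t pe, uint64_t iova, uint64_t paddr)
{
	(void)pe;
	(void)iova;
	(void)paddr;
	hcall_count++;
	return 0;
}

/* A "merged" domain faked as a plain list of PEs. */
struct fake_domain {
	uint32_t pe[MAX_PES];
	size_t npes;
};

/*
 * Map npages pages by repeating the same mapping on every PE.
 * Returns 0 on success or the first non-zero error from the hypercall.
 */
static int fake_domain_map(const struct fake_domain *d, uint64_t iova,
			   uint64_t paddr, size_t npages)
{
	assert(d->npes <= MAX_PES);

	for (size_t i = 0; i < d->npes; i++) {
		for (size_t p = 0; p < npages; p++) {
			int rc = fake_tce_put(d->pe[i],
					      iova + p * FAKE_PAGE_SIZE,
					      paddr + p * FAKE_PAGE_SIZE);
			if (rc)
				return rc;
		}
	}
	return 0;
}
```

With 3 PEs in the fake domain, mapping 4 pages issues 12 calls rather
than 4, so the per-map cost on the hot path scales with the number of
PEs merged into the domain.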

The other problem is that each domain's IOVA window is often fairly
small, a limitation that would get even worse if we tried to put too
many devices in there.

--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson