Re: [PATCH] intel-iommu: Manage iommu_coherency globally

From: David Woodhouse
Date: Fri Nov 18 2011 - 07:03:50 EST


On Tue, 2011-11-15 at 21:11 -0700, Alex Williamson wrote:
> We currently manage iommu_coherency on a per domain basis,
> choosing the safest setting across the iommus attached to a
> particular domain. This unfortunately has a bug that when
> no iommus are attached, the domain defaults to coherent.
> If we fall into this mode, then later add a device behind a
> non-coherent iommu to that domain, the context entry is
> updated using the wrong coherency setting, and we get dmar
> faults.
>
> Since we expect chipsets to be consistent in their coherency
> setting, we can instead determine the coherency once and use
> it globally.

(Adding Rajesh).

Hm, it seems I lied to you about this. The non-coherent mode isn't just
a historical mistake; it's configurable by the BIOS, and we actually
encourage people to use the non-coherent mode because it makes the
hardware page-walk faster â so reduces the latency for IOTLB misses.

In addition to that, the IOMMU associated with the integrated graphics
is so "special" that it doesn't support coherent mode either. So it *is*
quite feasible that we'll see a machine where some IOMMUs support
coherent mode, and some don't.

And thus we do need to address the concern that just assuming
non-coherent mode will cause unnecessary performance issues, for the
case where a domain *doesn't* happen to include any of the non-coherent
IOMMUs.

However... for VM domains I don't think we care. Setting up the page
tables *isn't* a fast path there (at least not until/unless we support
exposing an emulated IOMMU to the guest).

The case we care about is *native* DMA, where this cache flush will be
happening for example in the fast path of network TX/RX. But in *that*
case, there is only *one* IOMMU to worry about so it's simple enough to
do the right thing, surely?

So, referring to your patch... perhaps we should retain the
domain->iommu_coherency variable, but *still* have the global
'iommu_coherency' calculated at init time. For *VM* domains, we set
domain->iommu_coherency = iommu_coherency and use the
globally-calculated (and pessimistic) result. For *DMA* domains (i.e. in
domain_init()), we keep the code that was there before, which looks at
ecap_coherent(iommu->ecap). Does that seem sane? Revised patch...

From c6b04e1c7456892f43a6f81f0005b3ade8ca0383 Mon Sep 17 00:00:00 2001
From: Alex Williamson <alex.williamson@xxxxxxxxxx>
Date: Tue, 15 Nov 2011 21:11:09 -0700
Subject: [PATCH] intel-iommu: Manage iommu_coherency globally

We currently manage iommu_coherency on a per domain basis,
choosing the safest setting across the iommus attached to a
particular domain. This unfortunately has a bug that when
no iommus are attached, the domain defaults to coherent.
If we fall into this mode, then later add a device behind a
non-coherent iommu to that domain, the context entry is
updated using the wrong coherency setting, and we get dmar
faults.

Since we expect chipsets to be consistent in their coherency
setting, we can instead determine the coherency once and use
it globally.

Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
[dwmw2: Do the above for VM domains only; use capabilities for normal domains]
---
drivers/iommu/intel-iommu.c | 33 +++++++++++++++++----------------
1 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index c0c7820..ce8f933 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -419,6 +419,8 @@ static LIST_HEAD(device_domain_list);

static struct iommu_ops intel_iommu_ops;

+static int vm_iommu_coherency __read_mostly = 1;
+
static int __init intel_iommu_setup(char *str)
{
if (!str)
@@ -556,20 +558,6 @@ static struct intel_iommu *domain_get_iommu(struct dmar_domain *domain)
return g_iommus[iommu_id];
}

-static void domain_update_iommu_coherency(struct dmar_domain *domain)
-{
- int i;
-
- domain->iommu_coherency = 1;
-
- for_each_set_bit(i, &domain->iommu_bmp, g_num_of_iommus) {
- if (!ecap_coherent(g_iommus[i]->ecap)) {
- domain->iommu_coherency = 0;
- break;
- }
- }
-}
-
static void domain_update_iommu_snooping(struct dmar_domain *domain)
{
int i;
@@ -608,7 +596,6 @@ static void domain_update_iommu_superpage(struct dmar_domain *domain)
/* Some capabilities may be different across iommus */
static void domain_update_iommu_cap(struct dmar_domain *domain)
{
- domain_update_iommu_coherency(domain);
domain_update_iommu_snooping(domain);
domain_update_iommu_superpage(domain);
}
@@ -3583,6 +3570,8 @@ static struct notifier_block device_nb = {

int __init intel_iommu_init(void)
{
+ struct dmar_drhd_unit *drhd;
+ struct intel_iommu *iommu;
int ret = 0;

/* VT-d is required for a TXT/tboot launch, so enforce that */
@@ -3643,6 +3632,18 @@ int __init intel_iommu_init(void)

init_iommu_pm_ops();

+ /*
+ * Check for coherency support across all iommus. If unsupported
+ * by any, force cache flushes. Expected to be consistent across
+ * all iommus in the system.
+ */
+ for_each_active_iommu(iommu, drhd) {
+ if (!ecap_coherent(iommu->ecap)) {
+ vm_iommu_coherency = 0;
+ break;
+ }
+ }
+
bus_set_iommu(&pci_bus_type, &intel_iommu_ops);

bus_register_notifier(&pci_bus_type, &device_nb);
@@ -3820,7 +3821,7 @@ static int md_domain_init(struct dmar_domain *domain, int guest_width)
INIT_LIST_HEAD(&domain->devices);

domain->iommu_count = 0;
- domain->iommu_coherency = 0;
+ domain->iommu_coherency = vm_iommu_coherency;
domain->iommu_snooping = 0;
domain->iommu_superpage = 0;
domain->max_addr = 0;
--
1.7.7.1



--
dwmw2

Attachment: smime.p7s
Description: S/MIME cryptographic signature