RFC [PATCH] x86/pci: reserve extra page to avoid error caused by P2P pref DMA reads

From: Yinghai Lu
Date: Wed Aug 27 2008 - 03:30:29 EST


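Our diagnostics team found a system that reports GART table walk errors
under high load. The root cause is a P2P bridge prefetching on behalf of
several Intel e1000 devices behind it, while the skb being read sits close
to the GART IOMMU area, so the prefetch can run past the buffer into it.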

As a first attempt, reserve the pages at the boundaries:
the last page below TOM2 and the last page below the MMIO hole,
plus the first and last pages of the GART.

A better solution is still needed, one that covers every arch with PCI
support and memory layouts with many holes.
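
For illustration only (not part of the patch): a minimal userspace sketch of
the address arithmetic that reserve_last_page() in the diff relies on. It
assumes 4 KiB pages; the names sketch_last_page_phys and SKETCH_PAGE_SHIFT
are made up for this example and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Physical address of the last page below the boundary at pfn,
 * i.e. the page a prefetching bridge would read past.
 */
uint64_t sketch_last_page_phys(uint64_t pfn)
{
	assert(pfn > 0);	/* pfn 0 has no page below it */
	return (pfn - 1) << SKETCH_PAGE_SHIFT;
}
```

With a boundary at pfn 0x100000 (4G with 4 KiB pages), this yields
0xfffff000, the page the patch would try to take out of general use.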

Signed-off-by: Yinghai Lu <yhlu.kernel@xxxxxxxxx>
Cc: Pavel Machek <pavel@xxxxxxx>
Cc: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
Cc: Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx>

---
arch/x86/kernel/pci-dma.c | 28 ++++++++++++++++++++++++++++
arch/x86/kernel/pci-gart_64.c | 6 ++++++
2 files changed, 34 insertions(+)

Index: linux-2.6/arch/x86/kernel/pci-dma.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/pci-dma.c
+++ linux-2.6/arch/x86/kernel/pci-dma.c
@@ -72,12 +72,40 @@ static int __init parse_dma32_size_opt(c
}
early_param("dma32_size", parse_dma32_size_opt);

+static void __init reserve_last_page(unsigned long pfn)
+{
+ unsigned long phys;
+ void *ptr;
+
+ phys = (pfn - 1)<<PAGE_SHIFT;
+ ptr = __alloc_bootmem_nopanic(PAGE_SIZE, PAGE_SIZE, phys);
+
+ if (!ptr || virt_to_phys(ptr) != phys)
+ printk(KERN_WARNING "Cannot reserve last page at %lx to work around P2P bridge prefetch DMA reads\n", phys);
+ else
+ printk(KERN_WARNING "Reserved last page at %lx to work around P2P bridge prefetch DMA reads\n", phys);
+}
void __init dma32_reserve_bootmem(void)
{
unsigned long size, align;
+
+ /*
+ * Reserve the last page below the MMIO hole to work around P2P
+ * bridge prefetch DMA reads. Normally this is not needed, because
+ * ACPI data etc. already sits there, but some systems keep the
+ * ACPI data in the middle of RAM below 4G,
+ * so just reserve it.
+ */
+ if (max_low_pfn_mapped < max_pfn_mapped)
+ reserve_last_page(max_low_pfn_mapped);
+
+ /* no RAM above 4G, so no IOMMU is needed */
if (max_pfn <= MAX_DMA32_PFN)
return;

+ /* reserve the last page of RAM to work around P2P bridge prefetch DMA reads */
+ reserve_last_page(max_pfn);
+
/*
* check aperture_64.c allocate_aperture() for reason about
* using 512M as goal
Index: linux-2.6/arch/x86/kernel/pci-gart_64.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/pci-gart_64.c
+++ linux-2.6/arch/x86/kernel/pci-gart_64.c
@@ -826,6 +826,9 @@ void __init gart_iommu_init(void)
*/
set_bit_string(iommu_gart_bitmap, 0, EMERGENCY_PAGES);

+ /* reserve one page at the tail, for P2P bridge prefetch DMA reads */
+ set_bit_string(iommu_gart_bitmap, iommu_pages - 1, 1);
+
agp_memory_reserved = iommu_size;
printk(KERN_INFO
"PCI-DMA: Reserving %luMB of IOMMU area in the AGP aperture\n",
@@ -870,6 +873,9 @@ void __init gart_iommu_init(void)
for (i = EMERGENCY_PAGES; i < iommu_pages; i++)
iommu_gatt_base[i] = gart_unmapped_entry;

+ /* mark the head entry unmapped too, for P2P bridge prefetch DMA reads */
+ iommu_gatt_base[0] = gart_unmapped_entry;
+
flush_gart();
dma_ops = &gart_dma_ops;
}