PAGE_CACHE_WC strikes again

From: Eric Anholt
Date: Tue Mar 31 2009 - 20:11:39 EST


I just tracked down what was cutting performance 10x on one of my
systems on a microbenchmark I'd just written:

--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -540,7 +540,7 @@ int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
 	/* FIXME: use pgprot_writecombine when available */
 	prot = pgprot_val(vma->vm_page_prot);
 #ifdef CONFIG_X86
-	prot |= _PAGE_CACHE_WC;
+	/*prot |= _PAGE_CACHE_WC;*/
 #endif
 	vma->vm_page_prot = __pgprot(prot);

Turns out that setting _PAGE_CACHE_WC in the PTE disables the WC effect
of the MTRR on my 945GM system, which runs without PAT (disabled due to
CPU errata), and this workaround took GTT-mapped writes from 120MB/s to
1180MB/s.

What's the right way to be setting our PTEs? This is similar to the
workaround in ef5fa0ab24b87646c7bc98645acbb4b51fc2acd4, but to do it in
the driver as well means exporting pat_enabled, and it really seems like
PAT presence shouldn't be something the driver has to worry about --
I've got a WC MTRR and I'm asking for a WC mapping and I'm getting
uncached. If I failed to init the MTRR, the state of the aperture I'm
mapping would be uncached.
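
For illustration, here is a rough userspace model of why the same PTE
bits could behave so differently with and without PAT. It is a sketch
under assumptions, not kernel code: the PAT table contents and the
"effective type under a WC MTRR" rules are my reading of the Intel SDM
and of the x86 _PAGE_CACHE_* definitions of this era, and every name in
it (pte_type, effective_under_wc_mtrr, wc_prot_bits) is hypothetical.
The last helper mirrors the kind of fallback a pgprot_writecombine-style
helper might apply: ask for WC when PAT is programmed, otherwise ask for
UC- and let the WC MTRR take effect.

```c
#include <stdbool.h>

/* x86 PTE cache-control bits: PWT is bit 3, PCD is bit 4. */
#define PTE_PWT (1UL << 3)
#define PTE_PCD (1UL << 4)

enum mem_type { MT_UC, MT_UC_MINUS, MT_WC, MT_WT, MT_WB };

/*
 * Memory type selected by the PCD/PWT bits of a PTE (PAT bit clear).
 * Assumption: with PAT enabled the kernel programs entries 0-3 as
 * WB, WC, UC-, UC; with PAT disabled the power-on defaults
 * WB, WT, UC-, UC stay in effect.
 */
enum mem_type pte_type(unsigned long prot, bool pat_enabled)
{
	static const enum mem_type kernel_pat[4] =
		{ MT_WB, MT_WC, MT_UC_MINUS, MT_UC };
	static const enum mem_type default_pat[4] =
		{ MT_WB, MT_WT, MT_UC_MINUS, MT_UC };
	unsigned int idx = ((prot & PTE_PCD) ? 2 : 0) |
			   ((prot & PTE_PWT) ? 1 : 0);

	return pat_enabled ? kernel_pat[idx] : default_pat[idx];
}

/*
 * Effective type when the MTRR covering the range says WC.
 * Assumption (from the SDM's combination table): WC, UC- and WB page
 * types end up WC; UC and WT page types end up UC.
 */
enum mem_type effective_under_wc_mtrr(enum mem_type pte)
{
	switch (pte) {
	case MT_WC:
	case MT_UC_MINUS:
	case MT_WB:
		return MT_WC;
	default:
		return MT_UC;
	}
}

/* Hypothetical helper: PTE bits to request for a write-combined mapping. */
unsigned long wc_prot_bits(bool pat_enabled)
{
	/* PWT alone means WC only if the PAT was reprogrammed; else use UC-. */
	return pat_enabled ? PTE_PWT : PTE_PCD;
}
```

In this model, PTE_PWT with PAT disabled comes out as WT, which combines
with the WC MTRR to give uncached, while leaving the bits alone (WB) or
asking for UC- both come out write-combined.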

Test code is at
git://anongit.freedesktop.org/git/xorg/app/intel-gpu-tools
under benchmarks/intel_upload_blit_large_gtt but requires libdrm master
until we get a release out in the next week or so. Also includes a few
standalone regression tests, which may be useful for people making
changes that may affect i915.

--
Eric Anholt
eric@xxxxxxxxxx eric.anholt@xxxxxxxxx

