Re: [PATCH 2/3] x86/cpa: Use pte_attrs instead of pte_flags on CPA/set_p.._wb/wc operations.

From: Konrad Rzeszutek Wilk
Date: Sat Dec 03 2011 - 09:42:25 EST


> The fix, which this patch proposes, is to wrap the pte_pgprot in the CPA
> code with newly introduced pte_attrs which can go through the pvops interface
> to get the "emulated" value instead of the raw. Naturally if CONFIG_PARAVIRT is
> not set, it would end up calling native_pte_val.
>
> The other way to fix this is by wrapping pte_flags and go through the pvops
> interface and it really is the Right Thing to do. The problem is, that past
> experience with mprotect stuff demonstrates that it can be really expensive in inner
> loops, and pte_flags() is used in some very perf-critical areas.

I did not get to verify the mprotect stuff as I need to chase down the details of it,
but I did run some benchmarks using kernbench on three different boxes:

AMD A8-3850 (8GB) - tst005
Intel i3-2100 (8GB) - tst007
Nehalem EX (32 logical CPUs) (32GB) - tst010

I've put all the kernbench results in https://www.dumpdata.com/results/baseline_pte_flags_pte_attrs/
(and the chart for the AMD is attached).

The boxes have a fresh install of F16, with a 3.2-rc3 variant kernel using the
.config that F16 came with. I just hit Enter when oldconfig asked me to choose.

The baseline is unmodified v3.2-rc3. The pte_attrs kernel has the patch that this
email is replying to (on top of v3.2-rc3). The pte_flags kernel has two patches that
wrap pte_flags in paravirt and use alternative_asm to patch the code (on top of v3.2-rc3).

The patches are at the URL mentioned above or in my git branch
devel/pte_attrs.v1 ( git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git).
I am also attaching them to this email.

The summary is that I could only get the numbers to show some difference when
the maximum load was run - and _only_ on the AMD machine. The two Intel machines
showed no measurable difference. On the AMD machine the result was 13% worse
when pte_flags (so alternative_asm) was used instead of pte_attrs.
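
For anyone redoing the comparison from the posted kernbench logs, a relative
slowdown like the one quoted above could be computed roughly as sketched here.
This is only an illustration: the helper name and the elapsed times are
made-up placeholders, not the measured results, and kernbench's own output
would need to be parsed separately.

```python
def relative_slowdown(baseline_secs, candidate_secs):
    """Return how much slower the candidate is than the baseline, in percent.

    A positive value means the candidate took longer; a negative value
    means it was faster.
    """
    if baseline_secs <= 0:
        raise ValueError("baseline elapsed time must be positive")
    return (candidate_secs - baseline_secs) / baseline_secs * 100.0


# Placeholder elapsed times in seconds (hypothetical, NOT the real data).
pte_attrs_elapsed = 100.0
pte_flags_elapsed = 113.0

slowdown = relative_slowdown(pte_attrs_elapsed, pte_flags_elapsed)
print(f"pte_flags vs pte_attrs: {slowdown:+.1f}%")
```

The reference kernel goes in the first argument, so the sign of the result
says directly whether the second kernel was slower or faster.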

The way I did these tests is to boot up with 'init=/bin/bash', remount / as rw, activate
the swap disk and run kernbench on the v3.2-rc3 linux tree. Then unplug the machine for a tea
break and repeat the cycle with a different kernel.

Attachment: AMD-A8-3850.png
Description: PNG image