[xen/spinlock] e0fc17a9363: +24.6% aim9.dir_rtns_1.ops_per_sec

From: Fengguang Wu
Date: Mon Sep 29 2014 - 02:19:38 EST


Hi Konrad,

We are glad to find that your patch increases the aim9 test
performance by up to +24.6%!

e0fc17a936334c08b2729fff87168c03fdecf5b6 ("xen/spinlock: Don't enable them unconditionally.")

test case: brickland3/aim9/5s-all

brickland3 is an Ivy Bridge-EX with 512G memory.

c0914e61660fa7d  e0fc17a936334c08b2729fff8
---------------  -------------------------
  10401445 ± 0%    +24.6%   12958400 ± 0%  TOTAL aim9.dir_rtns_1.ops_per_sec
    494461 ± 5%     +8.0%     534028 ± 1%  TOTAL aim9.link_test.ops_per_sec
   1840820 ± 0%     +4.5%    1924528 ± 0%  TOTAL aim9.fifo_test.ops_per_sec
   1251672 ± 0%     +5.0%    1314071 ± 0%  TOTAL aim9.dgram_pipe.ops_per_sec
   1097240 ± 1%     +5.0%    1152480 ± 0%  TOTAL aim9.signal_test.ops_per_sec
   1407192 ± 0%     +4.8%    1474424 ± 0%  TOTAL aim9.stream_pipe.ops_per_sec
   1898068 ± 1%     +4.7%    1987232 ± 0%  TOTAL aim9.pipe_cpy.ops_per_sec
   1298388 ± 1%     +3.5%    1343704 ± 0%  TOTAL aim9.shared_memory.ops_per_sec
     64202 ± 0%     -3.2%      62178 ± 1%  TOTAL aim9.misc_rtns_1.ops_per_sec
    -12622 ±-16%   +32.4%     -16710 ±-14% TOTAL sched_debug.cfs_rq[116]:/.spread0
     -1977 ±-26%   -59.3%       -805 ±-42% TOTAL cpuidle.C3-IVB.usage
    -13457 ±-8%    +19.2%     -16045 ±-17% TOTAL sched_debug.cfs_rq[14]:/.spread0
     -7867 ±-42%   +61.3%     -12692 ±-10% TOTAL sched_debug.cfs_rq[18]:/.spread0
        -4 ±-27%   -36.4%         -2 ±-34% TOTAL sched_debug.cpu#19.nr_uninterruptible
    -12763 ±-8%    +19.1%     -15202 ±-10% TOTAL sched_debug.cfs_rq[24]:/.spread0
    -13552 ±-8%    +20.5%     -16333 ±-14% TOTAL sched_debug.cfs_rq[97]:/.spread0
         1 ± 0%   -100.0%          0 ± 0%  TOTAL sched_debug.cfs_rq[105]:/.nr_spread_over
      1857 ±48%    -69.7%        563 ±20%  TOTAL sched_debug.cfs_rq[97]:/.min_vruntime
       111 ±44%    -73.0%         30 ±42%  TOTAL sched_debug.cfs_rq[4]:/.blocked_load_avg
       160 ±48%    -60.5%         63 ±26%  TOTAL sched_debug.cpu#104.sched_goidle
      1288 ±37%    -58.1%        540 ±37%  TOTAL sched_debug.cfs_rq[67]:/.min_vruntime
       988 ±36%    -49.7%        497 ±46%  TOTAL sched_debug.cpu#55.ttwu_count
       114 ±20%    -59.4%         46 ±24%  TOTAL sched_debug.cfs_rq[48]:/.tg_load_contrib
       101 ±16%    -54.2%         46 ±24%  TOTAL sched_debug.cfs_rq[48]:/.blocked_load_avg
     32.50 ±41%    -52.7%      15.37 ±40%  TOTAL sched_debug.cfs_rq[66]:/.exec_clock
        38 ±29%    -39.3%         23 ±43%  TOTAL sched_debug.cfs_rq[44]:/.avg->runnable_avg_sum
       469 ±24%    -40.5%        279 ±47%  TOTAL sched_debug.cpu#26.ttwu_local
      1754 ±34%   +113.5%       3744 ±14%  TOTAL sched_debug.cpu#40.sched_count
       123 ±27%    -49.9%         62 ±36%  TOTAL sched_debug.cfs_rq[47]:/.blocked_load_avg
       151 ±12%    -40.8%         89 ±31%  TOTAL sched_debug.cpu#93.ttwu_count
        68 ±44%    -49.6%         34 ±27%  TOTAL sched_debug.cfs_rq[50]:/.avg->runnable_avg_sum
      5584 ±20%    +40.0%       7818 ±18%  TOTAL sched_debug.cpu#84.nr_load_updates
       411 ±30%    -36.1%        262 ±24%  TOTAL sched_debug.cpu#23.ttwu_local
      1655 ±33%    -45.4%        903 ±32%  TOTAL sched_debug.cpu#2.sched_goidle
      3341 ±33%    -45.0%       1836 ±31%  TOTAL sched_debug.cpu#2.nr_switches
      8488 ±11%    +24.1%      10538 ±13%  TOTAL sched_debug.cpu#10.nr_load_updates
        19 ± 7%    +25.0%         24 ±12%  TOTAL sched_debug.cpu#114.ttwu_local
      8752 ±13%    +23.9%      10842 ±11%  TOTAL sched_debug.cpu#8.nr_load_updates
        24 ±10%    -16.9%         20 ± 9%  TOTAL sched_debug.cpu#68.ttwu_local
       901 ±38%    -46.5%        482 ±40%  TOTAL sched_debug.cpu#62.sched_goidle
      1862 ±37%    -43.9%       1045 ±35%  TOTAL sched_debug.cpu#62.nr_switches
     14555 ± 7%    +13.4%      16512 ± 8%  TOTAL sched_debug.cpu#106.nr_load_updates
      5433 ±10%    +20.4%       6542 ± 5%  TOTAL sched_debug.cpu#0.ttwu_local
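The middle column of the comparison can be sanity-checked from the two mean columns. The following is only a minimal sketch, assuming the change is computed as (right - left) / left; the helper name pct_change is hypothetical and not part of lkp-tests.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent (assumed formula)."""
    return (new - base) / base * 100.0


# Means copied from the comparison above:
# (c0914e61660fa7d column, e0fc17a936334c08b2729fff8 column).
rows = {
    "aim9.dir_rtns_1.ops_per_sec": (10401445, 12958400),
    "aim9.link_test.ops_per_sec": (494461, 534028),
    "aim9.misc_rtns_1.ops_per_sec": (64202, 62178),
}

for name, (base, new) in rows.items():
    # Print with an explicit sign and one decimal, like the report does.
    print(f"{pct_change(base, new):+.1f}%  {name}")
```

Under that assumption the three rows reproduce the reported +24.6%, +8.0% and -3.2%.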

aim9.signal_test.ops_per_sec

[ASCII plot of per-sample results, y-axis from 1.04e+06 to 1.16e+06:
[O] samples lie between about 1.14e+06 and 1.16e+06,
[*] samples lie between about 1.06e+06 and 1.12e+06.]


aim9.dir_rtns_1.ops_per_sec

[ASCII plot of per-sample results, y-axis from 1e+07 to 1.4e+07:
[O] samples lie between about 1.25e+07 and 1.4e+07,
[*] samples lie at or just below 1.05e+07.]

[*] bisect-good sample
[O] bisect-bad sample

To reproduce:

apt-get install ruby ruby-oj
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/setup-local job.yaml # the job file attached in this email
bin/run-local job.yaml

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

Thanks,
Fengguang
---
testcase: aim9
default_monitors:
  watch-oom:
  wait: pre-test
  uptime:
  iostat:
  vmstat:
  numa-numastat:
  numa-vmstat:
  numa-meminfo:
  proc-vmstat:
  proc-stat:
  meminfo:
  slabinfo:
  interrupts:
  lock_stat:
  latency_stats:
  softirqs:
  bdi_dev_mapping:
  diskstats:
  energy:
  cpuidle:
  cpufreq:
  turbostat:
  sched_debug:
    interval: 10
  pmeter:
model: Brickland Ivy Bridge-EX
nr_cpu: 120
memory: 512G
hdd_partitions:
swap_partitions:
aim9:
  testtime: 5s
  test:
  - all
branch: linus/master
commit: 19583ca584d6f574384e17fe7613dfaeadcdc4a6
repeat_to: 3
enqueue_time: 2014-09-25 21:56:15.069539322 +08:00
testbox: brickland3
kconfig: x86_64-rhel
kernel: "/kernel/x86_64-rhel/19583ca584d6f574384e17fe7613dfaeadcdc4a6/vmlinuz-3.16.0"
user: lkp
queue: wfg
result_root: "/result/brickland3/aim9/5s-all/debian-x86_64.cgz/x86_64-rhel/19583ca584d6f574384e17fe7613dfaeadcdc4a6/0"
job_file: "/lkp/scheduled/brickland3/wfg_aim9-5s-all-x86_64-rhel-19583ca584d6f574384e17fe7613dfaeadcdc4a6-2.yaml"
dequeue_time: 2014-09-25 22:35:03.558121828 +08:00
history_time: 351.36
job_state: finished
loadavg: 0.96 0.69 0.32 1/969 35857
start_time: '1411655785'
end_time: '1411656085'
version: "/lkp/lkp/.src-20140925-212910"
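For reference, the wall-clock length of this run can be derived from the two timestamps in the job file. This is a rough sketch, assuming start_time and end_time are Unix epoch seconds (the job file does not say so explicitly).

```python
from datetime import datetime, timezone

# Values copied from the job file; assumed to be Unix epoch seconds.
start_time = 1411655785
end_time = 1411656085

# Elapsed time of the run in seconds.
duration_s = end_time - start_time

# Start of the run as a timezone-aware UTC timestamp.
started_utc = datetime.fromtimestamp(start_time, tz=timezone.utc)

print(f"run duration: {duration_s} s")
print(f"started at:   {started_utc.isoformat()}")
```

With these values the run lasted 300 seconds.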