Re: [patch] Performance Counters for Linux, v4

From: Vince Weaver
Date: Mon Dec 15 2008 - 16:02:43 EST


Hello

I'm trying a more complicated benchmark and getting even stranger
results.

This is still on the Q6600 machine.

The benchmark does a loop, reading some memory. It should have
roughly:
12295 instructions
4096 memory loads
4096 branches

perfmon3 is close on all of these stats, and this is consistent
across runs with a small variation (+/- 3 or so).

The timec program returns 0 (!) for all of the stats except
retired instruction count! And with certain combinations
of counters I get 0 for all counts. No error messages are printed.

Is this expected behavior?

The test program can be had from:
http://www.csl.cornell.edu/~vince/projects/perf_counter/


Details below:

#
# Perfmon results
#

# First, trying to read all 5 events at once fails, only 4 counters
# avail

tasse:~/assembly_tests% pfmon -e INSTRUCTIONS_RETIRED,BRANCH_INSTRUCTIONS_RETIRED,L1D_ALL_CACHE_REF,MEM_LOAD_RETIRED:L1D_MISS ./read_test
cannot configure events: set0 events incompatible or too many events

# Cache results are close to expected, L1D looks a little high

tasse:~/assembly_tests% pfmon -e INSTRUCTIONS_RETIRED,L1D_ALL_CACHE_REF,MEM_LOAD_RETIRED:L1D_MISS ./read_test
12299 INSTRUCTIONS_RETIRED
4164 L1D_ALL_CACHE_REF
4 MEM_LOAD_RETIRED:L1D_MISS

# Branch results. Close to what they should be, though a bit higher
# than expected.

tasse:~/assembly_tests% pfmon -e INSTRUCTIONS_RETIRED,BRANCH_INSTRUCTIONS_RETIRED,MISPREDICTED_BRANCH_RETIRED ./read_test
12299 INSTRUCTIONS_RETIRED
4102 BRANCH_INSTRUCTIONS_RETIRED
1 MISPREDICTED_BRANCH_RETIRED
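
For reference, the pfmon deviations from the expected counts can be tabulated with a few lines of Python. This is only a sketch: the dictionary and function names are made up for illustration, and treating L1D_ALL_CACHE_REF as a stand-in for "memory loads" is an assumption on my part.

```python
# Expected counts from the benchmark description (approximate).
expected = {"instructions": 12295, "loads": 4096, "branches": 4096}

# Counts reported by pfmon in the runs above.
# Assumption: L1D_ALL_CACHE_REF is used as a stand-in for memory loads.
pfmon = {"instructions": 12299, "loads": 4164, "branches": 4102}


def deviations(measured, expected):
    """Return {name: (absolute difference, percent difference)}."""
    out = {}
    for name, want in expected.items():
        diff = measured[name] - want
        out[name] = (diff, 100.0 * diff / want)
    return out


pfmon_dev = deviations(pfmon, expected)
for name, (diff, pct) in pfmon_dev.items():
    print(f"{name:12s} {diff:+6d} ({pct:+.2f}%)")
```

The same helper can be pointed at any other set of counts, as long as the keys match those in `expected`.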


#
# performance counter v4
#

# Including all stats gives no errors, but gives no results
# either


tasse:~/assembly_tests% ./timec -e 0 -e 1 -e 2 -e 3 -e 4 -e 5 ./read_test

Performance counter stats for './read_test':

0.716 task clock ticks (millisecs)

85049 cycles (events)
0 instructions (events)
0 cache references (events)
0 cache misses (events)
0 branches (events)
0 branch misses (events)



#
# If I include the cycles count, I consistently get 0
# for all other counts???

tasse:~/assembly_tests% ./timec -e 0 -e 1 -e 2 -e 3 ./read_test

Performance counter stats for './read_test':

0.520 task clock ticks (millisecs)

73833 cycles (events)
0 instructions (events)
0 cache references (events)
0 cache misses (events)

#
# If I drop the cycles count, I get an instruction count
# with a value 2300 too high (see previous e-mail)
# And really low cache values.

tasse:~/assembly_tests% ./timec -e 1 -e 2 -e 3 ./read_test

Performance counter stats for './read_test':

0.723 task clock ticks (millisecs)

14644 instructions (events)
8 cache references (events)
0 cache misses (events)


#
# And the branch stats don't work either
#

tasse:~/assembly_tests% ./timec -e 1 -e 4 -e 5 ./read_test

Performance counter stats for './read_test':

0.711 task clock ticks (millisecs)

14643 instructions (events)
0 branches (events)
0 branch misses (events)
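
Since no error is printed when a counter silently reads 0, one way to catch it is to parse the stats lines and flag the zeros. The sketch below assumes the "count name (events)" line format shown in the output above; the function name is hypothetical and none of this is part of timec itself.

```python
import re

# Matches lines such as "14643 instructions (events)".
# The "task clock ticks (millisecs)" line is deliberately not matched.
STAT_LINE = re.compile(r"^\s*(\d+)\s+(.+?)\s+\(events\)\s*$")


def parse_timec_stats(text):
    """Return {counter name: count} for every '(events)' line in text."""
    stats = {}
    for line in text.splitlines():
        m = STAT_LINE.match(line)
        if m:
            stats[m.group(2)] = int(m.group(1))
    return stats


# Output copied from the last run above.
sample = """\
0.711 task clock ticks (millisecs)

14643 instructions (events)
0 branches (events)
0 branch misses (events)
"""

stats = parse_timec_stats(sample)
zeros = [name for name, count in stats.items() if count == 0]
print("counters reading zero:", zeros)
```

A zero for branch misses alone could be plausible on a tiny loop, but a zero for branches on a benchmark with roughly 4096 of them is clearly not a real measurement.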
