Re: [PATCH/RFC 0/4] perf annotate: Add --source-only option and the new source code TUI view

From: Taeung Song
Date: Fri Jun 30 2017 - 03:14:58 EST


Hi,

On 06/29/2017 04:11 PM, Namhyung Kim wrote:
Hello,

On Wed, Jun 28, 2017 at 11:53:22AM +0200, Milian Wolff wrote:
On Wednesday, June 28, 2017 5:18:08 AM CEST Taeung Song wrote:
Hi,

The --source-only option and new source code TUI view can show
the result of performance analysis based on full source code per
symbol(function). (Namhyung Kim told me this idea and it was also requested
by others some time ago..)

If someone wants to see the cause, he/she will need to dig into the asm.
But before that, looking at the source level can give a hint or clue
for the problem.

For example, if target symbol is 'hex2u64' of util/util.c,
the output is like below.

$ perf annotate --source-only --stdio -s hex2u64
Percent | Source code of util.c for cycles:ppp (42 samples)
-----------------------------------------------------------------
0.00 : 354 * While we find nice hex chars, build a long_val.
0.00 : 355 * Return number of chars processed.
0.00 : 356 */
0.00 : 357 int hex2u64(const char *ptr, u64 *long_val)
2.38 : 358 {
2.38 : 359 const char *p = ptr;
0.00 : 360 *long_val = 0;
0.00 : 361
30.95 : 362 while (*p) {
23.81 : 363 const int hex_val = hex(*p);
0.00 : 364
14.29 : 365 if (hex_val < 0)
0.00 : 366 break;
0.00 : 367
26.19 : 368 *long_val = (*long_val << 4) | hex_val;
0.00 : 369 p++;
0.00 : 370 }
0.00 : 371
0.00 : 372 return p - ptr;
0.00 : 373 }

And I added many perf developers into Cc: because I want to listen to your
opinions about this new feature, if you don't mind.

If you give some feedback, I'd appreciate it! :)

Thanks Taeung,

I requested this feature some time ago and it's really cool to see someone
step up and implement it - much appreciated!

I just tested it out on my pet-example that leverages C++ instead of C:

~~~~~
#include <complex>
#include <cmath>
#include <random>
#include <iostream>

using namespace std;

int main()
{
uniform_real_distribution<double> uniform(-1E5, 1E5);
default_random_engine engine;
double s = 0;
for (int i = 0; i < 10000000; ++i) {
s += norm(complex<double>(uniform(engine), uniform(engine)));
}
cout << s << '\n';
return 0;
}
~~~~~

Compile it with:

g++ -O2 -g -std=c++11 test.cpp -o test

Then record it with perf:

perf record --call-graph dwarf ./test

Then analyse it with `perf report`. You'll see one entry for main with
something like:

+ 100.00% 39.69% cpp-inlining cpp-inlining [.] main

Select it and annotate it, then switch to your new source-only view:

main test.cpp
â 30 > â 31 using namespace std; > â 32 > â 33 int main() > â+ 34 { > â 35 uniform_real_distribution<double> uniform(-1E5, 1E5); > â 36 default_random_engine engine; > â+ 37 double s = 0; > â+ 38 for (int i = 0; i < 10000000; ++i) { > 4.88 â+ 39 s += norm(complex<double>(uniform(engine), uniform(engine))); > â 40 } > â 41 cout << s << '\n'; > â 42 return 0; > â+ 43 }

Note: the line numbers are off b/c my file contains a file-header on-top.
Ignore that.

Note2: There is no column header shown, so it's unclear what the first column
represents.

Note 3: report showed 39.69% self cost in main, 100.00% inclusive. annotate
shows 4.88... What is that?

What this shows, is that it's extremely important to visualize inclusive cost
_and_ self cost in this view. Additionally, we need to account for inlining.
Right now, we only see the self cost that is directly within main, I suspect.

Currently perf annotate doesn't use the sample period, it uses sample
count instead and print the percentage within the function. So it's a
different number to the perf report. I think we need to fix this
first.

Thanks,
Namhyung


I understood. Hum.. so we need to replace the menu and column about total period to things about sample count like below ?

"t Toggle total period view"

-> "t Toggle showing the number of samples for"

(I'm not sure what a short key(e.g. 't') is proper..)

Or modifying the code related to the number of samples,
show actual total period on perf-annotate ?

What do you think about this change ?

Thanks,
Taeung


For C++ this is usually very misleading, and basically makes the annotate view
completely useless for application-level profiling. If a second column would
be added with the inclusive cost with the ability to drill down, then I could
easily see myself using this view.

I would appreciate if you could take this into account.

Thanks a lot


--
Milian Wolff | milian.wolff@xxxxxxxx | Senior Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts