Re: [PATCH 2/3] perf tools: Add callchain order support for libunwind DWARF unwinder

From: Wangnan (F)
Date: Wed Nov 18 2015 - 02:31:39 EST




On 2015/11/18 13:41, Namhyung Kim wrote:
On Wed, Nov 18, 2015 at 12:13:08PM +0800, Wangnan (F) wrote:

On 2015/11/17 23:05, Jiri Olsa wrote:
From: Jiri Olsa <jolsa@xxxxxxxxxx>

As reported by Milian, currently for DWARF unwind (both libdw
and libunwind) we display callchain in callee order only.

Adding the support to follow callchain order setup to libunwind
DWARF unwinder, so we could get following output for report:

$ perf record --call-graph dwarf ls
...
$ perf report --no-children --stdio

39.26% ls libc-2.21.so [.] __strcoll_l
|
---__strcoll_l
mpsort_with_tmp
mpsort_with_tmp
sort_files
main
__libc_start_main
_start
0

$ perf report -g caller --no-children --stdio
...
39.26% ls libc-2.21.so [.] __strcoll_l
|
---0
_start
__libc_start_main
main
sort_files
mpsort_with_tmp
mpsort_with_tmp
__strcoll_l

Reported-by: Milian Wolff <milian.wolff@xxxxxxxx>
Based-on-patch-by: Milian Wolff <milian.wolff@xxxxxxxx>
Link: http://lkml.kernel.org/n/tip-lmtbeqm403f3luw4jkjevsi5@xxxxxxxxxxxxxx
Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
---
tools/perf/util/unwind-libunwind.c | 47 ++++++++++++++++++++++++--------------
1 file changed, 30 insertions(+), 17 deletions(-)

diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index 0ae8844fe7a6..705e1c19f1ea 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
[SNIP]

- unw_get_reg(&c, UNW_REG_IP, &ip);
- ret = ip ? entry(ip, ui->thread, cb, arg) : 0;
In original code if ip == 0 entry() won't be called.

+ if (callchain_param.order == ORDER_CALLER)
+ j = max_stack - i - 1;
+ ret = entry(ips[j], ui->thread, cb, arg);
But in new code event if ips[j] == 0 an entry will be built, which causes
a behavior changes user noticable:

Before this patch:


# perf report --no-children --stdio --call-graph=callee
...
3.38% a.out a.out [.] funcc
|
---funcc
|
--2.70%-- funcb
funca
main
__libc_start_main
_start

After this patch:

# perf report --no-children --stdio --call-graph=callee
...
3.38% a.out a.out [.] funcc
|
---funcc
|
|--2.70%-- funcb
| funca
| main
| __libc_start_main
| _start
|
--0.68%-- 0


I'm not sure whether we can regard this behavior changing as a bugfix? I
think
there may be some reason the original code explicitly avoid creating an '0'
entry.
I think callchain value being 0 is an error or marker for the end of
callchain. So it'd be better avoiding 0 entry.

But unfortunately, we have many 0 entries (and broken callchain after
them) with fp recording on optimized binaries. I think we should omit
those callchains.

Maybe something like this?


diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 5ef90be2a249..22642c5719ab 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1850,6 +1850,15 @@ static int thread__resolve_callchain_sample(struct thread *thread,
#endif
ip = chain->ips[j];
+ /* callchain value inside zero page means it's broken, stop */
+ if (ip < 4096) {
+ if (callchain_param.order == ORDER_CALLER) {
+ callchain_cursor_reset(&callchain_cursor);
+ continue;
+ } else
+ break;
+ }
+
err = add_callchain_ip(thread, parent, root_al, &cpumode, ip);
if (err)

Then we totally get rid of 0 entries, but how can we explain
the sum of overhead of different branches?

Is it possible to explicitly tell user the place where perf
failed to unwind call stack? For example:

3.38% a.out a.out [.] funcc
|
---funcc
|
|--2.70%-- funcb
| funca
| main
| __libc_start_main
| _start
|
--0.68%-- (unwind failure)


Thank you.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/