Re: [PATCH 2/3] perf tools: Add callchain order support for libunwind DWARF unwinder

From: Wangnan (F)
Date: Tue Nov 17 2015 - 23:13:42 EST




On 2015/11/17 23:05, Jiri Olsa wrote:
From: Jiri Olsa <jolsa@xxxxxxxxxx>

As reported by Milian, for DWARF unwind (both libdw
and libunwind) we currently display the callchain in callee order only.

Add support for following the configured callchain order to the libunwind
DWARF unwinder, so we get the following output for report:

$ perf record --call-graph dwarf ls
...
$ perf report --no-children --stdio

39.26% ls libc-2.21.so [.] __strcoll_l
|
---__strcoll_l
mpsort_with_tmp
mpsort_with_tmp
sort_files
main
__libc_start_main
_start
0

$ perf report -g caller --no-children --stdio
...
39.26% ls libc-2.21.so [.] __strcoll_l
|
---0
_start
__libc_start_main
main
sort_files
mpsort_with_tmp
mpsort_with_tmp
__strcoll_l

Reported-by: Milian Wolff <milian.wolff@xxxxxxxx>
Based-on-patch-by: Milian Wolff <milian.wolff@xxxxxxxx>
Link: http://lkml.kernel.org/n/tip-lmtbeqm403f3luw4jkjevsi5@xxxxxxxxxxxxxx
Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
---
tools/perf/util/unwind-libunwind.c | 47 ++++++++++++++++++++++++--------------
1 file changed, 30 insertions(+), 17 deletions(-)

diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index 0ae8844fe7a6..705e1c19f1ea 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c

[SNIP]

- unw_get_reg(&c, UNW_REG_IP, &ip);
- ret = ip ? entry(ip, ui->thread, cb, arg) : 0;

In the original code, entry() is not called if ip == 0.

+ if (callchain_param.order == ORDER_CALLER)
+ j = max_stack - i - 1;
+ ret = entry(ips[j], ui->thread, cb, arg);

But in the new code an entry is built even if ips[j] == 0, which causes
a user-visible behavior change:

Before this patch:


# perf report --no-children --stdio --call-graph=callee
...
3.38% a.out a.out [.] funcc
|
---funcc
|
--2.70%-- funcb
funca
main
__libc_start_main
_start

After this patch:

# perf report --no-children --stdio --call-graph=callee
...
3.38% a.out a.out [.] funcc
|
---funcc
|
|--2.70%-- funcb
| funca
| main
| __libc_start_main
| _start
|
--0.68%-- 0


I'm not sure whether we can regard this behavior change as a bugfix. I think
there may be some reason the original code explicitly avoids creating a '0'
entry.
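
If we want to keep the old behavior, one option is to skip zero addresses
in the new ordered loop. Below is a minimal standalone sketch of that idea,
not the actual perf code: emit_entries(), entry_cb, struct ip_log and
log_ip() are hypothetical names made up for illustration, and the callback
takes fewer arguments than perf's real entry(). Only the index flip for
ORDER_CALLER and the "ip ? entry(...) : 0" check are taken from the hunks
quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for perf's callchain order values. */
enum chain_order { ORDER_CALLEE, ORDER_CALLER };

/* Hypothetical callback type; perf's real entry() takes more arguments. */
typedef int (*entry_cb)(uint64_t ip, void *arg);

/*
 * Walk the collected ips[] in the requested order. Zero addresses are
 * skipped, mirroring the pre-patch "ip ? entry(...) : 0" check, so no
 * '0' entry is built. Stops at the first nonzero callback result and
 * returns it; returns 0 otherwise.
 */
static int emit_entries(const uint64_t *ips, size_t nr,
			enum chain_order order, entry_cb cb, void *arg)
{
	int ret = 0;

	assert(ips != NULL || nr == 0);

	for (size_t i = 0; i < nr && !ret; i++) {
		/* Caller order walks the array from the last frame back. */
		size_t j = (order == ORDER_CALLER) ? nr - i - 1 : i;

		ret = ips[j] ? cb(ips[j], arg) : 0;
	}
	return ret;
}

/* Hypothetical helper that records the addresses it is handed. */
struct ip_log {
	uint64_t ips[8];
	size_t nr;
};

static int log_ip(uint64_t ip, void *arg)
{
	struct ip_log *log = arg;

	if (log->nr >= 8)
		return -1;	/* log is full, abort the walk */
	log->ips[log->nr++] = ip;
	return 0;
}
```

With something like this, a chain that ends in a 0 address yields the same
set of entries as before the patch in both callee and caller order, and the
trailing 0 is simply dropped.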

Then I tried to find out why perf can't get the call frames in my case. I
guess something goes wrong when dealing with the 'call' instruction, because
the instruction at which I can't get a callchain from libunwind is a 'callq':

...
4005bf: be 00 00 00 00 mov $0x0,%esi
4005c4: 48 89 c7 mov %rax,%rdi
4005c7: e8 74 fe ff ff callq 400440 <gettimeofday@plt>
us2 = tv2.tv_sec * 1000000 + tv2.tv_usec;
4005cc: 48 8b 04 24 mov (%rsp),%rax
...

But this is a separate problem; we can discuss it in a new thread.

Thank you.


