[PATCH v2 0/3] perf kvm: Guest Symbol Resolution for powerpc

From: Ravi Bangoria
Date: Fri Jan 22 2016 - 00:59:26 EST


'perf kvm {record|report}' is used to record and report the profiled
performance of any workload on a guest. From the host, we can collect
guest kernel statistics which is useful in finding out any contentions
in guest kernel symbols for a certain workload.
This feature is not available on powerpc because 'perf' relies on the
'cycles' event (a PMU event) to profile the guest. However, for powerpc,
this can't be used from the host because the PMUs are controlled by the
guest rather than the host.

Due to this issue, we need a different approach to profile the
workload in the guest. There exists a tracepoint 'kvm_hv:kvm_guest_exit'
in powerpc which is hit whenever any of the threads exit the guest
context. The guest instruction pointer dumped along with this
tracepoint data in the field 'pc', can be used as guest instruction
pointer while postprocessing the trace data to map this IP to symbol
from guest.kallsyms.

However, to have some kind of periodicity, we can't use all the kvm
exits, rather exits which are bound to happen in certain intervals.
HV_DECREMENTER Interrupt forces the threads to exit after an interval
of 10 ms.

This patch makes use of the 'kvm_guest_exit' tracepoint and checks the
exit reason for any kvm exit. If it is HV_DECREMENTER, then the
instruction pointer dumped along with this tracepoint is retrieved and
mapped with the guest kallsyms. So for powerpc, 'perf kvm record' will
record 'kvm_hv:kvm_guest_exit' events instead of cycles.

This patch will enable --guest option for perf kvm {record|report} on
powerpc. Still --host --guest together won't work.

This patch can be considered as a next iteration to RFC patch sent by
Hemant Kumar: https://lkml.org/lkml/2015/6/15/670. Hemant's patch is used
for enabling 'perf kvm report', while I've added code to enable
'perf kvm record' on powerpc.

Patches are developed on acme's perf/core branch.

* changes in v2)
- Patch 1,2 are breakdown of v1 patch with little changes
- Patch 3 is new. It fixes output format of perf kvm report

Before applying patch:
[Note: one needs to run vm with kvm enabled]

$ ./perf kvm --guestkallsyms=guest.kallsyms --guestmodules=guest.modules record -a
[ perf record: Captured and wrote 1.530 MB perf.data.guest (28768 samples) ]

$ ./perf script -i perf.data.guest
qemu-system-ppc 9688 [000] 842566.451558: 1 cycles:ppp: c0000000001f2860 .mmap_region ([kernel.kallsyms])
qemu-system-ppc 9688 [000] 842566.451562: 1 cycles:ppp: c0000000000a2d68 .kvmppc_do_h_enter ([kernel.kallsyms])
qemu-system-ppc 9688 [000] 842566.451564: 7 cycles:ppp: c00000000001f26c .vsx_unavailable_tm ([kernel.kallsyms])
qemu-system-ppc 9688 [000] 842566.451565: 138 cycles:ppp: c00000000001f26c .vsx_unavailable_tm ([kernel.kallsyms])
qemu-system-ppc 9688 [000] 842566.451567: 3128 cycles:ppp: c0000000000097d8 ._switch ([kernel.kallsyms])
qemu-system-ppc 9688 [000] 842566.451570: 81568 cycles:ppp: c0000000000ea8bc .wake_up_new_task ([kernel.kallsyms])
swapper 0 [004] 842566.451580: 1 cycles:ppp: c0000000001f2d88 .sys_munmap ([kernel.kallsyms])
swapper 0 [004] 842566.451583: 1 cycles:ppp: c00000000001f26c .vsx_unavailable_tm ([kernel.kallsyms])
swapper 0 [004] 842566.451584: 11 cycles:ppp: c00000000001f26c .vsx_unavailable_tm ([kernel.kallsyms])
swapper 0 [004] 842566.451585: 226 cycles:ppp: c0000000000097d4 ._switch ([kernel.kallsyms])
swapper 0 [004] 842566.451586: 5664 cycles:ppp: c00000000000990c resume_kernel ([kernel.kallsyms])
swapper 0 [004] 842566.451591: 147929 cycles:ppp: c00000000010a4fc .freeze_set_ops ([kernel.kallsyms])
swapper 0 [008] 842566.451597: 1 cycles:ppp: c0000000001f2d98 .sys_munmap ([kernel.kallsyms])
swapper 0 [008] 842566.451600: 1 cycles:ppp: c0000000000a2ee0 .kvmppc_do_h_enter ([kernel.kallsyms])
swapper 0 [008] 842566.451602: 11 cycles:ppp: c0000000000a2ee0 .kvmppc_do_h_enter ([kernel.kallsyms])
swapper 0 [008] 842566.451603: 224 cycles:ppp: c00000000001f274 .vsx_unavailable_tm ([kernel.kallsyms])
swapper 0 [008] 842566.451604: 5240 cycles:ppp: c000000000009984 fast_exception_return ([kernel.kallsyms])
swapper 0 [008] 842566.451608: 134752 cycles:ppp: c000000000780af4 .inet_diag_handler_get_info ([kernel.kallsyms])
swapper 0 [012] 842566.451616: 1 cycles:ppp: c0000000001f2828 .mmap_region ([kernel.kallsyms])
swapper 0 [012] 842566.451619: 1 cycles:ppp: c0000000000a2d78 .kvmppc_do_h_enter ([kernel.kallsyms])
swapper 0 [012] 842566.451620: 11 cycles:ppp: c00000000001f26c .vsx_unavailable_tm ([kernel.kallsyms])
swapper 0 [012] 842566.451621: 226 cycles:ppp: c0000000000097d4 ._switch ([kernel.kallsyms])
swapper 0 [012] 842566.451623: 5549 cycles:ppp: c00000000000990c resume_kernel ([kernel.kallsyms])

$ ./perf kvm --guestkallsyms=guest.kallsyms --guestmodules=guest.modules report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 28K of event 'cycles:ppp'
# Event count (approx.): 10473529515
#
# Overhead Command Shared Object Symbol
# ........ ....... ............. ......
#

#
# (For a higher level overview, try: perf report --sort comm,dso)
#
$

After applying patch:

$ ./perf kvm --guestkallsyms=guest.kallsyms --guestmodules=guest.modules record -a
[ perf record: Captured and wrote 11.898 MB perf.data.guest (127299 samples) ]

$ ./perf script -i perf.data.guest
qemu-system-ppc 9690 [008] 857043.632783: kvm_hv:kvm_guest_exit: VCPU 12: trap=EXTERNAL pc=0xc00000000057a4f0 msr=0x8000000000009032, ceded=0
qemu-system-ppc 9690 [008] 857043.632858: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091b70 msr=0x8000000000009032, ceded=0
qemu-system-ppc 9690 [008] 857043.632899: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091a14 msr=0x8000000000001032, ceded=0
qemu-system-ppc 9690 [008] 857043.632912: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
qemu-system-ppc 9690 [008] 857043.632923: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
qemu-system-ppc 9690 [008] 857043.632941: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091b70 msr=0x8000000000009032, ceded=0
qemu-system-ppc 9690 [008] 857043.632977: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091b70 msr=0x8000000000009032, ceded=0
qemu-system-ppc 9690 [008] 857043.633012: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
qemu-system-ppc 9690 [008] 857043.633033: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
qemu-system-ppc 9690 [008] 857043.633053: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
qemu-system-ppc 9690 [008] 857043.633077: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
qemu-system-ppc 9690 [008] 857043.633097: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
qemu-system-ppc 9690 [008] 857043.633109: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
qemu-system-ppc 9690 [008] 857043.633120: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
qemu-system-ppc 9690 [008] 857043.633599: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091b70 msr=0x8000000000009032, ceded=0
qemu-system-ppc 9690 [008] 857043.633637: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091a14 msr=0x8000000000001032, ceded=0
qemu-system-ppc 9690 [008] 857043.633650: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0
qemu-system-ppc 9690 [008] 857043.633661: kvm_hv:kvm_guest_exit: VCPU 12: trap=SYSCALL pc=0xc000000000091924 msr=0x8000000000009032, ceded=0

$ ./perf kvm --guestkallsyms=guest.kallsyms --guestmodules=guest.modules report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 127K of event 'kvm_hv:kvm_guest_exit'
# Event count (approx.): 127299
#
# Overhead Command Shared Object Symbol
# ........ ....... ....................... ................................................
#
0.02% :9690 [guest.kernel.kallsyms] [g] .plpar_hcall_norets
0.01% :9689 [guest.kernel.kallsyms] [g] .n_tty_write
0.00% :9690 [guest.kernel.kallsyms] [g] .n_tty_write
0.00% :9690 [unknown] [u] 0x00003fff966eb690
0.00% :9688 [guest.kernel.kallsyms] [g] .plpar_hcall_norets
0.00% :9689 [guest.kernel.kallsyms] [g] .plpar_hcall_norets
0.00% :9689 [unknown] [u] 0x00003fff866b8830
0.00% :9690 [unknown] [u] 0x00003fff966eb670
0.00% :9687 [guest.kernel.kallsyms] [g] .n_tty_write
0.00% :9689 [guest.kernel.kallsyms] [g] .__copy_tofrom_user_power7
0.00% :9689 [guest.kernel.kallsyms] [g] ._raw_spin_lock_irqsave
0.00% :9689 [guest.kernel.kallsyms] [g] .queue_work_on
0.00% :9690 [guest.kernel.kallsyms] [g] .flush_to_ldisc
0.00% :9690 [guest.kernel.kallsyms] [g] .plpar_hcall
0.00% :9690 [guest.kernel.kallsyms] [g] fast_exception_return
0.00% :9690 [unknown] [u] 0x00003fff966eb6a0
0.00% :9690 [unknown] [u] 0x00003fff966fd09c
0.00% :9687 [guest.kernel.kallsyms] [g] .__copy_tofrom_user_power7
0.00% :9688 [guest.kernel.kallsyms] [g] ._raw_spin_lock_irqsave
0.00% :9688 [guest.kernel.kallsyms] [g] .n_tty_write
0.00% :9688 [guest.kernel.kallsyms] [g] .plpar_hcall
0.00% :9689 [guest.kernel.kallsyms] [g] .__srcu_read_unlock
0.00% :9689 [guest.kernel.kallsyms] [g] ._raw_spin_lock
0.00% :9689 [guest.kernel.kallsyms] [g] .arch_local_irq_restore

Ravi Bangoria (3):
perf kvm: Introduce evsel as argument to perf_event__preprocess_sample
perf kvm: enable record|report feature on powerpc
perf kvm: Fix output fields instead of 'trace' for perf kvm report on
powerpc

tools/perf/arch/powerpc/util/Build | 1 +
tools/perf/arch/powerpc/util/kvm.c | 134 +++++++++++++++++++++++++++++++++++++
tools/perf/builtin-annotate.c | 3 +-
tools/perf/builtin-diff.c | 3 +-
tools/perf/builtin-kvm.c | 23 +++++--
tools/perf/builtin-mem.c | 10 +--
tools/perf/builtin-report.c | 3 +-
tools/perf/builtin-script.c | 3 +-
tools/perf/builtin-timechart.c | 8 ++-
tools/perf/builtin-top.c | 3 +-
tools/perf/builtin.h | 3 +
tools/perf/tests/hists_cumulate.c | 2 +-
tools/perf/tests/hists_filter.c | 2 +-
tools/perf/tests/hists_link.c | 4 +-
tools/perf/tests/hists_output.c | 2 +-
tools/perf/util/event.c | 15 ++++-
tools/perf/util/event.h | 3 +-
tools/perf/util/evlist.c | 9 +++
tools/perf/util/evlist.h | 1 +
tools/perf/util/evsel.c | 7 ++
tools/perf/util/evsel.h | 4 ++
tools/perf/util/session.c | 9 +--
tools/perf/util/util.c | 5 ++
tools/perf/util/util.h | 1 +
24 files changed, 227 insertions(+), 31 deletions(-)
create mode 100644 tools/perf/arch/powerpc/util/kvm.c

--
2.1.4