Re: [PATCH V3 0/3] riscv: Add perf callchain support

From: Mao Han
Date: Mon Aug 19 2019 - 04:18:09 EST


Hi Paul,
On Fri, Aug 16, 2019 at 10:14:01AM -0700, Paul Walmsley wrote:
> Hello Mao Han,
>
> On Fri, 17 May 2019, Mao Han wrote:
>
> > This patch set add perf callchain(FP/DWARF) support for RISC-V.
> > It comes from the csky version callchain support with some
> > slight modifications. The patchset base on Linux 5.1.
> >
> > CC: Palmer Dabbelt <palmer@xxxxxxxxxx>
> > CC: linux-riscv <linux-riscv@xxxxxxxxxxxxxxxxxxx>
> > CC: Christoph Hellwig <hch@xxxxxx>
> > CC: Guo Ren <guoren@xxxxxxxxxx>
>
> I tried these patches on v5.3-rc4, both on the HiFive Unleashed board
> with a Debian-based rootfs and QEMU rv64 with a Fedora-based rootfs. For
> QEMU, I used defconfig, and for the HiFive Unleashed, I added a few more
> Kconfig directives; and on both, I enabled CONFIG_PERF_EVENTS. I built
> the perf tools from the kernel tree.
>
> Upon running "/root/bin/perf record -e cpu-clock --call-graph fp
> /bin/ls", I see the backtraces below. The first is on the HiFive
> Unleashed, the second is on QEMU.
>
> Could you take a look and tell me if you see similar issues? And if not,
> could you please walk me through your process for testing these patches on
> rv64, so I can reproduce it here?
>

I'v tried the command line above and got similar issues with probability.
unwind_frame_kernel can not stop unwind when fp is a quite large
value(like 0x70aac93ff0eff584) which can pass the simple stack check.
if (kstack_end((void *)frame->fp))
return -EPERM;
if (frame->fp & 0x3 || frame->fp < TASK_SIZE)
return -EPERM;
handle_exception from arch/riscv/kernel/entry.S will use s0(fp) as temp
register. The context for this frame is unpredictable. We may add more
strict check in unwind_frame_kernel or keep s0 always 0 in handle_exception
to fix this issue.

Breakpoint 1, unwind_frame_kernel (frame=0xffffffe0000057a0 <rcu_init+118>)
at arch/riscv/kernel/perf_callchain.c:14
14 {
1: /x *frame = {fp = 0xffffffe000005ee0, ra = 0xffffffe000124c90}
(gdb)
Continuing.

Breakpoint 1, unwind_frame_kernel (frame=0xffffffe0000057a0 <rcu_init+118>)
at arch/riscv/kernel/perf_callchain.c:14
14 {
1: /x *frame = {fp = 0xffffffe000124c74, ra = 0xffffffe000036cba}
(gdb)
Continuing.

Breakpoint 1, unwind_frame_kernel (frame=0xffffffe0000057a0 <rcu_init+118>)
at arch/riscv/kernel/perf_callchain.c:14
14 {
1: /x *frame = {fp = 0x70aac57ff0eff584, ra = 0x8082614d64ea740a}
(gdb)

Some time perf record can output perf.data correct. something like:
# perf record -e cpu-clock --call-graph fp ls

perf.data
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.012 MB perf.data (102 samples) ]
#
# perf report
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 102 of event 'cpu-clock'
# Event count (approx.): 25500000
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. .................................
#
30.39% 0.00% ls [kernel.kallsyms] [k] ret_from_exception
|
---ret_from_exception
|
|--16.67%--do_page_fault
| handle_mm_fault
| __handle_mm_fault
| |
| |--9.80%--filemap_map_pages
| | |
| | |--3.92%--alloc_set_pte
| | | page_add_file_rmap
| | |
| | --3.92%--filemap_map_pages
| |
| --0.98%--__do_fault
|
|--8.82%--schedule
| __sched_text_start
| finish_task_switch
|
--4.90%--__kprobes_text_start
irq_exit
irq_exit

28.43% 0.00% ls [kernel.kallsyms] [k] ret_from_syscall
|
---ret_from_syscall
|
|--15.69%--__se_sys_execve
| __do_execve_file
| search_binary_handler.part.7

I previous tested these patch with a program(callchain_test.c) mostly
running in userspace, so I didn't got this issues.

Thanks,
Mao Han
#include <stdio.h>

void test_4(void)
{
volatile int i, j;

for(i = 0; i < 10000000; i++)
j=i;
}

void test_3(void)
{
volatile int i, j;
test_4();
for(i = 0; i < 3000; i++)
j=i;
}

void test_2(void)
{
volatile int i, j;
test_3();
for(i = 0; i < 3000; i++)
j=i;
}

void test_1(void)
{
volatile int i, j;
test_2();
for(i = 0; i < 3000; i++)
j=i;

}

int main(void)
{
test_1();
return 0;
}