perf: wrong event->count report (Was: perf basic-test-aarch64 failures)

From: Oleg Nesterov
Date: Tue Feb 16 2016 - 22:34:24 EST

Next message: Ross Zwisler: "[PATCH v3 0/6] DAX fixes, move flushing calls to FS"
Previous message: xuyiping: "Re: [PATCH] zsmalloc: drop unused member 'mapping_area->huge'"
Next in thread: Jiri Olsa: "Re: perf: wrong event->count report (Was: perf basic-test-aarch64 failures)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Finally I reproduced... let me add CC's and reply to initial message. This has
nothing to do with arm/uprobes.

I simply can't understand how perf calculates ->total_time_enabled/running.
At all. But the problem is that

1. perf_event_enable_on_exec() does enable first, then event_sched_in().

After that tstamp_enabled < tstamp_running

2. This means that after the next update_event_times()
total_time_running < total_time_enabled

again, I fail to understand these calculations, but this is what
perf_event_read_value() reports to user-space.

3. /usr/bin/perf calls perf_counts_values__scale() which does

else if (count->run < count->ena) {
scaled = 1;
count->val = (u64)((double) count->val * count->ena / count->run + 0.5);
}

and this is why you see the wrong number. count->val was correct but
wrongly updated because total_time_running < total_time_enabled.

I leave this to Peter and Jiri ;)

Oleg.

On 02/15, Pratyush Anand wrote:
>
> Hi Oleg/Jiri,
>
> Latest Linux upstream code is broken for a test case related to uprobe. I have
> tested ARM64 as well as x86 platform and both of them have been broken.
>
> Test case instruments uprobe at each function f_* in the following code and
> counts number of probe callback called for them.
> https://github.com/rfmvh/perftool-testsuite/blob/master/base_probe/examples/exact_counts.c
>
> Test case is:
>
> # add probes
> for i in 1 2 3 103 997 65535; do
> perf probe -x examples/exact_counts --add f_${i}x
> done
>
> #perf_probe :: test_exact_counts :: using probes :: perf stat
>
> perf stat -x';' -e 'probe_exact:*' examples/exact_counts
>
> And we get something like following on x86.
>
> 65594;;probe_exact:f_65535x;216738260;99.91
> 998;;probe_exact:f_997x;216738260;99.91
> 103;;probe_exact:f_103x;216738260;99.91
> 3;;probe_exact:f_3x;216738260;99.91
> 2;;probe_exact:f_2x;216738260;99.91
> 1;;probe_exact:f_1x;216738260;99.91
>
> So, here you can see that f_65535x count is 65594, while it was expected as
> 65535.
>
> Some observation which might be helpful to resolve it:
>
> - if we use -a with perf stat, then count is correct.
> perf stat -x';' -e 'probe_exact:*' -a examples/exact_counts
> - When count is wrong, I noticed that "run" and "ena" values for count has huge
> difference, which causes count to be scaled up by perf tool.
> - Following upstream commit is the first bad commit which is causing this issue.
> # first bad commit: [3e349507d12de93b08b0aa814fc2aa0dee91c5ba] perf: Fix perf_enable_on_exec() event scheduling
>
> ~Pratyush
>
> On 11/02/2016:02:37:56 PM, Michael Petlan wrote:
> > On Thu, 2016-02-11 at 15:00 +0530, Pratyush Anand wrote:
> > > Hi Michael,
> > >
> >
> > Hi.
> >
> > > Can you please replace perf tools with following and try to see if result
> > > improves (I expect test 1 and 5 should pass now).
> > >
> > > http://download.devel.redhat.com/brewroot/work/tasks/9732/10469732/perf-4.5.0-0.rc2.26.el7.test.aarch64.rpm
> >
> > Yes, this perf passes the first and fifth testcase. The rest is the
> > same.
> >
> > >
> > > I will need to look into the test instructions for failing test 2,3,4 (described
> > > in your mail). I have kernel test package
> > > (git://pkgs.devel.redhat.com/tests/kernel) cloned on my system, but I do not see
> > > directory "perf/Sanity/basic-test-aarch64" in that test package. So can you
> > > please share correct path for kernel-aarch64 test package repository.
> > >
> >
> > This test is in a separate perf repository:
> >
> > git://pkgs.devel.redhat.com/tests/perf
> >
> > It is just a beaker wrapper for the testsuite, which is upstream here:
> >
> > https://github.com/rfmvh/perftool-testsuite
> >
> > Basically, the aarch64 version of the beaker test bundle contains
> > the upstream version. The non-aarch64 test has some changes and hacks in
> > order to run on RHEL.
> >
> > > ~Pratyush
> > >
> >
> > In order to reproduce the failures, do the following:
> >
> > cd base_probe
> > export CMD_PERF=/path/to/your/perf # not necessary for default
> > export TESTMODE_QUIET=n
> > ./setup.sh
> > ./test_exact_counts.sh
> > ./test_advanced.sh
> >
> > After running the setup.sh script, you may reproduce it manually too:
> >
> > (2) perf_probe :: test_exact_counts :: using probes :: perf report
> >
> > # add probes
> > for i in 1 2 3 103 997 65535; do
> > perf probe -x examples/exact_counts --add f_${i}x
> > done
> >
> > # record
> > perf record -e 'probe_exact:*' -- examples/exact_counts
> > # there should be 66641 samples
> >
> > # report
> > perf report -n --stdio
> >
> > ###################################################################
> >
> > (3) perf_probe :: test_exact_counts :: using probes :: perf stat
> >
> > After running setup.sh, run:
> >
> > perf stat -x';' -e 'probe_exact:*' examples/exact_counts
> >
> > ## The number of event occurrences should be the same as the names
> > ## of the functions.
> >
> > ###################################################################
> >
> >
> > (4) perf_probe :: test_advanced :: function argument probing :: script
> >
> > After running setup.sh, run:
> >
> > perf probe -x examples/advanced --add 'isprime a'
> > perf record -e 'probe_advanced:isprime' examples/advanced
> > perf script
> >
> >
> >
> >
> > Michael

Next message: Ross Zwisler: "[PATCH v3 0/6] DAX fixes, move flushing calls to FS"
Previous message: xuyiping: "Re: [PATCH] zsmalloc: drop unused member 'mapping_area->huge'"
Next in thread: Jiri Olsa: "Re: perf: wrong event->count report (Was: perf basic-test-aarch64 failures)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]