Re: [PATCH 0/6] perf core: Read from overwrite ring buffer

From: Wangnan (F)
Date: Thu Jan 21 2016 - 23:46:58 EST




On 2016/1/22 11:21, Alexei Starovoitov wrote:
On Fri, Jan 22, 2016 at 10:21:19AM +0800, Wangnan (F) wrote:

On 2016/1/21 14:51, Wangnan (F) wrote:

On 2016/1/20 10:20, Alexei Starovoitov wrote:
On Wed, Jan 20, 2016 at 09:37:42AM +0800, Wangnan (F) wrote:
On 2016/1/20 1:42, Alexei Starovoitov wrote:
On Tue, Jan 19, 2016 at 11:16:44AM +0000, Wang Nan wrote:
This patchset introduces two methods to support reading from
overwrite.

1) Tailsize: write the size of an event at the end of it
2) Backward writing: write the ring buffer from the end of it to
the
beginning.
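
To make method 2 concrete, here is a minimal userspace sketch of backward writing. It is not the kernel code from this patchset: struct ring, struct rec_header, ring_write() and ring_read_types() are hypothetical names, and the 64-byte buffer is chosen only to keep the example small. The writer moves head down before storing each record, so head always points at the newest header and a reader can walk from newest to oldest without any extra size field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RB_SIZE 64u                 /* must be a power of two */
#define RB_MASK (RB_SIZE - 1u)

/* Hypothetical record header; the real perf header has more fields. */
struct rec_header {
    uint16_t size;                  /* total record size, header included */
    uint16_t type;
};

struct ring {
    uint8_t  data[RB_SIZE];
    uint64_t head;                  /* only decreases; offset = head & RB_MASK */
    uint64_t written;               /* total bytes ever written */
};

/* Copy bytes into the ring at a logical position, wrapping as needed. */
static void ring_copy_in(struct ring *rb, uint64_t pos, const void *src, size_t len)
{
    const uint8_t *s = src;
    for (size_t i = 0; i < len; i++)
        rb->data[(pos + i) & RB_MASK] = s[i];
}

/* Copy bytes out of the ring from a logical position, wrapping as needed. */
static void ring_copy_out(const struct ring *rb, uint64_t pos, void *dst, size_t len)
{
    uint8_t *d = dst;
    for (size_t i = 0; i < len; i++)
        d[i] = rb->data[(pos + i) & RB_MASK];
}

/* Writer: move head down by the record size, then store header and payload. */
static void ring_write(struct ring *rb, uint16_t type, const void *payload, uint16_t len)
{
    struct rec_header h = { .size = (uint16_t)(sizeof(h) + len), .type = type };

    assert(h.size <= RB_SIZE);
    rb->head -= h.size;             /* unsigned wrap-around is intentional */
    ring_copy_in(rb, rb->head, &h, sizeof(h));
    ring_copy_in(rb, rb->head + sizeof(h), payload, len);
    rb->written += h.size;
}

/*
 * Reader: start at head (the newest record) and walk towards older ones.
 * Stop before any record that has been partly overwritten.
 * Returns the number of intact records; their types go into types[].
 */
static size_t ring_read_types(const struct ring *rb, uint16_t *types, size_t max)
{
    uint64_t valid = rb->written < RB_SIZE ? rb->written : RB_SIZE;
    uint64_t pos = rb->head, walked = 0;
    size_t n = 0;

    while (n < max && walked + sizeof(struct rec_header) <= valid) {
        struct rec_header h;

        ring_copy_out(rb, pos, &h, sizeof(h));
        if (h.size < sizeof(h) || walked + h.size > valid)
            break;                  /* this record is no longer intact */
        types[n++] = h.type;
        pos += h.size;
        walked += h.size;
    }
    return n;
}
```

With 16-byte records, writing six of them into this 64-byte ring leaves the newest four readable, newest first. The tailsize method would instead keep writing forward and store the size at the end of each record, so the reader steps backwards from head.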
what happened with your other idea of moving the whole header to the
end?
That felt better than either of these options.
I'll try it today. However, putting all of the three together is
not as easy as this patchset.
I'm missing something. Why all three in one set?
I can't implement all three in one set, but implementing two of them
together makes benchmarking simpler :)

Here are some numbers.

I attach a target program at the end of this mail. It calls
close(-1) 3000000 times and uses gettimeofday() to measure
how many microseconds that takes.
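
For readers without the attachment, the timed loop could look roughly like the sketch below. This is an assumption about its shape, not the attached program itself; time_close_calls() is a hypothetical name.

```c
#include <assert.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: issue `iterations` failing close(-1) system calls
 * and return the elapsed wall-clock time in microseconds, measured with
 * gettimeofday().
 */
static long long time_close_calls(long iterations)
{
    struct timeval start, end;

    assert(iterations >= 0);

    gettimeofday(&start, NULL);
    for (long i = 0; i < iterations; i++)
        close(-1);      /* fails with EBADF, but still enters the kernel */
    gettimeofday(&end, NULL);

    return (end.tv_sec - start.tv_sec) * 1000000LL
         + (end.tv_usec - start.tv_usec);
}
```

Printing time_close_calls(3000000) from main() would give one sample per run. Each close(-1) triggers a sys_enter and a sys_exit tracepoint, which is why raw_syscalls:* is a convenient event to stress the ring buffer with.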

The following cases are tested:

BASE    : ./a.out
RAWPERF : ./perf record -o /dev/null -e raw_syscalls:* ./a.out
WRTBKWRD: ./perf record -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out
TAILSIZE: ./perf record --no-has-write-backward -o /dev/null \
          -e raw_syscalls:*/overwrite/ ./a.out
RAWOVWRT: ./perf record --no-has-write-backward --no-has-tailsize \
          -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out

With this script:

func() {
for x in `seq 1 100` ; do $1; done | tee data_$2
}

func ./a.out base
func "./perf record -o /dev/null -e raw_syscalls:* ./a.out" rawperf
func "./perf record -o /dev/null -e raw_syscalls:*/overwrite/ ./a.out" wrtbkwrd
func "./perf record -o /dev/null --no-has-write-backward \
-e raw_syscalls:*/overwrite/ ./a.out" tailsize
func "./perf record -o /dev/null --no-has-write-backward --no-has-tailsize \
-e raw_syscalls:*/overwrite/ ./a.out" rawovwrt

Result:

                 MEAN      STDVAR
BASE    :   879870.81    11913.13
RAWPERF :   2603854.7    706658.4
WRTBKWRD: 2313301.220    6727.957
TAILSIZE: 2383051.860    5248.061
RAWOVWRT: 2315273.180    5221.025

One more number: I tested the original perf overwrite ring buffer in
unmodified v4.4 on the same machine:

                          MEAN     STDVAR
RAWOVWRT(original): 2323970.45    5103.39

So I think the backward writing method doesn't add extra overhead to
the fast path.

I will send this patchset again with several bugs fixed. After that
I'll start working on tail-header if it is still required.
interesting.
did I read the numbers correctly that 'write backwards' method
is actually the fastest? even faster than no-overwrite?

Yes. But note the STDVAR: we can't say 'WRTBKWRD' outperforms 'RAWOVWRT'.
However, 'WRTBKWRD' should be at least as fast as 'RAWOVWRT'.

nice. I guess it makes sense that overwrite is faster.

In the no-overwrite case perf itself wakes up many times to collect data;
I guess that is the source of the high stdvar.

I guess then moving the header to the end will have the same
performance in this benchmark, since RAWOVWRT is the same as well.

Yes.

Do you want to test it yourself? The code is ready.

Thank you.