Re: [PATCH v2 4/7] perf, x86: large PEBS interrupt threshold

From: Peter Zijlstra
Date: Tue Jul 15 2014 - 06:41:58 EST


On Tue, Jul 15, 2014 at 04:58:56PM +0800, Yan, Zheng wrote:
> PEBS always had the capability to log samples to its buffers without
> an interrupt. Traditionally perf has not used this but always set the
> PEBS threshold to one.
>
> For frequently occurring events (like cycles, branches, or loads/stores)
> this in turn requires using a relatively high sampling period to avoid
> overloading the system with nothing but PMI processing. This in turn
> increases sampling error.
>
> For the common cases we still need to use the PMI because the PEBS
> hardware has various limitations. The biggest one is that it cannot
> supply a callgraph. It also requires setting a fixed period, as the
> hardware does not support an adaptive period. Another issue is that it
> cannot supply a time stamp and some other options. To supply a TID it
> requires flushing on context switch. It can however supply the IP, the
> load/store address, TSX information, registers, and some other things.
>
> So we can make PEBS work for some specific cases, basically as long as
> you can do without a callgraph and can set the period you can use this
> new PEBS mode.
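
A rough illustration of the rule stated above, not code from this patch: the sketch below encodes only the three constraints named in the description (fixed period, no callgraph, no time stamp). The names SampleRequest and can_batch_pebs are hypothetical, and the "some other options" mentioned above are not modelled.

```python
from dataclasses import dataclass


@dataclass
class SampleRequest:
    """Hypothetical summary of what a sampling session asks for."""
    fixed_period: bool      # True for -c style sampling, False for adaptive
    wants_callgraph: bool   # PEBS cannot supply a callgraph
    wants_timestamp: bool   # PEBS cannot supply a time stamp


def can_batch_pebs(req: SampleRequest) -> bool:
    """Return True if the request fits the limits listed in the description.

    Assumption: only the three limits spelled out in the text are checked;
    the real eligibility test may include further conditions.
    """
    return (req.fixed_period
            and not req.wants_callgraph
            and not req.wants_timestamp)


if __name__ == "__main__":
    print(can_batch_pebs(SampleRequest(True, False, False)))
    print(can_batch_pebs(SampleRequest(True, True, False)))
```

Anything that fails the check would keep the traditional threshold of one, i.e. one PMI per sample.
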
>
> The main benefit is the ability to support much lower sampling periods
> (down to -c 1000) without excessive overhead.
>
> One use case, for example, is increasing the resolution of the c2c tool.
> Another is double checking when you suspect the standard sampling has
> too much sampling error.
>
> Some numbers on the overhead, using cycle soak, comparing
> "perf record --no-time -e cycles:p -c" to "perf record -e cycles:p -c"
>
> period      plain   multi   delta
> 10003       15      5       10
> 20003       15.7    4       11.7
> 40003       8.7     2.5     6.2
> 80003       4.1     1.4     2.7
> 100003      3.6     1.2     2.4
> 800003      4.4     1.4     3
> 1000003     0.6     0.4     0.2
> 2000003     0.4     0.3     0.1
> 4000003     0.3     0.2     0.1
> 10000003    0.3     0.2     0.1
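
For reference, the delta column appears to be simply plain minus multi. A minimal sketch that recomputes it from the figures quoted above; the names ROWS and mismatches are made up for this example, and the tolerance only absorbs floating-point noise.

```python
# (period, plain, multi, delta) copied from the table in the description.
ROWS = [
    (10003, 15.0, 5.0, 10.0),
    (20003, 15.7, 4.0, 11.7),
    (40003, 8.7, 2.5, 6.2),
    (80003, 4.1, 1.4, 2.7),
    (100003, 3.6, 1.2, 2.4),
    (800003, 4.4, 1.4, 3.0),
    (1000003, 0.6, 0.4, 0.2),
    (2000003, 0.4, 0.3, 0.1),
    (4000003, 0.3, 0.2, 0.1),
    (10000003, 0.3, 0.2, 0.1),
]


def mismatches(rows, tol=0.05):
    """Return the periods whose quoted delta differs from plain - multi."""
    bad = []
    for period, plain, multi, delta in rows:
        if abs((plain - multi) - delta) > tol:
            bad.append(period)
    return bad


if __name__ == "__main__":
    for period, plain, multi, _ in ROWS:
        print(f"{period:>9}  {plain - multi:5.1f}")
    print("mismatching rows:", mismatches(ROWS))
```

Every quoted delta matches plain minus multi to one decimal place.
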
>
> The interesting part is the delta between multi-pebs and normal pebs. Above
> -c 1000003 it does not really matter because the basic overhead is so low.
> With periods below 80003 it becomes interesting.
>
> Note that in some other workloads (e.g. kernbench) the smaller sampling
> periods cause much more overhead without multi-pebs; up to 80% (and
> throttling) has been observed with -c 10003. Multi-pebs generally does
> not throttle.
>

And not a single word on the multiplex horror we talked about. That
should be mentioned, in detail.
