Re: [PATCH RFC 0/5] perf: Add ioctl to emit sideband events

From: Adrian Hunter
Date: Tue Apr 18 2023 - 09:36:53 EST


On 18/04/23 09:18, Adrian Hunter wrote:
> On 17/04/23 14:02, Peter Zijlstra wrote:
>> On Fri, Apr 14, 2023 at 11:22:55AM +0300, Adrian Hunter wrote:
>>> Hi
>>>
>>> Here is a stab at adding an ioctl for sideband events.
>>>
>>> This is to overcome races when reading the same information
>>> from /proc.
>>
>> What races? Are you talking about reading old state in /proc the kernel
>> delivering a sideband event for the new state, and then you writing the
>> old state out?
>>
>> Surely that's something perf tool can fix without kernel changes?
>
> Yes, and it was a bit of a brain fart not to realise that.
>
> There may still be corner cases, where different kinds of events are
> interdependent, perhaps NAMESPACES events vs MMAP events could
> have ordering issues.
>
> Putting that aside, the ioctl may be quicker than reading from
> /proc. I could get some numbers and see what people think.
>

Here's a result with a quick hack to use the ioctl but without
handling the buffer becoming full (hence the -m4M)

# ps -e | wc -l
1171
# perf.old stat -- perf.old record -o old.data --namespaces -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.095 MB old.data (100 samples) ]

Performance counter stats for 'perf.old record -o old.data --namespaces -a true':

498.15 msec task-clock # 0.987 CPUs utilized
126 context-switches # 252.935 /sec
64 cpu-migrations # 128.475 /sec
4396 page-faults # 8.825 K/sec
1927096347 cycles # 3.868 GHz
4563059399 instructions # 2.37 insn per cycle
914232559 branches # 1.835 G/sec
6618052 branch-misses # 0.72% of all branches
9633787105 slots # 19.339 G/sec
4394300990 topdown-retiring # 38.8% Retiring
3693815286 topdown-bad-spec # 32.6% Bad Speculation
1692356927 topdown-fe-bound # 14.9% Frontend Bound
1544151518 topdown-be-bound # 13.6% Backend Bound

0.504636742 seconds time elapsed

0.158237000 seconds user
0.340625000 seconds sys

# perf.old stat -- perf.new record -o new.data -m4M --namespaces -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.095 MB new.data (103 samples) ]

Performance counter stats for 'perf.new record -o new.data -m4M --namespaces -a true':

386.61 msec task-clock # 0.988 CPUs utilized
100 context-switches # 258.658 /sec
65 cpu-migrations # 168.128 /sec
4935 page-faults # 12.765 K/sec
1495905137 cycles # 3.869 GHz
3647660473 instructions # 2.44 insn per cycle
735822370 branches # 1.903 G/sec
5765668 branch-misses # 0.78% of all branches
7477722620 slots # 19.342 G/sec
3415835954 topdown-retiring # 39.5% Retiring
2748625759 topdown-bad-spec # 31.8% Bad Speculation
1221594670 topdown-fe-bound # 14.1% Frontend Bound
1256150733 topdown-be-bound # 14.5% Backend Bound

0.391472763 seconds time elapsed

0.141207000 seconds user
0.246277000 seconds sys

# ls -lh old.data
-rw------- 1 root root 1.2M Apr 18 13:19 old.data
# ls -lh new.data
-rw------- 1 root root 1.2M Apr 18 13:19 new.data
#