[PATCH v3 0/2] perf/sdt: Directly record SDT events with 'perf record'

From: Ravi Bangoria
Date: Fri Feb 24 2017 - 02:45:46 EST


All events listed by 'perf list', except SDT events, can be recorded
directly with 'perf record'. The flow is a little different for SDT
events: a probe point for an SDT event must first be created with
'perf probe' before the event can be recorded with 'perf record'.

As suggested by Ingo[1], it's better to simplify this process by
creating the probe point automatically with 'perf record' for SDT
events. Hemant made a similar effort some time ago[2]. This patch
series is based on his work.

Changes in v3:
- Rebased to current acme/perf/core

- v2 failed to build if LIB_ELF support was missing. That is fixed
in this series.

- An event name starting with 'sdt_' is treated as an SDT event. v2
required the SDT event name to be prefixed with '%' for 'perf record'
but not for 'perf probe'. This change was suggested by Brendan/
Masami in one of the review comments on v1.[3]

- There was one more problem with v2. If the internal 'perf probe'
fails to look up an SDT event in the cache because the list is too big
(-E2BIG), 'perf record' starts recording the cycles event instead of
showing a warning/error. This can totally confuse the user. I've
resolved it. For ex:

With v2:
$ perf record -a -e sdt_qemu:*
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.246 MB perf.data (1523 samples) ]

$ perf evlist
cpu-clock

With this patch:
$ perf record -a -e sdt_qemu:*
event syntax error: 'sdt_qemu:*'
\___ Cache lookup failed

- This patch series still allows the user to place a probe manually
with 'perf probe', but now also shows a hint:

$ perf probe sdt_libpthread:mutex_entry
...
Hint: SDT event can be directly recorded with 'perf record'. No need to create probe manually.

- v2 always looked for the SDT event in the probe cache and ignored
the entries in uprobe_events. Hence, it created new probe points for an
event even if they already existed. This was confusing from the user's
point of view.

With v2:
$ perf probe sdt_libpthread:mutex_entry
Added new events:
sdt_libpthread:mutex_entry (on %mutex_entry in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_entry_1 (on %mutex_entry in /usr/lib64/libpthread-2.24.so)

$ cat /sys/kernel/debug/tracing/uprobe_events
p:sdt_libpthread/mutex_entry /usr/lib64/libpthread-2.24.so:0x0000000000009ddb
p:sdt_libpthread/mutex_entry_1 /usr/lib64/libpthread-2.24.so:0x000000000000bcbb

$ perf record -a -e %sdt_libpthread:mutex_entry
Warning : Recording on 2 occurrences of sdt_libpthread:mutex_entry

$ perf evlist
sdt_libpthread:mutex_entry_3
sdt_libpthread:mutex_entry_2

There are two issues:
1st, the new names confuse the user.
2nd (and more important), perf won't allow you to record
'sdt_libpthread:mutex_entry_1' with 'perf record' even though it
exists in uprobe_events, because it won't find an event with that
name in the probe cache.

I've solved these issues. This patch gives first priority to existing
entries in the uprobe_events file. After that, it looks into the probe
cache. For ex,

With this patch:
$ perf probe sdt_libpthread:mutex_entry
Added new events:
sdt_libpthread:mutex_entry (on %mutex_entry in /usr/lib64/libpthread-2.24.so)
sdt_libpthread:mutex_entry_1 (on %mutex_entry in /usr/lib64/libpthread-2.24.so)

$ perf record -a -e sdt_libpthread:mutex_entry_1
Matching event(s) from uprobe_events:
sdt_libpthread:mutex_entry_1 0xbcbb@/usr/lib64/libpthread-2.24.so
Use 'perf probe -d <event>' to delete event(s).

'perf record' also lists the matching entries as 'name addr@filename',
followed by a hint on how to delete them with 'perf probe -d'.
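
To illustrate the 'name addr@filename' form, here is a minimal sketch
that converts one uprobe_events line into it. The helper name
format_uprobe_entry() and the buffer sizes are assumptions made for
this example; this is not code from the series, and it ignores any
probe arguments that may follow the address.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch: convert a line such as
 *   p:GROUP/EVENT /path/to/binary:0x000000000000bcbb
 * into "GROUP:EVENT 0xbcbb@/path/to/binary".
 * Returns 0 on success, -1 if the line does not look like a uprobe entry
 * or the output buffer is too small.
 */
static int format_uprobe_entry(const char *line, char *out, size_t outlen)
{
	char group[64], event[64], path[256];
	unsigned long long addr;
	char *colon, *end;
	int n;

	/* "p:" or "r:", then GROUP/EVENT, whitespace, then PATH:ADDR */
	if (sscanf(line, "%*1[pr]:%63[^/ ]/%63[^ ] %255s",
		   group, event, path) != 3)
		return -1;

	/* The address follows the last ':' of the path token. */
	colon = strrchr(path, ':');
	if (!colon)
		return -1;

	addr = strtoull(colon + 1, &end, 16);
	if (end == colon + 1)
		return -1;	/* no hex digits after ':' */
	*colon = '\0';		/* cut the path at the ':' */

	n = snprintf(out, outlen, "%s:%s 0x%llx@%s", group, event, addr, path);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Called on the second uprobe_events line shown earlier, this would
produce the same text as the 'Matching event(s)' listing.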

If no matching entry is found in uprobe_events with that *name*,
perf finds all events with that name in the probe cache. It then
checks again whether these events exist in uprobe_events, but this
time it uses the *address* and *filename* instead of the event name.
If an entry is found, perf reuses it instead of creating a new one.
Continuing the above example,

With this patch:
$ perf probe -d sdt_libpthread:mutex_entry
Removed event: sdt_libpthread:mutex_entry

$ cat /sys/kernel/debug/tracing/uprobe_events
p:sdt_libpthread/mutex_entry_1 /usr/lib64/libpthread-2.24.so:0x000000000000bcbb

$ perf record -a -e sdt_libpthread:*
Matching event(s) from uprobe_events:
sdt_libpthread:mutex_entry_1 0xbcbb@/usr/lib64/libpthread-2.24.so
Use 'perf probe -d <event>' to delete event(s).

Warning: Recording on 35 occurrences of sdt_libpthread:*

$ perf evlist
sdt_libpthread:mutex_entry
sdt_libpthread:pthread_create
sdt_libpthread:mutex_entry_1
...

Here, perf has reused the entry for the event at address 0xbcbb, but
it has also created a new entry for the event at address 0x9ddb. It
also maintains a list of the entries created for a particular record
session, and uses that list to remove those entries at the end of the
session.
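
The lookup order described above can be sketched roughly as follows.
This is a simplified model over in-memory arrays, not the code in
probe-file.c: struct sdt_entry, plan_record() and the other names are
made up for illustration, only exact event names are handled (no
globs), and the per-session cleanup list is left out.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for one probe point; field names are illustrative. */
struct sdt_entry {
	const char *name;		/* e.g. "sdt_libpthread:mutex_entry" */
	const char *file;		/* binary containing the SDT marker */
	unsigned long long addr;	/* probe address inside that file */
};

enum sdt_action { SDT_REUSE, SDT_CREATE };

/* Step 1: look for a uprobe_events entry with exactly this name. */
static const struct sdt_entry *
find_by_name(const struct sdt_entry *up, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(up[i].name, name) == 0)
			return &up[i];
	return NULL;
}

/* Step 2: look for a uprobe_events entry at the same address and file. */
static const struct sdt_entry *
find_by_location(const struct sdt_entry *up, size_t n,
		 const struct sdt_entry *cached)
{
	for (size_t i = 0; i < n; i++)
		if (up[i].addr == cached->addr &&
		    strcmp(up[i].file, cached->file) == 0)
			return &up[i];
	return NULL;
}

/*
 * Fill 'actions' with one decision per event that would be recorded for
 * 'name' and return how many there are.
 */
static size_t plan_record(const struct sdt_entry *up, size_t n_up,
			  const struct sdt_entry *cache, size_t n_cache,
			  const char *name,
			  enum sdt_action *actions, size_t max_actions)
{
	size_t n = 0;

	/* An existing uprobe_events entry with this name gets priority. */
	if (find_by_name(up, n_up, name)) {
		if (max_actions == 0)
			return 0;
		actions[0] = SDT_REUSE;
		return 1;
	}

	/* Otherwise walk the probe cache and match by address + file. */
	for (size_t i = 0; i < n_cache && n < max_actions; i++) {
		if (strcmp(cache[i].name, name) != 0)
			continue;
		actions[n++] = find_by_location(up, n_up, &cache[i]) ?
			       SDT_REUSE : SDT_CREATE;
	}
	return n;
}
```

With only 'mutex_entry_1' at 0xbcbb present in uprobe_events and two
cache entries named 'sdt_libpthread:mutex_entry' (0x9ddb and 0xbcbb),
this model creates a probe for 0x9ddb and reuses the existing one for
0xbcbb, which mirrors the example above.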

Finally, if the tool somehow fails to clean up events from
uprobe_events at the end of the session, the user has to clean them up
manually with 'perf probe -d'. But perf will give a warning in such a
case. For ex,

$ perf record -a -e sdt_libpthread:mutex_entry
Warning: Recording on 2 occurrences of sdt_libpthread:mutex_entry
/** Fails with segfault **/

$ cat /sys/kernel/debug/tracing/uprobe_events
p:sdt_libpthread/mutex_entry /usr/lib64/libpthread-2.24.so:0x0000000000009ddb
p:sdt_libpthread/mutex_entry_1 /usr/lib64/libpthread-2.24.so:0x000000000000bcbb

The next time the user tries to record, perf will show a warning:

$ perf record -a -e sdt_libpthread:mutex_entry
Matching event(s) from uprobe_events:
sdt_libpthread:mutex_entry 0x9ddb@/usr/lib64/libpthread-2.24.so
Use 'perf probe -d <event>' to delete event(s).

Warning: Found 2 events from probe-cache with name 'sdt_libpthread:mutex_entry'.
Since probe point already exists with this name, recording only 1 event.
Hint: Please use 'perf probe -d sdt_libpthread:mutex_entry*' to allow record on all events.

But there is no such warning for 'sdt_libpthread:mutex_entry_1'.

$ perf record -a -e sdt_libpthread:mutex_entry_1
Matching event(s) from uprobe_events:
sdt_libpthread:mutex_entry_1 0xbcbb@/usr/lib64/libpthread-2.24.so
Use 'perf probe -d <event>' to delete event(s).
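
As a side note on the 'sdt_' naming rule from the v3 changes: the
series adds a helper named is_sdt_event(), but its body is not shown in
this cover letter. The following is only a guess at the kind of prefix
check involved, under a deliberately different, hypothetical name.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical prefix check, not the patch's is_sdt_event(): treat an
 * event string as an SDT event if it starts with "sdt_".
 */
static bool has_sdt_prefix(const char *event)
{
	return event != NULL && strncmp(event, "sdt_", 4) == 0;
}
```

Under this rule 'sdt_qemu:*' is treated as an SDT event, while the
v2-style '%sdt_libpthread:mutex_entry' is not.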


[1] https://lkml.org/lkml/2017/2/7/59
[2] https://lkml.org/lkml/2016/5/3/810
[3] https://lkml.org/lkml/2016/5/2/689


Hemant Kumar (1):
perf/sdt: Directly record SDT events with 'perf record'

Ravi Bangoria (1):
perf/sdt: Introduce util func is_sdt_event()

tools/lib/api/fs/tracing_path.c | 17 +-
tools/perf/builtin-probe.c | 21 ++-
tools/perf/builtin-record.c | 23 +++
tools/perf/perf.h | 2 +
tools/perf/util/parse-events.c | 56 +++++-
tools/perf/util/parse-events.h | 2 +
tools/perf/util/probe-event.c | 53 +++++-
tools/perf/util/probe-event.h | 4 +
tools/perf/util/probe-file.c | 376 ++++++++++++++++++++++++++++++++++++++++
tools/perf/util/probe-file.h | 27 +++
tools/perf/util/util.c | 12 ++
tools/perf/util/util.h | 2 +
12 files changed, 567 insertions(+), 28 deletions(-)

--
2.9.3