Re: [PATCH 3/5] perf record: Tracking side-band events for all CPUs when tracing selected CPUs

From: Yang Jihong
Date: Wed Jul 12 2023 - 10:44:42 EST


Hello,

On 2023/7/11 21:13, Adrian Hunter wrote:
On 4/07/23 10:42, Yang Jihong wrote:
User space tasks can migrate between CPUs, we need to track side-band
events for all CPUs.

The specific scenarios are as follows:

CPU0 CPU1
perf record -C 0 start
taskA starts to be created and executed
-> PERF_RECORD_COMM and PERF_RECORD_MMAP
events only deliver to CPU1
......
|
migrate to CPU0
|
Running on CPU0 <----------/
...

perf record -C 0 stop

Now perf samples the PC of taskA. However, perf does not record the
PERF_RECORD_COMM and PERF_RECORD_COMM events of taskA.
Therefore, the comm and symbols of taskA cannot be parsed.

The sys_perf_event_open invoked is as follows:

# perf --debug verbose=3 record -e cpu-clock -C 1 true
<SNIP>
Opening: cpu-clock
------------------------------------------------------------
perf_event_attr:
type 1
size 136
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|ID|CPU|PERIOD
read_format ID|LOST
disabled 1
inherit 1
freq 1
sample_id_all 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 5
Opening: dummy:HG
------------------------------------------------------------
perf_event_attr:
type 1
size 136
config 0x9
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|ID|CPU|PERIOD
read_format ID|LOST
inherit 1
mmap 1
comm 1
freq 1
task 1
sample_id_all 1
mmap2 1
comm_exec 1
ksymbol 1
bpf_event 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 6
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 7
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 11
sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 12
sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 13
sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 14
<SNIP>

Signed-off-by: Yang Jihong <yangjihong1@xxxxxxxxxx>
---
tools/perf/builtin-record.c | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 8872cd037f2c..69e0d8c75aab 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -908,6 +908,31 @@ static int record__config_off_cpu(struct record *rec)
return off_cpu_prepare(rec->evlist, &rec->opts.target, &rec->opts);
}
+static int record__config_tracking_events(struct record *rec)
+{
+ struct evsel *evsel;
+ struct evlist *evlist = rec->evlist;
+ struct record_opts *opts = &rec->opts;
+
+ /*
+ * User space tasks can migrate between CPUs, so when tracing
+ * selected CPUs, sideband for all CPUs is still needed.
+ */
+ if (opts->target.cpu_list) {

I am not sure if anyone minds doing this by default, but perhaps
we should say something about it on the perf record man page.

Okay, will add comments to the man page.

+ evsel = evlist__findnew_tracking_event(evlist);
+ if (!evsel)
+ return -ENOMEM;
+
+ if (!evsel->core.system_wide) {
+ evsel->core.system_wide = true;
+ evsel__set_sample_bit(evsel, TIME);
+ perf_evlist__propagate_maps(&evlist->core, &evsel->core);
+ }

Perhaps better to export via internel/evsel.h

void perf_evsel__go_system_wide(struct perf_evlist *evlist, struct perf_evsel *evsel)
{
if (!evsel->system_wide) {
evsel->system_wide = true;
if (evlist->needs_map_propagation)
__perf_evlist__propagate_maps(evlist, evsel);
}
}

As suggested in response to patch 2, perhaps deal with system_wide
inside evlist__findnew_tracking_event()

Okay, I'll modify it as above, so maybe we need to export perf_evlist__propagate_maps().

As mentioned in the patch 1, __perf_evlist__propagate_maps is low-level and avoid to export it.
Or can we export perf_evsel__go_system_wide() via through internel/evlist.h?
In this way, we do not need to export perf_evlist__propagate_maps().
If so, would it be more appropriate to call perf_evlist__go_system_wide()?

Thanks,
Yang