Re: perf top -p broken for multithreaded processes since 5.19

From: Jiri Olsa
Date: Mon Sep 05 2022 - 06:42:26 EST


On Sat, Sep 03, 2022 at 10:14:25AM +0300, Adrian Hunter wrote:
> On 2/09/22 22:17, Arnaldo Carvalho de Melo wrote:
> > Em Fri, Sep 02, 2022 at 05:50:22PM +0300, Adrian Hunter escreveu:
> >> On 2/09/22 17:46, Tomáš Trnka wrote:
> >>> Hello,
> >>>
> >>> A bug in perf v5.19 and newer completely breaks monitoring multithreaded
> >>> processes using "perf top -p". The tool fails to start with "Failed to mmap
> >>> with 22 (Invalid argument)". It still seems to work fine on single-threaded
> >>> processes. "perf record" is also unaffected.
> >>
> >> It has been reported here:
> >>
> >> https://bugzilla.kernel.org/show_bug.cgi?id=216441
> >
> > If I do:
> >
> > ⬢[acme@toolbox perf-urgent]$ git log -2
> > commit dfeb0bc60782471c293938e71b1a1117cfac2cb3 (HEAD -> perf/urgent)
> > Author: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> > Date: Fri Sep 2 16:15:39 2022 -0300
> >
> > Revert "libperf evlist: Check nr_mmaps is correct"
> >
> > This reverts commit 4ce47d842d4c16c07b135b8a7975b8f0672bcc0e.
> >
> > Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> >
> > commit 78cd283f6b8ab701cb35eafd5af8140560a88f16
> > Author: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> > Date: Fri Sep 2 16:13:41 2022 -0300
> >
> > Revert "libperf evlist: Allow mixing per-thread and per-cpu mmaps"
> >
> > This reverts commit ae4f8ae16a07896403c90305d4b9be27f657c1fc.
> >
> > Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> > ⬢[acme@toolbox perf-urgent]$
> >
> > It works again, Tomáš can you please try doing this to see if this works
> > for you?
> >
>
> This is the fix I have so far. I would like to test it some more though.
>
> From: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> Date: Sat, 3 Sep 2022 10:05:08 +0300
> Subject: [PATCH] libperf evlist: Fix per-thread mmaps for multi-threaded
> targets
>
> Offending commit did not consider the different set-output rules for
> per-thread mmaps i.e. in the per-thread case set-output is used for
> mmaps of the same thread not the same cpu.
>
> This was not immediately noticed because it only happens with
> multi-threaded targets and we do not have a test for that yet.
>
> Fixes: ae4f8ae16a07 ("libperf evlist: Allow mixing per-thread and per-cpu mmaps")
> Signed-off-by: Adrian Hunter <adrian.hunter@xxxxxxxxx>

SNIP

> static int
> mmap_per_cpu(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
> struct perf_mmap_param *mp)
> @@ -528,6 +571,8 @@ mmap_per_cpu(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
> int nr_mmaps = 0;
> int cpu, thread;
>
> + pr_debug("%s: nr cpu values %d nr threads %d\n", __func__, nr_cpus, nr_threads);
> +
> for (cpu = 0; cpu < nr_cpus; cpu++) {
> int output = -1;
> int output_overwrite = -1;
> @@ -569,6 +614,7 @@ int perf_evlist__mmap_ops(struct perf_evlist *evlist,
> struct perf_evlist_mmap_ops *ops,
> struct perf_mmap_param *mp)
> {
> + const struct perf_cpu_map *cpus = evlist->all_cpus;
> struct perf_evsel *evsel;
>
> if (!ops || !ops->get || !ops->mmap)
> @@ -588,6 +634,9 @@ int perf_evlist__mmap_ops(struct perf_evlist *evlist,
> if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
> return -ENOMEM;
>
> + if (perf_cpu_map__empty(cpus))
> + return mmap_per_thread(evlist, ops, mp);
> +

could we just enable per-cpu mapping in top? I'd think it should be
enabled anyway because we need ring buffers on all cpus for sampling

jirka


---
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index e89208b4ad4b..d6be82315dd8 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1422,6 +1422,7 @@ int cmd_top(int argc, const char **argv)
.freq = 4000, /* 4 KHz */
.target = {
.uses_mmap = true,
+ .default_per_cpu = true,
},
/*
* FIXME: This will lose PERF_RECORD_MMAP and other metadata