Re: [PATCH] perf test: Fix session topology test on heterogeneous systems

From: Ian Rogers
Date: Mon Jan 22 2024 - 19:46:41 EST


On Mon, Jan 22, 2024 at 9:09 AM Ian Rogers <irogers@xxxxxxxxxx> wrote:
>
> Hi James, I think the subject should be something like "perf evlist:
> Fix new_default for >1 core PMU" as the change will apply more widely
> than just the test. The test failure fix can be in the subject. You
> could add a:
>
> Closes: https://lore.kernel.org/lkml/CAP-5=fWVQ-7ijjK3-w1q+k2WYVNHbAcejb-xY0ptbjRw476VKA@xxxxxxxxxxxxxx/
>
> On Mon, Jan 22, 2024 at 7:55 AM James Clark <james.clark@xxxxxxx> wrote:
> >
> > The test currently fails with this message when evlist__new_default()
> > opens more than one event:
> >
> > 32: Session topology :
> > --- start ---
> > templ file: /tmp/perf-test-vv5YzZ
> > Using CPUID 0x00000000410fd070
> > Opening: unknown-hardware:HG
> > ------------------------------------------------------------
> > perf_event_attr:
> > type 0 (PERF_TYPE_HARDWARE)
> > config 0xb00000000
> > disabled 1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 4
> > Opening: unknown-hardware:HG
> > ------------------------------------------------------------
> > perf_event_attr:
> > type 0 (PERF_TYPE_HARDWARE)
> > config 0xa00000000
> > disabled 1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 5
> > non matching sample_type
> > FAILED tests/topology.c:73 can't get session
> > ---- end ----
> > Session topology: FAILED!
> >
> > This is because when re-opening the file and parsing the header, Perf
> > expects that any file that has more than one event has the session ID
> > flag set. Perf record already sets the flag in a similar way when there
> > is more than one event, so add the same logic to evlist__new_default().
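
The rule described in the paragraph above can be modelled roughly as below. This is a minimal sketch under stated assumptions, not perf's actual code: `toy_event`, `TOY_SAMPLE_ID`, `toy_set_ids_if_needed()` and `toy_sample_types_valid()` are hypothetical names invented for illustration. The writer tags every event with an ID flag once there is more than one event, and the reader rejects a multi-event file in which any event lacks it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag and struct: stand-ins, not perf's real definitions. */
#define TOY_SAMPLE_ID (1u << 0)

struct toy_event {
	unsigned int sample_type;
};

/* Writer side: tag every event with an ID only when there is more than one. */
static void toy_set_ids_if_needed(struct toy_event *events, size_t nr)
{
	if (nr > 1) {
		for (size_t i = 0; i < nr; i++)
			events[i].sample_type |= TOY_SAMPLE_ID;
	}
}

/* Reader side: accept a multi-event file only if every event carries an ID. */
static bool toy_sample_types_valid(const struct toy_event *events, size_t nr)
{
	if (nr <= 1)
		return true;

	for (size_t i = 0; i < nr; i++) {
		if (!(events[i].sample_type & TOY_SAMPLE_ID))
			return false;
	}
	return true;
}
```

In this model a single-event list passes the reader check untouched, while a two-event list passes only after the writer-side helper has run, which mirrors the failure and the fix described in the commit message.
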
> >
> > evlist__new_default() is only currently used in tests, so I don't
> > expect this change to have any other side effects.
> >
> > The session topology test has been failing on Arm big.LITTLE platforms
> > since commit 251aa040244a ("perf parse-events: Wildcard most
> > "numeric" events") when evlist__new_default() started opening multiple
> > events for 'cycles'.
> >
> > Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events")
> > Signed-off-by: James Clark <james.clark@xxxxxxx>
> > ---
> > tools/perf/util/evlist.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> > index 95f25e9fb994..56db37fac6f6 100644
> > --- a/tools/perf/util/evlist.c
> > +++ b/tools/perf/util/evlist.c
> > @@ -95,6 +95,7 @@ struct evlist *evlist__new_default(void)
> >  	struct evlist *evlist = evlist__new();
> >  	bool can_profile_kernel;
> >  	int err;
> > +	struct evsel *evsel;
> >
> >  	if (!evlist)
> >  		return NULL;
> > @@ -106,6 +107,10 @@ struct evlist *evlist__new_default(void)
> >  		evlist = NULL;
> >  	}
> >
> > +	if (evlist->core.nr_entries > 1)
> > +		evlist__for_each_entry(evlist, evsel)
> > +			evsel__set_sample_id(evsel, false);
> > +
>
> nit: the if should have curlies; with them we can reduce the scope of
> evsel, as below. It is also nice to name the parameter when passing a
> constant argument [1].
>
> 	if (evlist->core.nr_entries > 1) {
> 		struct evsel *evsel;
>
> 		evlist__for_each_entry(evlist, evsel)
> 			evsel__set_sample_id(evsel, /*can_sample_identifier=*/false);
> 	}
>
> Tested-by: Ian Rogers <irogers@xxxxxxxxxx>
> (also Reviewed-by)
>
> When testing this with Mark's change [2], I see two failures on alderlake:
> ```
> irogers@alderlake:~$ perf test 74 -vv
> Couldn't bump rlimit(MEMLOCK), failures may take place when creating
> BPF maps, etc
> 74: daemon operations :
> --- start ---
> test child forked, pid 553821
> test daemon list
> test daemon reconfig
> test daemon stop
> test daemon signal
> signal 12 sent to session 'test [554082]'
> signal 12 sent to session 'test [554082]'
> FAILED: perf data no generated
> test daemon ping
> test daemon lock
> test child finished with -1
> ---- end ----
> daemon operations: FAILED!
> irogers@alderlake:~$ perf test 76 -vv
> Couldn't bump rlimit(MEMLOCK), failures may take place when creating
> BPF maps, etc
> 76: perf list tests :
> --- start ---
> test child forked, pid 554167
> Json output test
> Expecting ',' delimiter: line 4971 column 2 (char 243497)
> test child finished with -1
> ---- end ----
> perf list tests: FAILED!
> ```
> So I think this patch may be exposing other latent issues. I'll try to
> take a look.

These issues are unrelated to this patch; fixes are in:
https://lore.kernel.org/lkml/20240123000604.1211486-1-irogers@xxxxxxxxxx/

Thanks,
Ian

> Another thought: rather than having a separate evlist validation step,
> we should just assert that the evlist is in good shape whenever it is
> mutated. That would have avoided this bug, as the code would have blown
> up early.
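
A minimal sketch of that idea, assuming a toy list type rather than perf's real `struct evlist`: every name here (`inv_evlist`, `inv_evlist_add()`, `inv_evlist_ok()`, `INV_SAMPLE_ID`, `INV_MAX_EVENTS`) is hypothetical. The mutator restores the invariant itself and then asserts it, so a broken list would be caught at the point of mutation instead of later when a file is re-read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; these are not perf's real definitions. */
#define INV_SAMPLE_ID  (1u << 0)
#define INV_MAX_EVENTS 8

struct inv_evlist {
	unsigned int sample_type[INV_MAX_EVENTS];
	size_t nr;
};

/* Invariant: a list with more than one event has the ID flag on every event. */
static bool inv_evlist_ok(const struct inv_evlist *list)
{
	if (list->nr <= 1)
		return true;

	for (size_t i = 0; i < list->nr; i++) {
		if (!(list->sample_type[i] & INV_SAMPLE_ID))
			return false;
	}
	return true;
}

/* Mutator: add an event, re-establish the invariant, then assert it. */
static void inv_evlist_add(struct inv_evlist *list, unsigned int sample_type)
{
	assert(list->nr < INV_MAX_EVENTS);
	list->sample_type[list->nr++] = sample_type;

	if (list->nr > 1) {
		for (size_t i = 0; i < list->nr; i++)
			list->sample_type[i] |= INV_SAMPLE_ID;
	}

	/* Fails immediately if any code path leaves the list inconsistent. */
	assert(inv_evlist_ok(list));
}
```

The design choice is that callers never need a separate validation pass: any path that adds an event either leaves the list consistent or trips the assertion on the spot.
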
>
> Thanks,
> Ian
>
> [1] https://clang.llvm.org/extra/clang-tidy/checks/bugprone/argument-comment.html
> [2] https://lore.kernel.org/lkml/20240116170348.463479-1-mark.rutland@arm.com/
>
> >  	return evlist;
> >  }
>
>
> >
> > --
> > 2.34.1
> >