Re: [PATCH 2/2] perf tests sigtrap: Skip if running on a kernel with sleepable spinlocks

From: Arnaldo Carvalho de Melo
Date: Wed Nov 29 2023 - 15:42:39 EST


Em Wed, Nov 29, 2023 at 04:57:47PM +0100, Marco Elver escreveu:
> On Wed, 29 Nov 2023 at 16:47, Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
> >
> > From: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> >
> > There are reported issues that need more investigation on the RT
> > kernel front; until they are addressed, skip this test.
> >
> > This test is already skipped for multiple hardware architectures where
> > the tested kernel feature is not supported.
> >
> > Cc: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> > Cc: Clark Williams <williams@xxxxxxxxxx>
> > Cc: Ian Rogers <irogers@xxxxxxxxxx>
> > Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
> > Cc: Juri Lelli <juri.lelli@xxxxxxxxxx>
> > Cc: Marco Elver <elver@xxxxxxxxxx>
> > Cc: Mike Galbraith <efault@xxxxxx>
> > Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > Link: https://lore.kernel.org/all/e368f2c848d77fbc8d259f44e2055fe469c219cf.camel@xxxxxx/
> > Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
>
> Acked-by: Marco Elver <elver@xxxxxxxxxx>
>
> > ---
> > tools/perf/tests/sigtrap.c | 46 ++++++++++++++++++++++++++++++++++++--
> > 1 file changed, 44 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/perf/tests/sigtrap.c b/tools/perf/tests/sigtrap.c
> > index a1bc7c776254ed2f..e6fd934b027a3d0c 100644
> > --- a/tools/perf/tests/sigtrap.c
> > +++ b/tools/perf/tests/sigtrap.c
> > @@ -103,6 +103,34 @@ static bool attr_has_sigtrap(void)
> >
> > return __btf_type__find_member_by_name(id, "sigtrap") != NULL;
> > }
> > +
> > +static bool kernel_with_sleepable_spinlocks(void)
> > +{
> > + const struct btf_member *member;
> > + const struct btf_type *type;
> > + const char *type_name;
> > + int id;
> > +
> > + if (!btf__available())
> > + return false;
> > +
> > + id = btf__find_by_name_kind(btf, "spinlock", BTF_KIND_STRUCT);
> > + if (id < 0)
> > + return false;
> > +
> > + // Only RT has a "lock" member for "struct spinlock"
> > + member = __btf_type__find_member_by_name(id, "lock");
> > + if (member == NULL)
> > + return false;
> > +
> > + // But check its type as well
> > + type = btf__type_by_id(btf, member->type);
> > + if (!type || !btf_is_struct(type))
> > + return false;
> > +
> > + type_name = btf__name_by_offset(btf, type->name_off);
> > + return type_name && !strcmp(type_name, "rt_mutex_base");
> > +}
> > #else /* !HAVE_BPF_SKEL */
> > static bool attr_has_sigtrap(void)
> > {
> > @@ -125,6 +153,11 @@ static bool attr_has_sigtrap(void)
> > return ret;
> > }
> >
> > +static bool kernel_with_sleepable_spinlocks(void)
> > +{
> > + return false;
> > +}
> > +
> > static void btf__exit(void)
> > {
> > }
> > @@ -166,7 +199,7 @@ static int run_test_threads(pthread_t *threads, pthread_barrier_t *barrier)
> >
> > static int run_stress_test(int fd, pthread_t *threads, pthread_barrier_t *barrier)
> > {
> > - int ret;
> > + int ret, expected_sigtraps;
> >
> > ctx.iterate_on = 3000;
> >
> > @@ -175,7 +208,16 @@ static int run_stress_test(int fd, pthread_t *threads, pthread_barrier_t *barrie
> > ret = run_test_threads(threads, barrier);
> > TEST_ASSERT_EQUAL("disable failed", ioctl(fd, PERF_EVENT_IOC_DISABLE, 0), 0);
> >
> > - TEST_ASSERT_EQUAL("unexpected sigtraps", ctx.signal_count, NUM_THREADS * ctx.iterate_on);
> > + expected_sigtraps = NUM_THREADS * ctx.iterate_on;
> > +
> > + if (ctx.signal_count < expected_sigtraps && kernel_with_sleepable_spinlocks()) {
> > + pr_debug("Expected %d sigtraps, got %d, running on a kernel with sleepable spinlocks.\n",
> > + expected_sigtraps, ctx.signal_count);
> > + pr_debug("See https://lore.kernel.org/all/e368f2c848d77fbc8d259f44e2055fe469c219cf.camel@xxxxxx/\n");
>
> No changes from the RT side since? A fix exists, but apparently not
> good enough... Sigh.

Yeah, my impression, and my first attempt at writing that patch, was
that no sigtraps were being sent. But when I tried on a random, more
recent machine in the Red Hat labs, I got some signals: far fewer than
expected, but some. Maybe that is an interesting data point?

I'll try again to reproduce on the local machine, an old i7 Lenovo
notebook, and on the newer machine, a Xeon(R) Silver 4216 with 32 CPUs,
and report here.
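
A rough sketch of how the per-machine results could be classified for
such a report. It tells apart "no sigtraps at all" from "some, but fewer
than expected". The enum and the helper names are hypothetical and not
part of the patch; the expected count follows the NUM_THREADS *
ctx.iterate_on computation in run_stress_test(), with the thread count
passed in as a parameter since its value is not shown in this thread.

```c
#include <stdio.h>

/* Hypothetical outcome labels for one stress-test run (not in the patch). */
enum sigtrap_outcome {
	SIGTRAP_NONE,	/* no signal was delivered at all */
	SIGTRAP_SOME,	/* fewer than expected, but not zero */
	SIGTRAP_ALL,	/* exactly the expected number */
	SIGTRAP_EXTRA,	/* more than expected */
};

/* Mirrors the test: one SIGTRAP per thread per iteration. */
static int expected_sigtraps(int num_threads, int iterate_on)
{
	return num_threads * iterate_on;
}

/* Compare the observed signal count against the expected one. */
static enum sigtrap_outcome classify_sigtraps(int got, int expected)
{
	if (got == 0 && expected > 0)
		return SIGTRAP_NONE;
	if (got < expected)
		return SIGTRAP_SOME;
	if (got == expected)
		return SIGTRAP_ALL;
	return SIGTRAP_EXTRA;
}

/* Print one line per machine, suitable for pasting into a reply. */
static void report_sigtraps(const char *machine, int got, int expected)
{
	static const char * const names[] = { "none", "some", "all", "extra" };

	printf("%s: got %d of %d sigtraps (%s)\n", machine, got, expected,
	       names[classify_sigtraps(got, expected)]);
}
```

Calling report_sigtraps() once per machine, with ctx.signal_count as the
observed value, would show whether a given RT kernel falls in the "none"
or the "some" case.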

- Arnaldo