Re: [PATCH 2/2] perf tests sigtrap: Skip if running on a kernel with sleepable spinlocks

From: Marco Elver
Date: Thu Nov 30 2023 - 08:29:17 EST


On Thu, 30 Nov 2023 at 14:01, Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
>
> Em Wed, Nov 29, 2023 at 05:42:30PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Wed, Nov 29, 2023 at 04:57:47PM +0100, Marco Elver escreveu:
> > > > @@ -175,7 +208,16 @@ static int run_stress_test(int fd, pthread_t *threads, pthread_barrier_t *barrie
> > > > ret = run_test_threads(threads, barrier);
> > > > TEST_ASSERT_EQUAL("disable failed", ioctl(fd, PERF_EVENT_IOC_DISABLE, 0), 0);
>
> > > > - TEST_ASSERT_EQUAL("unexpected sigtraps", ctx.signal_count, NUM_THREADS * ctx.iterate_on);
> > > > + expected_sigtraps = NUM_THREADS * ctx.iterate_on;
>
> > > > + if (ctx.signal_count < expected_sigtraps && kernel_with_sleepable_spinlocks()) {
> > > > + pr_debug("Expected %d sigtraps, got %d, running on a kernel with sleepable spinlocks.\n",
> > > > + expected_sigtraps, ctx.signal_count);
> > > > + pr_debug("See https://lore.kernel.org/all/e368f2c848d77fbc8d259f44e2055fe469c219cf.camel@xxxxxx/\n");
>
> > > No changes from the RT side since? A fix exists, but apparently not
> > > good enough... Sigh.
>
> > Yeah, my impression, and first attempt at writing that patch was that
> > no sigtraps were being sent, but then when I tried with a random, more
> > recent machine in the Red Hat labs, I got some signals, way less than
> > the expected ones, but some, maybe this is an interesting data point?
>
> > I'll try again to reproduce in the local machine, old i7 lenovo notebook
> > and at the newer machine, a Xeon(R) Silver 4216, 32 cpu and report here.
>
> So, on the i7 lenovo:
>
> [root@nine ~]# uname -a
> Linux nine 5.14.0-284.30.1.rt14.315.el9_2.x86_64 #1 SMP PREEMPT_RT Fri Aug 25 10:53:59 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux
[...]
>
> I guess I'll try to get hold of the older kernel with 0 sigtraps to see
> if I get the same behaviour (consistent 0 sigtraps) on that kernel on
> the bigger machine :-\

Thanks for checking.

In any case, it looks like it's still broken. If the fix (bf9ad37dc8a
+ small diff by Mike) from [1] still works, what's blocking it from
being upstreamed?

[1] https://lore.kernel.org/all/e368f2c848d77fbc8d259f44e2055fe469c219cf.camel@xxxxxx/