Re: 'perf test sigtrap' failing on PREEMPT_RT_FULL

From: Arnaldo Carvalho de Melo
Date: Wed Mar 06 2024 - 11:27:28 EST


On Wed, Mar 06, 2024 at 10:06:30AM -0300, Arnaldo Carvalho de Melo wrote:
> > On Thu, 4 Jan 2024 19:35:57 -0300, Arnaldo Carvalho de Melo wrote:
> > > +++ b/kernel/events/core.c
> > > @@ -6801,7 +6801,7 @@ static void perf_pending_task(struct callback_head *head)
> > >  	 * If we 'fail' here, that's OK, it means recursion is already disabled
> > >  	 * and we won't recurse 'further'.
> > >  	 */
> > > -	preempt_disable_notrace();
> > > +	migrate_disable();
> > >  	rctx = perf_swevent_get_recursion_context();
>
> > Pardon my ignorance, is it safe to call preempt_count() with preemption
> > enabled on PREEMPT_RT, or at least in the context being discussed here?
>
> > Because:
>
> > perf_swevent_get_recursion_context()
> >   get_recursion_context()
> >     interrupt_context_level()
> >       preempt_count()
>
> > And:
>
> > int perf_swevent_get_recursion_context(void)
> > {
> > 	struct swevent_htable *swhash = this_cpu_ptr(&swevent_htable);
> >
> > 	return get_recursion_context(swhash->recursion);
> > }
>
> Seems to be enough because perf_pending_task is an irq_work callback and

s/irq_work/task_work/, but that also doesn't reenter, I think

> that is guaranteed not to be reentered?
>
> Artem's tests with a RHEL kernel seem to indicate that, and so do
> mine; I will test with upstream linux-6.8.y-rt.
>
> But there is a lot more happening in perf_sigtrap() and I'm not sure
> that, if the irq_work callback gets preempted, we would not race with
> something else.
>
> Marco, Mike, ideas?
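
For reference, the recursion check being asked about above boils down to
roughly the following (paraphrased from include/linux/preempt.h and
kernel/events/internal.h in recent trees; the comments are mine and the
exact details may differ by kernel version):

static __always_inline unsigned char interrupt_context_level(void)
{
	unsigned long pc = preempt_count();
	unsigned char level = 0;

	/* 0 == task, 1 == softirq, 2 == hardirq, 3 == NMI */
	level += !!(pc & (NMI_MASK));
	level += !!(pc & (NMI_MASK | HARDIRQ_MASK));
	level += !!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET));

	return level;
}

static inline int get_recursion_context(int *recursion)
{
	unsigned char rctx = interrupt_context_level();

	/*
	 * recursion[] is per-CPU (swhash->recursion in the caller), so
	 * the question is not really whether preempt_count() can be
	 * read with preemption enabled (it can, it's just a load), but
	 * whether we can be preempted and/or migrated between the
	 * caller's this_cpu_ptr() and this update.
	 */
	if (recursion[rctx])
		return -1;

	recursion[rctx]++;
	barrier();

	return rctx;
}

So migrate_disable() keeps us on the CPU whose counters we touch, but
unlike preempt_disable_notrace() it still lets another task run on the
same CPU in between, and, if I read it right, two tasks could then trip
over the shared per-CPU recursion counters.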

Looking at:

commit ca6c21327c6af02b7eec31ce4b9a740a18c6c13f
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Thu Oct 6 15:00:39 2022 +0200

    perf: Fix missing SIGTRAPs

    Marco reported:

    Due to the implementation of how SIGTRAP are delivered if
    perf_event_attr::sigtrap is set, we've noticed 3 issues:

      1. Missing SIGTRAP due to a race with event_sched_out() (more
         details below).

      2. Hardware PMU events being disabled due to returning 1 from
         perf_event_overflow(). The only way to re-enable the event is
         for user space to first "properly" disable the event and then
         re-enable it.

      3. The inability to automatically disable an event after a
         specified number of overflows via PERF_EVENT_IOC_REFRESH.

    The worst of the 3 issues is problem (1), which occurs when a
    pending_disable is "consumed" by a racing event_sched_out(), observed
    as follows:

    -------------------------------------------------------------

That is the commit that introduced perf_pending_task(), and I'm now
unsure we can just disable migration, as event_sched_out() seems to
require being called under a raw_spin_lock, and that disables
preemption...
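
To make that concrete, a rough sketch of the constraint (assuming the
current ctx->lock scheme; paraphrased, not literal code from
kernel/events/core.c, and the event_sched_out() signature has changed
across versions):

/*
 * perf_event_context::lock is a raw_spinlock_t. On PREEMPT_RT,
 * spinlock_t becomes a sleeping rtmutex, but raw_spinlock_t keeps
 * the !RT semantics and still disables preemption.
 */
raw_spin_lock(&ctx->lock);	/* implies preempt_disable(), also on RT */
event_sched_out(event, ctx);	/* so this always runs non-preemptible */
raw_spin_unlock(&ctx->lock);

So everything event_sched_out() is serialized against runs with
preemption off, while perf_pending_task() with only migrate_disable()
can be preempted in the middle, which is the window I'm worried about.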

- Arnaldo