Re: [PATCH v2] perf: Fix missing SIGTRAPs

From: Peter Zijlstra
Date: Tue Oct 11 2022 - 09:07:10 EST


On Tue, Oct 11, 2022 at 02:58:36PM +0200, Marco Elver wrote:
> On Tue, Oct 11, 2022 at 09:44AM +0200, Peter Zijlstra wrote:
> > Subject: perf: Fix missing SIGTRAPs
> > From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Date: Thu Oct 6 15:00:39 CEST 2022
> >
> > Marco reported:
> >
> > Due to the implementation of how SIGTRAPs are delivered if
> > perf_event_attr::sigtrap is set, we've noticed 3 issues:
> >
> > 1. Missing SIGTRAP due to a race with event_sched_out() (more
> >    details below).
> >
> > 2. Hardware PMU events being disabled due to returning 1 from
> >    perf_event_overflow(). The only way to re-enable the event is
> >    for user space to first "properly" disable the event and then
> >    re-enable it.
> >
> > 3. The inability to automatically disable an event after a
> >    specified number of overflows via PERF_EVENT_IOC_REFRESH.
> >
> > The worst of the 3 issues is problem (1), which occurs when a
> > pending_disable is "consumed" by a racing event_sched_out(), observed
> > as follows:
> >
> > CPU0                            | CPU1
> > --------------------------------+---------------------------
> > __perf_event_overflow()         |
> >  perf_event_disable_inatomic()  |
> >   pending_disable = CPU0        | ...
> >                                 | _perf_event_enable()
> >                                 |  event_function_call()
> >                                 |   task_function_call()
> >                                 |    /* sends IPI to CPU0 */
> > <IPI>                           | ...
> >  __perf_event_enable()          +---------------------------
> >   ctx_resched()
> >    task_ctx_sched_out()
> >     ctx_sched_out()
> >      group_sched_out()
> >       event_sched_out()
> >        pending_disable = -1
> > </IPI>
> > <IRQ-work>
> >  perf_pending_event()
> >   perf_pending_event_disable()
> >    /* Fails to send SIGTRAP because no pending_disable! */
> > </IRQ-work>
> >
> > In the above case, not only is that particular SIGTRAP missed, but also
> > all future SIGTRAPs because 'event_limit' is not reset back to 1.
> >
> > To fix, rework pending delivery of SIGTRAP via IRQ-work by introduction
> > of a separate 'pending_sigtrap', no longer using 'event_limit' and
> > 'pending_disable' for its delivery.
> >
> > Additionally, and differing from Marco's proposed patch:
> >
> > - recognise that pending_disable effectively duplicates oncpu for
> >   the case where it is set. As such, change the irq_work handler to
> >   use ->oncpu to target the event and use pending_* as boolean toggles.
> >
> > - observe that SIGTRAP targets the ctx->task, so the context switch
> >   optimization that carries contexts between tasks is invalid. If
> >   the irq_work were delayed enough to hit after a context switch the
> >   SIGTRAP would be delivered to the wrong task.
> >
> > - observe that if the event gets scheduled out
> >   (rotation/migration/context-switch/...) the irq-work would be
> >   insufficient to deliver the SIGTRAP when the event gets scheduled
> >   back in (the irq-work might still be pending on the old CPU).
> >
> > Therefore have event_sched_out() convert the pending sigtrap into a
> > task_work which will deliver the signal at return_to_user.
> >
> > Fixes: 97ba62b27867 ("perf: Add support for SIGTRAP on perf events")
> > Reported-by: Marco Elver <elver@xxxxxxxxxx>
> > Debugged-by: Marco Elver <elver@xxxxxxxxxx>
>
> Reviewed-by: Marco Elver <elver@xxxxxxxxxx>
> Tested-by: Marco Elver <elver@xxxxxxxxxx>
>
> .. fuzzing, and lots of concurrent sigtrap_threads with this patch:
>
> https://lore.kernel.org/all/20221011124534.84907-1-elver@xxxxxxxxxx/
>
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
>
> My original patch also attributed Dmitry:
>
> Reported-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> Debugged-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
>
> ... we all melted our brains on this one. :-)
>
> Would be good to get the fix into one of the upcoming 6.1-rc.

Updated and yes, I'm planning on queueing this in perf/urgent the moment
-rc1 happens.

Thanks!
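
For anyone following the thread without the diff at hand, the race in the
quoted changelog can be approximated by a small user-space model. This is
only a sketch and not the kernel code: struct model_event, the helper names
and the deferred_to_task_work flag are all made up for illustration; only
pending_disable, pending_sigtrap and oncpu echo names used in the changelog.
It replays the sequence overflow, optional sched-out, irq-work, return to
user, and counts how many signals would have been delivered.

```c
#include <stdbool.h>

/* Hypothetical model of one event; this is NOT struct perf_event. */
struct model_event {
	int  oncpu;                 /* CPU the event runs on, -1 if scheduled out */
	int  pending_disable;       /* old scheme: CPU number, -1 if nothing pending */
	bool pending_sigtrap;       /* new scheme: plain boolean toggle */
	bool deferred_to_task_work; /* new scheme: deliver at return to user (assumed name) */
	int  sigtraps;              /* number of signals delivered in the model */
};

/* Old scheme: the overflow path records the pending request as a CPU number. */
static void old_overflow(struct model_event *e) { e->pending_disable = e->oncpu; }

/* Old scheme: scheduling out resets pending_disable, consuming the request. */
static void old_sched_out(struct model_event *e)
{
	e->pending_disable = -1;
	e->oncpu = -1;
}

/* Old scheme: the irq-work only signals if the request is still pending. */
static void old_irq_work(struct model_event *e)
{
	if (e->pending_disable != -1) {
		e->pending_disable = -1;
		e->sigtraps++;
	}
}

/* New scheme: the overflow path sets a dedicated flag. */
static void new_overflow(struct model_event *e) { e->pending_sigtrap = true; }

/* New scheme: scheduling out converts a pending sigtrap into deferred work. */
static void new_sched_out(struct model_event *e)
{
	if (e->pending_sigtrap) {
		e->pending_sigtrap = false;
		e->deferred_to_task_work = true;
	}
	e->oncpu = -1;
}

/* New scheme: the irq-work delivers only if the flag is still set. */
static void new_irq_work(struct model_event *e)
{
	if (e->pending_sigtrap) {
		e->pending_sigtrap = false;
		e->sigtraps++;
	}
}

/* New scheme: deferred work runs when the task returns to user space. */
static void new_return_to_user(struct model_event *e)
{
	if (e->deferred_to_task_work) {
		e->deferred_to_task_work = false;
		e->sigtraps++;
	}
}

/* Replay the old scheme; 'race' inserts a sched-out before the irq-work. */
int run_old_scheme(bool race)
{
	struct model_event e = { .oncpu = 0, .pending_disable = -1 };

	old_overflow(&e);
	if (race)
		old_sched_out(&e);
	old_irq_work(&e);
	return e.sigtraps;
}

/* Replay the new scheme with the same ordering. */
int run_new_scheme(bool race)
{
	struct model_event e = { .oncpu = 0, .pending_disable = -1 };

	new_overflow(&e);
	if (race)
		new_sched_out(&e);
	new_irq_work(&e);
	new_return_to_user(&e);
	return e.sigtraps;
}
```

In this model run_old_scheme(true) delivers no signal because the sched-out
consumed the pending state, while run_new_scheme(true) still delivers exactly
one, via the deferred path. Without the race both schemes deliver one.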