[PATCH 3/4] perf: Fix event leak upon exit

From: Frederic Weisbecker
Date: Fri Mar 29 2024 - 19:59:22 EST


When a task is scheduled out, pending sigtrap deliveries are deferred
to the target task upon resume to userspace via task_work.

However failures while adding en event's callback to the task_work
engine are ignored. And since the last call for events exit happen
after task work is eventually closed, there is a small window during
which pending sigtrap can be queued though ignored, leaking the event
refcount addition such as in the following scenario:

TASK A
-----

do_exit()
exit_task_work(tsk);

<IRQ>
perf_event_overflow()
event->pending_sigtrap = pending_id;
irq_work_queue(&event->pending_irq);
</IRQ>
=========> PREEMPTION: TASK A -> TASK B
event_sched_out()
event->pending_sigtrap = 0;
atomic_long_inc_not_zero(&event->refcount)
// FAILS: task work has exited
task_work_add(&event->pending_task)
[...]
<IRQ WORK>
perf_pending_irq()
// early return: event->oncpu = -1
</IRQ WORK>
[...]
=========> TASK B -> TASK A
perf_event_exit_task(tsk)
perf_event_exit_event()
free_event()
WARN(atomic_long_cmpxchg(&event->refcount, 1, 0) != 1)
// leak event due to unexpected refcount == 2

As a result the event is never released while the task exits.

Fix this with appropriate task_work_add()'s error handling.

Fixes: 517e6a301f34 ("perf: Fix perf_pending_task() UaF")
Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
---
kernel/events/core.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 724e6d7e128f..c1632e69c69d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2289,10 +2289,11 @@ event_sched_out(struct perf_event *event, struct perf_event_context *ctx)
event->pending_sigtrap = 0;
if (state != PERF_EVENT_STATE_OFF &&
!event->pending_work) {
- event->pending_work = 1;
- dec = false;
- WARN_ON_ONCE(!atomic_long_inc_not_zero(&event->refcount));
- task_work_add(current, &event->pending_task, TWA_RESUME);
+ if (task_work_add(current, &event->pending_task, TWA_RESUME) >= 0) {
+ WARN_ON_ONCE(!atomic_long_inc_not_zero(&event->refcount));
+ dec = false;
+ event->pending_work = 1;
+ }
}
if (dec)
local_dec(&event->ctx->nr_pending);
--
2.44.0