[PATCH 1/4] perf: Fix use-after-free in perf_release()

From: Peter Zijlstra
Date: Thu Mar 16 2017 - 09:00:40 EST


Dmitry reported syzcaller tripped a use-after-free in perf_release().

After much puzzlement Oleg spotted the below scenario:


Task1 Task2

fork()
perf_event_init_task()
/* ... */
goto bad_fork_$foo;
/* ... */
perf_event_free_task()
mutex_lock(ctx->lock)
perf_free_event(B)

perf_event_release_kernel(A)
mutex_lock(A->child_mutex)
list_for_each_entry(child, ...) {
/* child == B */
ctx = B->ctx;
get_ctx(ctx);
mutex_unlock(A->child_mutex);

mutex_lock(A->child_mutex)
list_del_init(B->child_list)
mutex_unlock(A->child_mutex)

/* ... */

mutex_unlock(ctx->lock);
put_ctx() /* >0 */
free_task();
mutex_lock(ctx->lock);
mutex_lock(A->child_mutex);
/* ... */
mutex_unlock(A->child_mutex);
mutex_unlock(ctx->lock)
put_ctx() /* 0 */
ctx->task && !TOMBSTONE
put_task_struct() /* UAF */


This patch closes the hole by making perf_event_free_task() destroy the
task <-> ctx relation such that perf_event_release_kernel() will no longer
observe the now dead task.


Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Cc: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Fixes: c6e5b73242d2 ("perf: Synchronously clean up child events")
Reported-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
Spotted-by: Oleg Nesterov <oleg@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/20170314155949.GE32474@worktop
---
kernel/events/core.c | 11 +++++++++++
1 file changed, 11 insertions(+)

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10556,6 +10556,17 @@ void perf_event_free_task(struct task_st
continue;

mutex_lock(&ctx->mutex);
+ raw_spin_lock_irq(&ctx->lock);
+ /*
+ * Destroy the task <-> ctx relation and mark the context dead.
+ *
+ * This is important because even though the task hasn't been
+ * exposed yet the context has been (through child_list).
+ */
+ RCU_INIT_POINTER(task->perf_event_ctxp[ctxn], NULL);
+ WRITE_ONCE(ctx->task, TASK_TOMBSTONE);
+ put_task_struct(task); /* cannot be last */
+ raw_spin_unlock_irq(&ctx->lock);
again:
list_for_each_entry_safe(event, tmp, &ctx->pinned_groups,
group_entry)