[RFC 0/3] Fixing i915 PMU use after free after driver unbind

From: Tvrtko Ursulin
Date: Mon Jan 15 2024 - 12:01:44 EST


From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

Hi Peter, all,

This is an early RFC to outline a newly discovered problem in the current
handling of driver unbind with active perf fds.

The sequence is basically this:

1. Open a perf fd
2. Unbind a driver
3. Close the dangling fd

Or a slightly more evil variant:

1. Open a perf fd
2. Unbind a driver
3. Bind the driver again
4. Close the dangling fd

I thought we had this covered by recording the unbound status (pmu->closed in
i915_pmu.c) and making sure the struct i915_pmu (and struct perf_pmu) remain
active until the last event is closed (via internal reference counting). But
what I missed until now are two things:

1)
core.c: _free_event() will dereference event->pmu _after_ event->destroy().

KASAN catches this easily and patches 1 & 2 are the attempt to fix that.

2)
A more evil case where pmu->cpu_pmu_context per-cpu allocation gets re-used
_before_ the old perf fd is closed.

There things can nicely explode on the list_del_init inside event_sched_out on
list_del_init(&event->active_list); (with list debugging turned on of course).

Most easily reproducible by simply re-binding i915, which happens to grab the
same per-cpu block and then the new perf_pmu_register zaps the list_head which
the old event will try to unlink itself from.

This is what the third patch attempts to deal with. It is a bit incomplete
though, as I was unsure what is the best approach to fix and so thought to send
it out early for some guidance.

Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@xxxxxxxxx>
Cc: Aravind Iddamsetty <aravind.iddamsetty@xxxxxxxxxxxxxxx>

Tvrtko Ursulin (3):
perf: Add new late event free callback
drm/i915/pmu: Move i915 reference drop to new event->free()
perf: Reference count struct perf_cpu_pmu_context to fix driver unbind

drivers/gpu/drm/i915/i915_pmu.c | 4 ++--
include/linux/perf_event.h | 2 ++
kernel/events/core.c | 34 ++++++++++++++++++++++++---------
3 files changed, 29 insertions(+), 11 deletions(-)

--
2.40.1