[PATCH] tracing: fix UAF caused by memory ordering issue

From: Kairui Song
Date: Sun Nov 12 2023 - 10:03:49 EST


From: Kairui Song <kasong@xxxxxxxxxxx>

The following kernel panic was observed while running an ftrace stress test:

Unable to handle kernel paging request at virtual address 9699b0f8ece28240
Mem abort info:
ESR = 0x0000000096000004
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
Data abort info:
ISV = 0, ISS = 0x00000004
CM = 0, WnR = 0
[9699b0f8ece28240] address between user and kernel address ranges
Internal error: Oops: 0000000096000004 [#1] SMP
Modules linked in: rpcrdma rdma_cm iw_cm ib_cm ib_core rfkill vfat fat loop fuse nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 sr_mod cdrom crct10dif_ce ghash_ce sha2_ce virtio_gpu virtio_dma_buf drm_shmem_helper virtio_blk drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops virtio_console sha256_arm64 sha1_ce drm virtio_scsi i2c_core virtio_net net_failover failover virtio_mmio dm_multipath dm_mod autofs4 [last unloaded: ipmi_msghandler]
CPU: 0 PID: 499719 Comm: sh Kdump: loaded Not tainted 6.1.61+ #2
Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __kmem_cache_alloc_node+0x1dc/0x2e4
lr : __kmem_cache_alloc_node+0xac/0x2e4
sp : ffff80000ad23aa0
x29: ffff80000ad23ab0 x28: 00000004052b8000 x27: ffffc513863b0000
x26: 0000000000000040 x25: ffffc51384f21ca4 x24: 00000000ffffffff
x23: d615521430b1b1a5 x22: ffffc51386044770 x21: 0000000000000000
x20: 0000000000000cc0 x19: ffff0000c0001200 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 0000aaaae65e1630
x14: 0000000000000004 x13: ffffc513863e67a0 x12: ffffc513863af6d8
x11: 0000000000000001 x10: ffff80000ad23aa0 x9 : ffffc51385058078
x8 : 0000000000000018 x7 : 0000000000000001 x6 : 0000000000000010
x5 : ffff0000c09c2280 x4 : ffffc51384f21ca4 x3 : 0000000000000040
x2 : 9699b0f8ece28240 x1 : ffff0000c09c2280 x0 : 9699b0f8ece28200
Call trace:
__kmem_cache_alloc_node+0x1dc/0x2e4
__kmalloc+0x6c/0x1c0
func_add+0x1a4/0x200
tracepoint_add_func+0x70/0x230
tracepoint_probe_register+0x6c/0xb4
trace_event_reg+0x8c/0xa0
__ftrace_event_enable_disable+0x17c/0x440
__ftrace_set_clr_event_nolock+0xe0/0x150
system_enable_write+0xe0/0x114
vfs_write+0xd0/0x2dc
ksys_write+0x78/0x110
__arm64_sys_write+0x24/0x30
invoke_syscall.constprop.0+0x58/0xf0
el0_svc_common.constprop.0+0x54/0x160
do_el0_svc+0x2c/0x60
el0_svc+0x40/0x1ac
el0t_64_sync_handler+0xf4/0x120
el0t_64_sync+0x19c/0x1a0
Code: b9402a63 f9405e77 8b030002 d5384101 (f8636803)

The panic was caused by a corrupted slab freelist pointer. Further
debugging showed that the root cause is a use-after-free (UAF) of a
slab-allocated object in ftrace, introduced by commit eecb91b9f98d
("tracing: Fix memleak due to race between current_tracer and trace").
So far it is only reproducible on some ARM64 machines. The UAF and free
stacks are:

UAF:
kasan_report+0xa8/0x1bc
__asan_report_load8_noabort+0x28/0x3c
print_graph_function_flags+0x524/0x5a0
print_graph_function_event+0x28/0x40
print_trace_line+0x5c4/0x1030
s_show+0xf0/0x460
seq_read_iter+0x930/0xf5c
seq_read+0x130/0x1d0
vfs_read+0x288/0x840
ksys_read+0x130/0x270
__arm64_sys_read+0x78/0xac
invoke_syscall.constprop.0+0x90/0x224
do_el0_svc+0x118/0x3dc
el0_svc+0x54/0x120
el0t_64_sync_handler+0xf4/0x120
el0t_64_sync+0x19c/0x1a0

Freed by:
kasan_save_free_info+0x38/0x5c
__kasan_slab_free+0xe8/0x154
slab_free_freelist_hook+0xfc/0x1e0
__kmem_cache_free+0x138/0x260
kfree+0xd0/0x1d0
graph_trace_close+0x60/0x90
s_start+0x610/0x910
seq_read_iter+0x274/0xf5c
seq_read+0x130/0x1d0
vfs_read+0x288/0x840
ksys_read+0x130/0x270
__arm64_sys_read+0x78/0xac
invoke_syscall.constprop.0+0x90/0x224
do_el0_svc+0x118/0x3dc
el0_svc+0x54/0x120
el0t_64_sync_handler+0xf4/0x120
el0t_64_sync+0x19c/0x1a0

Although s_start and s_show are serialized by the seq_file mutex,
the tracer struct copy in s_start introduced by the commit mentioned
above is neither atomic nor guaranteed to be seen by all CPUs. So the
following scenario is possible (and actually happened):

CPU 1                                     CPU 2
seq_read_iter                             seq_read_iter
  mutex_lock(&m->lock);
  s_start
    // iter->trace is graph_trace
    iter->trace->close(iter);
      graph_trace_close
        kfree(data) <- *** data released here ***
    // copy current_trace to iter->trace
    // but not synced to CPU 2
    *iter->trace = *tr->current_trace
  ... (goes on)
  mutex_unlock(&m->lock);
                                          mutex_lock(&m->lock);
                                          ... (s_start and other work)
                                          s_show
                                            print_trace_line(iter)
                                              // iter->trace is still
                                              // old value (graph_trace)
                                              iter->trace->print_line()
                                                print_graph_function_flags
                                                  data->cpu_data <- *** data UAF ***

The UAF corrupted the slab freelist and caused a panic on a later slab
allocation.

After adding the write barrier, the problem no longer reproduces.

Fixes: eecb91b9f98d ("tracing: Fix memleak due to race between current_tracer and trace")
Signed-off-by: Kairui Song <kasong@xxxxxxxxxxx>
---
kernel/trace/trace.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 9aebf904ff97..c377cdf3701b 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -4182,11 +4182,14 @@ static void *s_start(struct seq_file *m, loff_t *pos)
 	int cpu;
 
 	mutex_lock(&trace_types_lock);
-	if (unlikely(tr->current_trace != iter->trace)) {
-		/* Close iter->trace before switching to the new current tracer */
-		if (iter->trace->close)
-			iter->trace->close(iter);
-		iter->trace = tr->current_trace;
+	if (unlikely(tr->current_trace && iter->trace->name != tr->current_trace->name)) {
+		/* Switch to the new current tracer then close old tracer */
+		struct tracer *prev_trace = iter->trace;
+		*iter->trace = *tr->current_trace;
+		/* Make sure the switch is seen by all CPUs before closing */
+		smp_wmb();
+		if (prev_trace->close)
+			prev_trace->close(iter);
 		/* Reopen the new current tracer */
 		if (iter->trace->open)
 			iter->trace->open(iter);
--
2.42.0