Re: raw_tp+cookie is buggy. Was: [syzbot] [bpf?] [trace?] KASAN: slab-use-after-free Read in bpf_trace_run1

From: Andrii Nakryiko
Date: Mon Mar 25 2024 - 19:59:35 EST


On Mon, Mar 25, 2024 at 10:27 AM Andrii Nakryiko
<andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Sun, Mar 24, 2024 at 5:07 PM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > Hi Andrii,
> >
> > syzbot found UAF in raw_tp cookie series in bpf-next.
> > Reverting the whole merge
> > 2e244a72cd48 ("Merge branch 'bpf-raw-tracepoint-support-for-bpf-cookie'")
> >
> > fixes the issue.
> >
> > Pls take a look.
> > See C reproducer below. It splats consistently with CONFIG_KASAN=y
> >
> > Thanks.
>
> Will do, traveling today, so will be offline for a bit, but will check
> first thing afterwards.
>

Ok, so I don't think it's bpf_raw_tp_link specific; it should affect a
bunch of other link types too (unless I missed something). Basically,
when the last link refcnt drops, we detach, do bpf_prog_put(), and then
kfree the link itself synchronously. But that link can still be
referenced by a running BPF program (I think multi-kprobe/multi-uprobe
use it for cookies, raw_tp with my changes started using the link at
runtime, and there are probably more types), so if we free this memory
synchronously, we can get a UAF.
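
To make the ordering concrete, here is a rough userspace toy of the
sequence described above. It is not kernel code: every toy_* name is
made up for illustration, and a "freed" flag stands in for KASAN
poisoning so the sketch never actually touches freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; all names here are hypothetical, not kernel APIs. */
struct toy_link {
	uint64_t cookie;
	bool freed;	/* stand-in for KASAN poisoning; nothing is really freed */
};

/* Models the release path: detach, prog put, then an immediate kfree. */
static void toy_link_release_sync(struct toy_link *link)
{
	assert(link);
	link->freed = true;
}

/*
 * Models the program reading its cookie through the link at run time.
 * Returns false at the point where the real kernel would hit the UAF.
 */
static bool toy_prog_read_cookie(const struct toy_link *link, uint64_t *out)
{
	assert(link && out);
	if (link->freed)
		return false;
	*out = link->cookie;
	return true;
}

/* Returns true if the in-flight "program" touched an already freed link. */
static bool toy_sync_free_hits_uaf(void)
{
	struct toy_link link = { .cookie = 42, .freed = false };
	struct toy_link *seen_by_prog = &link;	/* program is already running */
	uint64_t cookie = 0;

	toy_link_release_sync(&link);	/* last ref dropped, freed right away */
	return !toy_prog_read_cookie(seen_by_prog, &cookie);
}
```

In this toy, toy_sync_free_hits_uaf() reports the bad access because
nothing makes the release path wait for the program that still holds
the pointer.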

We should do what we do for bpf_maps and delay freeing; the only
question is how tunable that freeing should be. Always do call_rcu()?
Always call_rcu_tasks_trace() (relevant for sleepable multi-uprobes)?
Should we allow synchronous free if the link is not directly accessible
from the program during its run?
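
The deferred-free idea can be sketched in userspace roughly like this.
This is a toy under stated assumptions, not the RFC patch: all toy_*
names are hypothetical, a plain reader counter stands in for RCU
read-side sections, and queueing the link stands in for call_rcu().
Unlike real RCU, which only waits for readers that started before the
callback was queued, this toy simply waits until no reader is active.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy model only; all names here are hypothetical, not kernel APIs. */
struct toy_link {
	uint64_t cookie;
	struct toy_link *next_deferred;
};

static int active_readers;		/* stand-in for read-side sections */
static struct toy_link *deferred;	/* links waiting for a "grace period" */
static int freed_count;

/* Free everything that was queued; only called with no active readers. */
static void toy_drain(void)
{
	while (deferred) {
		struct toy_link *l = deferred;

		deferred = l->next_deferred;
		free(l);
		freed_count++;
	}
}

static void toy_read_lock(void)
{
	active_readers++;
}

static void toy_read_unlock(void)
{
	assert(active_readers > 0);
	if (--active_readers == 0)
		toy_drain();	/* "grace period" over, safe to free */
}

/* Stand-in for call_rcu(): queue now, free once no reader can see it. */
static void toy_link_free_deferred(struct toy_link *l)
{
	l->next_deferred = deferred;
	deferred = l;
	if (active_readers == 0)
		toy_drain();
}

/* Returns the cookie the "program" reads after the last ref is dropped. */
static uint64_t toy_demo(void)
{
	struct toy_link *l = calloc(1, sizeof(*l));
	uint64_t seen;

	if (!l)
		return 0;
	l->cookie = 42;

	toy_read_lock();		/* program starts running */
	toy_link_free_deferred(l);	/* last ref dropped mid-run */
	seen = l->cookie;		/* still valid: the free was deferred */
	toy_read_unlock();		/* link is actually freed here */
	return seen;
}
```

Here toy_demo() can still read the cookie even though the release
happened mid-run, and the link is freed only once the reader is done.
The open questions above (which RCU flavor, and whether some link types
can keep the synchronous path) are not modeled.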

Anyway, I sent a fix as an RFC so we can discuss.

> >
> > On Sun, Mar 24, 2024 at 4:28 PM syzbot
> > <syzbot+981935d9485a560bfbcb@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > Hello,
> > >
> > > syzbot found the following issue on:
> > >
> > > HEAD commit: 520fad2e3206 selftests/bpf: scale benchmark counting by us..
> > > git tree: bpf-next
> > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=105af946180000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=6fb1be60a193d440
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=981935d9485a560bfbcb
> > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=114f17a5180000
> > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=162bb7a5180000
> > >
> > > Downloadable assets:
> > > disk image: https://storage.googleapis.com/syzbot-assets/4eef3506c5ce/disk-520fad2e.raw.xz
> > > vmlinux: https://storage.googleapis.com/syzbot-assets/24d60ebe76cc/vmlinux-520fad2e.xz
> > > kernel image: https://storage.googleapis.com/syzbot-assets/8f883e706550/bzImage-520fad2e.xz
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+981935d9485a560bfbcb@xxxxxxxxxxxxxxxxxxxxxxxxx
> > >
> > > ==================================================================
> > > BUG: KASAN: slab-use-after-free in __bpf_trace_run kernel/trace/bpf_trace.c:2376 [inline]
> > > BUG: KASAN: slab-use-after-free in bpf_trace_run1+0xcb/0x510 kernel/trace/bpf_trace.c:2430
> > > Read of size 8 at addr ffff8880290d9918 by task migration/0/19
> > >
> > > CPU: 0 PID: 19 Comm: migration/0 Not tainted 6.8.0-syzkaller-05233-g520fad2e3206 #0
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
> > > Stopper: 0x0 <- 0x0
> > > Call Trace:
> > > <TASK>
> > > __dump_stack lib/dump_stack.c:88 [inline]
> > > dump_stack_lvl+0x1e7/0x2e0 lib/dump_stack.c:106
> > > print_address_description mm/kasan/report.c:377 [inline]
> > > print_report+0x169/0x550 mm/kasan/report.c:488
> > > kasan_report+0x143/0x180 mm/kasan/report.c:601
> > > __bpf_trace_run kernel/trace/bpf_trace.c:2376 [inline]
> > > bpf_trace_run1+0xcb/0x510 kernel/trace/bpf_trace.c:2430
> > > __traceiter_rcu_utilization+0x74/0xb0 include/trace/events/rcu.h:27
> > > trace_rcu_utilization+0x194/0x1c0 include/trace/events/rcu.h:27
> > > rcu_note_context_switch+0xc7c/0xff0 kernel/rcu/tree_plugin.h:360
> > > __schedule+0x345/0x4a20 kernel/sched/core.c:6635
> > > __schedule_loop kernel/sched/core.c:6813 [inline]
> > > schedule+0x14b/0x320 kernel/sched/core.c:6828
> > > smpboot_thread_fn+0x61e/0xa30 kernel/smpboot.c:160
> > > kthread+0x2f0/0x390 kernel/kthread.c:388
> > > ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
> > > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
> > > </TASK>
> > >
> > > Allocated by task 5075:
> > > kasan_save_stack mm/kasan/common.c:47 [inline]
> > > kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
> > > poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
> > > __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387
> > > kasan_kmalloc include/linux/kasan.h:211 [inline]
> > > kmalloc_trace+0x1d9/0x360 mm/slub.c:4012
> > > kmalloc include/linux/slab.h:590 [inline]
> > > kzalloc include/linux/slab.h:711 [inline]
> > > bpf_raw_tp_link_attach+0x2a0/0x6e0 kernel/bpf/syscall.c:3816
> > > bpf_raw_tracepoint_open+0x1c2/0x240 kernel/bpf/syscall.c:3863
> > > __sys_bpf+0x3c0/0x810 kernel/bpf/syscall.c:5673
> > > __do_sys_bpf kernel/bpf/syscall.c:5738 [inline]
> > > __se_sys_bpf kernel/bpf/syscall.c:5736 [inline]
> > > __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5736
> > > do_syscall_64+0xfb/0x240
> > > entry_SYSCALL_64_after_hwframe+0x6d/0x75
> > >
> > > Freed by task 5075:
> > > kasan_save_stack mm/kasan/common.c:47 [inline]
> > > kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
> > > kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:589
> > > poison_slab_object+0xa6/0xe0 mm/kasan/common.c:240
> > > __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
> > > kasan_slab_free include/linux/kasan.h:184 [inline]
> > > slab_free_hook mm/slub.c:2121 [inline]
> > > slab_free mm/slub.c:4299 [inline]
> > > kfree+0x14a/0x380 mm/slub.c:4409
> > > bpf_link_release+0x3b/0x50 kernel/bpf/syscall.c:3071
> > > __fput+0x429/0x8a0 fs/file_table.c:423
> > > task_work_run+0x24f/0x310 kernel/task_work.c:180
> > > exit_task_work include/linux/task_work.h:38 [inline]
> > > do_exit+0xa1b/0x27e0 kernel/exit.c:878
> > > do_group_exit+0x207/0x2c0 kernel/exit.c:1027
> > > __do_sys_exit_group kernel/exit.c:1038 [inline]
> > > __se_sys_exit_group kernel/exit.c:1036 [inline]
> > > __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1036
> > > do_syscall_64+0xfb/0x240
> > > entry_SYSCALL_64_after_hwframe+0x6d/0x75
> > >
> > > The buggy address belongs to the object at ffff8880290d9900
> > > which belongs to the cache kmalloc-128 of size 128
> > > The buggy address is located 24 bytes inside of
> > > freed 128-byte region [ffff8880290d9900, ffff8880290d9980)
> > >
> > > The buggy address belongs to the physical page:
> > > page:ffffea0000a43640 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x290d9
> > > anon flags: 0xfff00000000800(slab|node=0|zone=1|lastcpupid=0x7ff)
> > > page_type: 0xffffffff()
> > > raw: 00fff00000000800 ffff888014c418c0 0000000000000000 0000000000000001
> > > raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
> > > page dumped because: kasan: bad access detected
> > > page_owner tracks the page as allocated
> > > page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY), pid 4527, tgid 4527 (udevd), ts 43150902736, free_ts 43094996342
> > > set_page_owner include/linux/page_owner.h:31 [inline]
> > > post_alloc_hook+0x1ea/0x210 mm/page_alloc.c:1533
> > > prep_new_page mm/page_alloc.c:1540 [inline]
> > > get_page_from_freelist+0x33ea/0x3580 mm/page_alloc.c:3311
> > > __alloc_pages+0x256/0x680 mm/page_alloc.c:4569
> > > __alloc_pages_node include/linux/gfp.h:238 [inline]
> > > alloc_pages_node include/linux/gfp.h:261 [inline]
> > > alloc_slab_page+0x5f/0x160 mm/slub.c:2190
> > > allocate_slab mm/slub.c:2354 [inline]
> > > new_slab+0x84/0x2f0 mm/slub.c:2407
> > > ___slab_alloc+0xd1b/0x13e0 mm/slub.c:3540
> > > __slab_alloc mm/slub.c:3625 [inline]
> > > __slab_alloc_node mm/slub.c:3678 [inline]
> > > slab_alloc_node mm/slub.c:3850 [inline]
> > > kmalloc_trace+0x267/0x360 mm/slub.c:4007
> > > kmalloc include/linux/slab.h:590 [inline]
> > > kzalloc include/linux/slab.h:711 [inline]
> > > kernfs_get_open_node fs/kernfs/file.c:523 [inline]
> > > kernfs_fop_open+0x803/0xcd0 fs/kernfs/file.c:691
> > > do_dentry_open+0x907/0x15a0 fs/open.c:956
> > > do_open fs/namei.c:3643 [inline]
> > > path_openat+0x2860/0x3240 fs/namei.c:3800
> > > do_filp_open+0x235/0x490 fs/namei.c:3827
> > > do_sys_openat2+0x13e/0x1d0 fs/open.c:1407
> > > do_sys_open fs/open.c:1422 [inline]
> > > __do_sys_openat fs/open.c:1438 [inline]
> > > __se_sys_openat fs/open.c:1433 [inline]
> > > __x64_sys_openat+0x247/0x2a0 fs/open.c:1433
> > > do_syscall_64+0xfb/0x240
> > > entry_SYSCALL_64_after_hwframe+0x6d/0x75
> > > page last free pid 4526 tgid 4526 stack trace:
> > > reset_page_owner include/linux/page_owner.h:24 [inline]
> > > free_pages_prepare mm/page_alloc.c:1140 [inline]
> > > free_unref_page_prepare+0x968/0xa90 mm/page_alloc.c:2346
> > > free_unref_page+0x37/0x3f0 mm/page_alloc.c:2486
> > > rcu_do_batch kernel/rcu/tree.c:2196 [inline]
> > > rcu_core+0xafd/0x1830 kernel/rcu/tree.c:2471
> > > __do_softirq+0x2bc/0x943 kernel/softirq.c:554
> > >
> > > Memory state around the buggy address:
> > > ffff8880290d9800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> > > ffff8880290d9880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > > >ffff8880290d9900: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> > > ^
> > > ffff8880290d9980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > > ffff8880290d9a00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> > > ==================================================================
> > >
> > >
> > > ---
> > > This report is generated by a bot. It may contain errors.
> > > See https://goo.gl/tpsmEJ for more information about syzbot.
> > > syzbot engineers can be reached at syzkaller@xxxxxxxxxxxxxxxx.
> > >
> > > syzbot will keep track of this issue. See:
> > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> > >
> > > If the report is already addressed, let syzbot know by replying with:
> > > #syz fix: exact-commit-title
> > >
> > > If you want syzbot to run the reproducer, reply with:
> > > #syz test: git://repo/address.git branch-or-commit-hash
> > > If you attach or paste a git patch, syzbot will apply it before testing.
> > >
> > > If you want to overwrite report's subsystems, reply with:
> > > #syz set subsystems: new-subsystem
> > > (See the list of subsystem names on the web dashboard)
> > >
> > > If the report is a duplicate of another one, reply with:
> > > #syz dup: exact-subject-of-another-report
> > >
> > > If you want to undo deduplication, reply with:
> > > #syz undup