Re: possible deadlock in perf_trace_destroy (2)

From: Dmitry Vyukov
Date: Tue Nov 07 2017 - 06:42:07 EST


On Tue, Nov 7, 2017 at 12:39 PM, syzbot
<bot+c9695f0404c72c8f0b7abbac660d986af9008fc5@xxxxxxxxxxxxxxxxxxxxxxxxx>
wrote:
> Hello,
>
> syzkaller hit the following crash on
> 5a3517e009e979f21977d362212b7729c5165d92
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
>
> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> for information about syzkaller reproducers
>
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 4.14.0-rc7-next-20171103+ #10 Not tainted
> ------------------------------------------------------
> syz-executor1/3061 is trying to acquire lock:
> (event_mutex){+.+.}, at: [<ffffffff81776de8>] perf_trace_destroy+0x28/0x100
> kernel/trace/trace_event_perf.c:234
>
> but task is already holding lock:
> (&mm->mmap_sem){++++}, at: [<ffffffff819448b8>] vm_mmap_pgoff+0x198/0x280
> mm/util.c:331
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #7 (&mm->mmap_sem){++++}:
> lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
> __might_fault+0x13a/0x1d0 mm/memory.c:4530
> _copy_to_user+0x2c/0xc0 lib/usercopy.c:25
> copy_to_user include/linux/uaccess.h:155 [inline]
> filldir+0x1a7/0x320 fs/readdir.c:196
> dir_emit_dot include/linux/fs.h:3343 [inline]
> dir_emit_dots include/linux/fs.h:3354 [inline]
> dcache_readdir+0x12d/0x5e0 fs/libfs.c:192
> iterate_dir+0x1ca/0x540 fs/readdir.c:51
> SYSC_getdents fs/readdir.c:231 [inline]
> SyS_getdents+0x225/0x450 fs/readdir.c:212
> entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> -> #6 (&sb->s_type->i_mutex_key#5){++++}:
> rq_lock kernel/sched/sched.h:1747 [inline]
> ttwu_queue kernel/sched/core.c:1860 [inline]
> try_to_wake_up+0xa53/0x1850 kernel/sched/core.c:2075
>
> -> #5 ((completion)&req.done){+.+.}:
> lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
> complete_acquire include/linux/completion.h:40 [inline]
> __wait_for_common kernel/sched/completion.c:109 [inline]
> wait_for_common kernel/sched/completion.c:123 [inline]
> wait_for_completion+0xcb/0x7b0 kernel/sched/completion.c:144
> devtmpfs_create_node+0x32b/0x4a0 drivers/base/devtmpfs.c:115
> device_add+0x120f/0x1640 drivers/base/core.c:1824
> device_create_groups_vargs+0x1f3/0x250 drivers/base/core.c:2430
> device_create_vargs drivers/base/core.c:2470 [inline]
> device_create+0xda/0x110 drivers/base/core.c:2506
> msr_device_create+0x26/0x40 arch/x86/kernel/msr.c:188
> cpuhp_invoke_callback+0x2ea/0x1d20 kernel/cpu.c:182
> cpuhp_thread_fun+0x48b/0x7e0 kernel/cpu.c:571
> smpboot_thread_fn+0x450/0x7c0 kernel/smpboot.c:164
> kthread+0x3c9/0x4b0 kernel/kthread.c:242
> ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:432
>
> -> #4 (cpuhp_state-up){+.+.}:
> lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
> cpuhp_lock_acquire kernel/cpu.c:85 [inline]
> cpuhp_invoke_ap_callback kernel/cpu.c:605 [inline]
> cpuhp_issue_call+0x1e5/0x520 kernel/cpu.c:1495
> __cpuhp_setup_state_cpuslocked+0x2c7/0x5f0 kernel/cpu.c:1642
> __cpuhp_setup_state+0xb0/0x140 kernel/cpu.c:1671
> cpuhp_setup_state include/linux/cpuhotplug.h:200 [inline]
> page_writeback_init+0x4d/0x71 mm/page-writeback.c:2083
> pagecache_init+0x48/0x4f mm/filemap.c:977
> start_kernel+0x6bc/0x74f init/main.c:690
> x86_64_start_reservations+0x2a/0x2c arch/x86/kernel/head64.c:378
> x86_64_start_kernel+0x77/0x7a arch/x86/kernel/head64.c:359
> secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:240
>
> -> #3 (cpuhp_state_mutex){+.+.}:
> lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
> __mutex_lock_common kernel/locking/mutex.c:756 [inline]
> __mutex_lock+0x16f/0x19d0 kernel/locking/mutex.c:893
> mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
> __cpuhp_setup_state_cpuslocked+0x5b/0x5f0 kernel/cpu.c:1617
> __cpuhp_setup_state+0xb0/0x140 kernel/cpu.c:1671
> cpuhp_setup_state_nocalls include/linux/cpuhotplug.h:228 [inline]
> kvm_guest_init+0x1f3/0x20f arch/x86/kernel/kvm.c:495
> setup_arch+0x17cb/0x19e5 arch/x86/kernel/setup.c:1264
> start_kernel+0xa5/0x74f init/main.c:530
> x86_64_start_reservations+0x2a/0x2c arch/x86/kernel/head64.c:378
> x86_64_start_kernel+0x77/0x7a arch/x86/kernel/head64.c:359
> secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:240
>
> -> #2 (cpu_hotplug_lock.rw_sem){++++}:
> lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
> percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:36
> [inline]
> percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
> cpus_read_lock+0x42/0x90 kernel/cpu.c:293
> static_key_slow_inc+0x9d/0x3c0 kernel/jump_label.c:123
> tracepoint_add_func kernel/tracepoint.c:223 [inline]
> tracepoint_probe_register_prio+0x80d/0x9a0 kernel/tracepoint.c:283
> tracepoint_probe_register+0x2a/0x40 kernel/tracepoint.c:304
> trace_event_reg+0x167/0x320 kernel/trace/trace_events.c:305
> perf_trace_event_reg kernel/trace/trace_event_perf.c:122 [inline]
> perf_trace_event_init kernel/trace/trace_event_perf.c:197 [inline]
> perf_trace_init+0x4ef/0xab0 kernel/trace/trace_event_perf.c:221
> perf_tp_event_init+0x7d/0xf0 kernel/events/core.c:7953
> perf_try_init_event+0xc9/0x1f0 kernel/events/core.c:9179
> perf_init_event kernel/events/core.c:9217 [inline]
> perf_event_alloc+0x1c5b/0x2a00 kernel/events/core.c:9481
> SYSC_perf_event_open+0x842/0x2f10 kernel/events/core.c:9936
> SyS_perf_event_open+0x39/0x50 kernel/events/core.c:9822
> entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> -> #1 (tracepoints_mutex){+.+.}:
> lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
> __mutex_lock_common kernel/locking/mutex.c:756 [inline]
> __mutex_lock+0x16f/0x19d0 kernel/locking/mutex.c:893
> mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
> tracepoint_probe_register_prio+0xa0/0x9a0 kernel/tracepoint.c:279
> tracepoint_probe_register+0x2a/0x40 kernel/tracepoint.c:304
> trace_event_reg+0x167/0x320 kernel/trace/trace_events.c:305
> perf_trace_event_reg kernel/trace/trace_event_perf.c:122 [inline]
> perf_trace_event_init kernel/trace/trace_event_perf.c:197 [inline]
> perf_trace_init+0x4ef/0xab0 kernel/trace/trace_event_perf.c:221
> perf_tp_event_init+0x7d/0xf0 kernel/events/core.c:7953
> perf_try_init_event+0xc9/0x1f0 kernel/events/core.c:9179
> perf_init_event kernel/events/core.c:9217 [inline]
> perf_event_alloc+0x1c5b/0x2a00 kernel/events/core.c:9481
> SYSC_perf_event_open+0x842/0x2f10 kernel/events/core.c:9936
> SyS_perf_event_open+0x39/0x50 kernel/events/core.c:9822
> entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> -> #0 (event_mutex){+.+.}:
> check_prevs_add kernel/locking/lockdep.c:2031 [inline]
> validate_chain kernel/locking/lockdep.c:2473 [inline]
> __lock_acquire+0x3374/0x4770 kernel/locking/lockdep.c:3500
> lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
> __mutex_lock_common kernel/locking/mutex.c:756 [inline]
> __mutex_lock+0x16f/0x19d0 kernel/locking/mutex.c:893
> mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
> perf_trace_destroy+0x28/0x100 kernel/trace/trace_event_perf.c:234
> tp_perf_event_destroy+0x15/0x20 kernel/events/core.c:7937
> _free_event+0x3bd/0x10f0 kernel/events/core.c:4105
> put_event+0x24/0x30 kernel/events/core.c:4188
> perf_mmap_close+0x60d/0x1010 kernel/events/core.c:5147
> remove_vma+0xb4/0x1b0 mm/mmap.c:172
> remove_vma_list mm/mmap.c:2475 [inline]
> do_munmap+0x82a/0xdf0 mm/mmap.c:2714
> mmap_region+0x59e/0x15a0 mm/mmap.c:1631
> do_mmap+0x6a1/0xd50 mm/mmap.c:1468
> do_mmap_pgoff include/linux/mm.h:2200 [inline]
> vm_mmap_pgoff+0x1de/0x280 mm/util.c:333
> SYSC_mmap_pgoff mm/mmap.c:1518 [inline]
> SyS_mmap_pgoff+0x23b/0x5f0 mm/mmap.c:1476
> SYSC_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
> SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:91
> entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> other info that might help us debug this:
>
> Chain exists of:
> event_mutex --> &sb->s_type->i_mutex_key#5 --> &mm->mmap_sem
>
> Possible unsafe locking scenario:
>
> CPU0 CPU1
> ---- ----
> lock(&mm->mmap_sem);
> lock(&sb->s_type->i_mutex_key#5);
> lock(&mm->mmap_sem);
> lock(event_mutex);
>
> *** DEADLOCK ***
>
> 1 lock held by syz-executor1/3061:
> #0: (&mm->mmap_sem){++++}, at: [<ffffffff819448b8>]
> vm_mmap_pgoff+0x198/0x280 mm/util.c:331
>
> stack backtrace:
> CPU: 0 PID: 3061 Comm: syz-executor1 Not tainted 4.14.0-rc7-next-20171103+
> #10
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
> __dump_stack lib/dump_stack.c:17 [inline]
> dump_stack+0x194/0x257 lib/dump_stack.c:53
> print_circular_bug+0x503/0x710 kernel/locking/lockdep.c:1271
> check_prev_add+0x8b1/0x1580 kernel/locking/lockdep.c:1914
> check_prevs_add kernel/locking/lockdep.c:2031 [inline]
> validate_chain kernel/locking/lockdep.c:2473 [inline]
> __lock_acquire+0x3374/0x4770 kernel/locking/lockdep.c:3500
> lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
> __mutex_lock_common kernel/locking/mutex.c:756 [inline]
> __mutex_lock+0x16f/0x19d0 kernel/locking/mutex.c:893
> mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
> perf_trace_destroy+0x28/0x100 kernel/trace/trace_event_perf.c:234
> tp_perf_event_destroy+0x15/0x20 kernel/events/core.c:7937
> _free_event+0x3bd/0x10f0 kernel/events/core.c:4105
> put_event+0x24/0x30 kernel/events/core.c:4188
> perf_mmap_close+0x60d/0x1010 kernel/events/core.c:5147
> remove_vma+0xb4/0x1b0 mm/mmap.c:172
> remove_vma_list mm/mmap.c:2475 [inline]
> do_munmap+0x82a/0xdf0 mm/mmap.c:2714
> mmap_region+0x59e/0x15a0 mm/mmap.c:1631
> do_mmap+0x6a1/0xd50 mm/mmap.c:1468
> do_mmap_pgoff include/linux/mm.h:2200 [inline]
> vm_mmap_pgoff+0x1de/0x280 mm/util.c:333
> SYSC_mmap_pgoff mm/mmap.c:1518 [inline]
> SyS_mmap_pgoff+0x23b/0x5f0 mm/mmap.c:1476
> SYSC_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
> SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:91
> entry_SYSCALL_64_fastpath+0x1f/0xbe
> RIP: 0033:0x447c89
> RSP: 002b:00007f49b0affbd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
> RAX: ffffffffffffffda RBX: 00007f49b0b006cc RCX: 0000000000447c89
> RDX: 0000000000000000 RSI: 0000000000003000 RDI: 0000000020007000
> RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000032 R11: 0000000000000246 R12: 0000000000000000
> R13: 0000000000000000 R14: 00007f49b0b009c0 R15: 00007f49b0b00700


Set of mutexes looks somewhat similar to "possible deadlock in
lru_add_drain_all":
https://groups.google.com/forum/#!topic/syzkaller-bugs/uZpWvxGXT68
But this one has perf/tracepoints involved.
+Peter


> ---
> This bug is generated by a dumb bot. It may contain errors.
> See https://goo.gl/tpsmEJ for details.
> Direct all questions to syzkaller@xxxxxxxxxxxxxxxxx
> Please credit me with: Reported-by: syzbot <syzkaller@xxxxxxxxxxxxxxxx>
>
> syzbot will keep track of this bug report.
> Once a fix for this bug is committed, please reply to this email with:
> #syz fix: exact-commit-title
> To mark this as a duplicate of another syzbot report, please reply with:
> #syz dup: exact-subject-of-another-report
> If it's a one-off invalid bug report, please reply with:
> #syz invalid
> Note: if the crash happens again, it will cause creation of a new bug
> report.
> Note: all commands must start from beginning of the line.
>
> --
> You received this message because you are subscribed to the Google Groups
> "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to syzkaller-bugs+unsubscribe@xxxxxxxxxxxxxxxxx
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/syzkaller-bugs/001a114aa95c553065055d6307e2%40google.com.
> For more options, visit https://groups.google.com/d/optout.