Re: WARNING in loop_add

From: Yu Kuai
Date: Wed Nov 02 2022 - 08:34:37 EST




On 2022/11/02 15:02, Wei Chen wrote:
Dear Linux developers,

The bug persists in the upstream Linux v6.0.0 4fe89d07dcc2 and the
latest commit Linux v5.19.76 4f5365f77018.

[ 68.027515][ C0] ======================================================
[ 68.027977][ C0] WARNING: possible circular locking dependency detected
[ 68.028436][ C0] 6.0.0 #35 Not tainted
[ 68.028704][ C0] ------------------------------------------------------
[ 68.029145][ C0] a.out/6625 is trying to acquire lock:
[ 68.029530][ C0] ffff88801be0c0d0 (&q->queue_lock){..-.}-{2:2}, at: throtl_pending_timer_fn+0xf6/0x1020
[ 68.030213][ C0]
[ 68.030213][ C0] but task is already holding lock:
[ 68.030688][ C0] ffffc90000007be0 ((&sq->pending_timer)){+.-.}-{0:0}, at: call_timer_fn+0xbb/0x210
[ 68.031300][ C0]
[ 68.031300][ C0] which lock already depends on the new lock.
[ 68.031300][ C0]
[ 68.031976][ C0]
[ 68.031976][ C0] the existing dependency chain (in reverse order) is:
[ 68.032548][ C0]
[ 68.032548][ C0] -> #2 ((&sq->pending_timer)){+.-.}-{0:0}:
[ 68.033086][ C0] lock_acquire+0x17f/0x430
[ 68.033418][ C0] del_timer_sync+0x104/0x380
[ 68.033764][ C0] throtl_pd_free+0x15/0x40
[ 68.034100][ C0] blkcg_deactivate_policy+0x31c/0x530
[ 68.034496][ C0] blk_throtl_exit+0x86/0x120
[ 68.034838][ C0] blkcg_init_queue+0x25a/0x2d0
[ 68.035184][ C0] __alloc_disk_node+0x2ce/0x590
[ 68.035537][ C0] __blk_mq_alloc_disk+0x11b/0x1e0
[ 68.035907][ C0] loop_add+0x340/0x9b0
[ 68.036225][ C0] loop_control_ioctl+0x108/0x770
[ 68.036587][ C0] __se_sys_ioctl+0xfb/0x170
[ 68.036927][ C0] do_syscall_64+0x3d/0x90
[ 68.037252][ C0] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 68.037668][ C0]
[ 68.037668][ C0] -> #1 (&blkcg->lock){....}-{2:2}:
[ 68.038152][ C0] lock_acquire+0x17f/0x430
[ 68.038497][ C0] _raw_spin_lock+0x2a/0x40
[ 68.038827][ C0] blkg_create+0x949/0x10a0
[ 68.039165][ C0] blkcg_init_queue+0xb4/0x2d0
[ 68.039517][ C0] __alloc_disk_node+0x2ce/0x590
[ 68.039868][ C0] __blk_mq_alloc_disk+0x11b/0x1e0
[ 68.040232][ C0] floppy_alloc_disk+0x54/0x350
[ 68.040585][ C0] do_floppy_init+0x1b1/0x1d27
[ 68.040927][ C0] async_run_entry_fn+0xa6/0x400
[ 68.041281][ C0] process_one_work+0x83c/0x11a0
[ 68.041646][ C0] worker_thread+0xa6c/0x1290
[ 68.041980][ C0] kthread+0x266/0x300
[ 68.042274][ C0] ret_from_fork+0x1f/0x30
[ 68.042592][ C0]
[ 68.042592][ C0] -> #0 (&q->queue_lock){..-.}-{2:2}:
[ 68.043074][ C0] check_prevs_add+0x4f5/0x5d30
[ 68.043433][ C0] __lock_acquire+0x4432/0x6080
[ 68.043783][ C0] lock_acquire+0x17f/0x430
[ 68.044113][ C0] _raw_spin_lock_irq+0xae/0xf0
[ 68.044465][ C0] throtl_pending_timer_fn+0xf6/0x1020
[ 68.044867][ C0] call_timer_fn+0xf5/0x210
[ 68.045189][ C0] __run_timers+0x762/0x970
[ 68.045534][ C0] run_timer_softirq+0x63/0xf0
[ 68.045890][ C0] __do_softirq+0x372/0x783
[ 68.046223][ C0] __irq_exit_rcu+0xcf/0x150
[ 68.046557][ C0] irq_exit_rcu+0x5/0x20
[ 68.046868][ C0] sysvec_apic_timer_interrupt+0x91/0xb0
[ 68.047281][ C0] asm_sysvec_apic_timer_interrupt+0x16/0x20
[ 68.047704][ C0] should_fail+0x169/0x4f0
[ 68.048022][ C0] should_failslab+0x5/0x20
[ 68.048346][ C0] kmem_cache_alloc_lru+0x75/0x2f0
[ 68.048718][ C0] new_inode_pseudo+0x81/0x1d0
[ 68.049055][ C0] new_inode+0x25/0x1d0
[ 68.049355][ C0] __debugfs_create_file+0x146/0x550
[ 68.049723][ C0] blk_mq_debugfs_register_hctx+0x21c/0x660
[ 68.050166][ C0] blk_mq_debugfs_register+0x2e0/0x470
[ 68.050553][ C0] blk_register_queue+0x24f/0x3c0
[ 68.050912][ C0] device_add_disk+0x55a/0xc00
[ 68.051257][ C0] loop_add+0x71a/0x9b0
[ 68.051566][ C0] loop_control_ioctl+0x108/0x770
[ 68.051933][ C0] __se_sys_ioctl+0xfb/0x170
[ 68.052266][ C0] do_syscall_64+0x3d/0x90
[ 68.052591][ C0] entry_SYSCALL_64_after_hwframe+0x63/0xcd

It seems to me this is a false positive: lockdep is confused by locks that
belong to different devices. Judging by the call chains above, #0 and #2
come from loop (loop_add), while #1 comes from floppy (floppy_alloc_disk).
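
To illustrate the point about locks from different devices, here is a
minimal Python sketch. It is not how lockdep is implemented. It assumes, as
a simplification, that queue_lock and pending_timer are per-device while
blkcg->lock is shared, and it reads the (held, acquired) pairs off the three
chain entries in the report. The names events, instance and has_cycle are
made up for this example.

```python
from collections import defaultdict

def has_cycle(edges):
    """Return True if the (held, acquired) pairs form a directed cycle."""
    graph = defaultdict(set)
    for held, acquired in edges:
        graph[held].add(acquired)

    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the DFS stack, finished
    state = defaultdict(int)

    def visit(node):
        state[node] = GREY
        for nxt in graph[node]:
            if state[nxt] == GREY:          # back edge: cycle found
                return True
            if state[nxt] == WHITE and visit(nxt):
                return True
        state[node] = BLACK
        return False

    return any(state[n] == WHITE and visit(n) for n in list(graph))

# (device, lock class held, lock class acquired), read off the report.
events = [
    ("floppy", "queue_lock", "blkcg_lock"),     # #1: blkg_create via floppy_alloc_disk
    ("loop", "blkcg_lock", "pending_timer"),    # #2: del_timer_sync via loop_add
    ("loop", "pending_timer", "queue_lock"),    # #0: throtl_pending_timer_fn
]

# Assumption for this sketch: these two classes have one instance per device.
PER_DEVICE = {"queue_lock", "pending_timer"}

def instance(device, lock_class):
    """Name a concrete lock instance; shared locks keep the class name."""
    return f"{device}.{lock_class}" if lock_class in PER_DEVICE else lock_class

# Class-level view: all instances of a class collapse into one node.
class_edges = [(held, acq) for _, held, acq in events]
# Instance-level view: per-device locks are kept apart.
instance_edges = [(instance(d, held), instance(d, acq)) for d, held, acq in events]

print(has_cycle(class_edges))     # True
print(has_cycle(instance_edges))  # False
```

This only shows why class-level tracking can report a cycle that no single
set of lock instances forms. It does not by itself prove the report is
harmless.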

Thanks,
Kuai