Deadlock in rfcomm_sk_state_change

From: Ilia Mirkin
Date: Sun Jun 19 2022 - 23:28:51 EST


Hi all,

It appears that this deadlock has been reported a few times before:

BZ here: https://bugzilla.kernel.org/show_bug.cgi?id=215746
Patch here: https://lore.kernel.org/all/20211004180734.434511-1-desmondcheongzx@xxxxxxxxx/

A Google search turns up a few other instances too.

This is the deadlock I ran into, on a ThinkPad T420s with kernel
v5.18.5. I never hit it with the kernel I previously ran on this
machine, v5.7.8.

[ 1513.564806] task:krfcommd state:D stack:14824 pid: 571 ppid: 2 flags:0x00004000
[ 1513.564833] Call Trace:
[ 1513.564838] <TASK>
[ 1513.564843] __schedule+0x27a/0x1050
[ 1513.564861] schedule+0x46/0xb0
[ 1513.564867] schedule_preempt_disabled+0xc/0x20
[ 1513.564875] __mutex_lock.constprop.0+0x284/0x4b0
[ 1513.564884] rfcomm_run+0x14d/0x1340
[ 1513.564895] ? swake_up_all+0xe0/0xe0
[ 1513.564908] ? rfcomm_check_accept+0xd0/0xd0
[ 1513.564919] kthread+0xd4/0x100
[ 1513.564930] ? kthread_complete_and_exit+0x20/0x20
[ 1513.564940] ret_from_fork+0x22/0x30
[ 1513.564955] </TASK>
[ 1513.564968] task:bluetoothd state:D stack:13248 pid: 4917 ppid: 1 flags:0x00000004
[ 1513.564987] Call Trace:
[ 1513.564990] <TASK>
[ 1513.564994] __schedule+0x27a/0x1050
[ 1513.565004] ? eventfd_read+0xda/0x280
[ 1513.565020] schedule+0x46/0xb0
[ 1513.565028] __lock_sock+0x74/0xc0
[ 1513.565042] ? destroy_sched_domains_rcu+0x30/0x30
[ 1513.565055] lock_sock_nested+0x3f/0x50
[ 1513.565065] rfcomm_sk_state_change+0x20/0x100
[ 1513.565078] __rfcomm_dlc_close+0x8d/0x1a0
[ 1513.565088] rfcomm_dlc_close+0x66/0x90
[ 1513.565098] __rfcomm_sock_close+0x30/0xf0
[ 1513.565109] rfcomm_sock_shutdown+0x4a/0x80
[ 1513.565122] rfcomm_sock_release+0x22/0x90
[ 1513.565133] __sock_release+0x38/0xb0
[ 1513.565146] sock_close+0xc/0x20
[ 1513.565157] __fput+0x87/0x240
[ 1513.565172] task_work_run+0x57/0x90
[ 1513.565190] exit_to_user_mode_prepare+0x108/0x110
[ 1513.565206] syscall_exit_to_user_mode+0x1d/0x50
[ 1513.565224] ? __x64_sys_close+0x8/0x40
[ 1513.565239] do_syscall_64+0x69/0xc0
[ 1513.565253] ? __x64_sys_close+0x8/0x40
[ 1513.565271] ? do_syscall_64+0x69/0xc0
[ 1513.565276] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 1513.565283] RIP: 0033:0x7f818c753883
[ 1513.565287] RSP: 002b:00007ffd9222fe78 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[ 1513.565292] RAX: 0000000000000000 RBX: 000056007127afd0 RCX: 00007f818c753883
[ 1513.565295] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000016
[ 1513.565297] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 1513.565299] R10: 0000000000000026 R11: 0000000000000246 R12: 0000000000000000
[ 1513.565302] R13: 0000000000000001 R14: 000056007125af84 R15: 000056007125af9c
[ 1513.565306] </TASK>

However, it doesn't look like the patch has been applied (at least not
in Linus's current tree), nor does there appear to be any movement on
the Bugzilla issue. I'm happy to provide any additional information;
just let me know what you need.

Cheers,

Ilia Mirkin
imirkin@xxxxxxxxxxxx