Re: [PATCH v2] driver core: Fix bus_type.match() error handling

From: Bart Van Assche
Date: Sun Aug 21 2022 - 17:40:28 EST


On 8/20/22 04:48, Guenter Roeck wrote:
INFO: task init:283 blocked for more than 122 seconds.
Tainted: G N 6.0.0-rc1-00303-g963a70bee588 #3
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:init state:D stack: 0 pid: 283 ppid: 1 flags:0x00000000
__schedule from schedule+0x70/0x118
schedule from scsi_remove_host+0x178/0x1c4
scsi_remove_host from usb_stor_disconnect+0x40/0xe8
usb_stor_disconnect from usb_unbind_interface+0x78/0x274
usb_unbind_interface from device_release_driver_internal+0x1a4/0x230
device_release_driver_internal from bus_remove_device+0xd0/0x100
bus_remove_device from device_del+0x174/0x3ec
device_del from usb_disable_device+0xcc/0x178
usb_disable_device from usb_disconnect+0xcc/0x274
usb_disconnect from usb_disconnect+0x98/0x274
usb_disconnect from usb_remove_hcd+0xd0/0x16c
usb_remove_hcd from host_stop+0x38/0xa8
host_stop from ci_hdrc_remove+0x40/0x134
ci_hdrc_remove from platform_remove+0x24/0x54
platform_remove from device_release_driver_internal+0x1a4/0x230
device_release_driver_internal from bus_remove_device+0xd0/0x100
bus_remove_device from device_del+0x174/0x3ec
device_del from platform_device_del.part.0+0x10/0x78
platform_device_del.part.0 from platform_device_unregister+0x18/0x28
platform_device_unregister from ci_hdrc_remove_device+0xc/0x24
ci_hdrc_remove_device from ci_hdrc_imx_remove+0x28/0xfc
ci_hdrc_imx_remove from device_shutdown+0x178/0x230
device_shutdown from __do_sys_reboot+0x168/0x258
__do_sys_reboot from ret_fast_syscall+0x0/0x1c

Hi Guenter,

Thank you for sharing this information. I think this deadlock is the result of calling scsi_remove_host() while a reference on /dev/sda (taken by the mount() system call) is still being held.

It seems wrong to me that ci_hdrc_imx_shutdown() calls ci_hdrc_imx_remove(). I think the shutdown callback should only do the minimum that is required to prepare for shutdown instead of calling scsi_remove_host() indirectly.
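
To illustrate the distinction I have in mind, here is a toy user-space model. It is only a sketch and not driver code: struct toy_ctrl, toy_remove() and toy_shutdown() are hypothetical names that exist nowhere in the kernel, and the "would block" return value stands in for the sleep inside scsi_remove_host() seen in the trace above.

```c
#include <stdbool.h>

/* Toy model only; none of these types or functions are kernel APIs. */
struct toy_ctrl {
	bool running;        /* controller is still processing transfers */
	bool children_bound; /* child devices (e.g. a storage host) still registered */
	int open_refs;       /* references held by users, e.g. a mounted filesystem */
};

/*
 * Full teardown: unregisters the child devices. In this model that cannot
 * complete while references remain, so the function returns false at the
 * point where the real code path would sleep.
 */
static bool toy_remove(struct toy_ctrl *c)
{
	if (c->open_refs > 0)
		return false; /* would block waiting for the last reference */
	c->children_bound = false;
	c->running = false;
	return true;
}

/*
 * Minimal shutdown: only quiesce the controller. It leaves the child
 * devices alone and therefore never waits on outstanding references.
 */
static void toy_shutdown(struct toy_ctrl *c)
{
	c->running = false;
}
```

With open_refs set to 1, toy_remove() reports that it would block, while toy_shutdown() completes and leaves the child devices registered. That is the behavior I would expect from a shutdown callback.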

That being said, the patch series "scsi: core: Call blk_mq_free_tag_set() earlier" will probably have to be reverted because of the following deadlock reported by syzbot: https://lore.kernel.org/linux-scsi/000000000000b5187d05e6c08086@xxxxxxxxxx/

Thanks,

Bart.