Freezable workqueue blocks non-freezable workqueue during the system resume process

From: Peter Chen
Date: Mon Feb 22 2016 - 22:25:26 EST


Hi Tejun Heo and Florian Mickler,

I have a question that during the system resume process, the freezable
workqueue can be thawed if there is a non-freezable workqueue is
blocked (At uninterruptable state)?

My case like below, I have a USB OTG (Micro-AB) cable is at USB
Micro-B port, and there is a USB driver on it, and un-plug this
cable can wake up system from the suspend. There is a non-freezable
workqueue ci_otg will be scheduled after disconnecting OTG cable,
and in its worker ci_otg_work, it will try to disconnect USB drive,
and flush disk information. But flush disk information is done by
freezable workqueue writeback, it seeems workqueue writeback is
never got chance to execute, the workqueue ci_otg is waiting there
forever, and the system is deadlock.

Both change workqueue ci_otg as freezable or change workqueue writeback
as non-freezable can fix this problem.

The call stack like below:

[ 546.987379] writeback S c07de74c 0 12 2 0x00000000
[ 546.993804] Backtrace:
[ 546.996307] [<c07de4fc>] (__schedule) from [<c07dec6c>] (schedule+0x48/0xa0)
[ 547.003370] r10:ef14bc80 r9:ef14ca00 r8:00000000 r7:c0045c90 r6:ef14bc80 r5:ef14bc98
[ 547.011325] r4:ef164000
[ 547.013907] [<c07dec24>] (schedule) from [<c0045f20>] (rescuer_thread+0x290/0x308)
[ 547.021490] r4:00000000 r3:00000008
[ 547.025136] [<c0045c90>] (rescuer_thread) from [<c004bab4>] (kthread+0xdc/0xf8)
[ 547.032459] r10:00000000 r9:00000000 r8:00000000 r7:c0045c90 r6:ef14bc80 r5:ef1526c0
[ 547.040412] r4:00000000
[ 547.042993] [<c004b9d8>] (kthread) from [<c000fef0>] (ret_from_fork+0x14/0x24)
[ 547.050229] r7:00000000 r6:00000000 r5:c004b9d8 r4:ef1526c0
[ 555.178869] kworker/u2:13 D c07de74c 0 826 2 0x00000000

[ 555.185310] Workqueue: ci_otg ci_otg_work
[ 555.189353] Backtrace:
[ 555.191849] [<c07de4fc>] (__schedule) from [<c07dec6c>] (schedule+0x48/0xa0)
[ 555.198912] r10:ee471ba0 r9:00000000 r8:00000000 r7:00000002 r6:ee470000 r5:ee471ba4
[ 555.206867] r4:ee470000
[ 555.209453] [<c07dec24>] (schedule) from [<c07e2fc4>] (schedule_timeout+0x15c/0x1e0)
[ 555.217212] r4:7fffffff r3:edc2b000
[ 555.220862] [<c07e2e68>] (schedule_timeout) from [<c07df6c8>] (wait_for_common+0x94/0x144)
[ 555.229140] r8:00000000 r7:00000002 r6:ee470000 r5:ee471ba4 r4:7fffffff
[ 555.235980] [<c07df634>] (wait_for_common) from [<c07df790>] (wait_for_completion+0x18/0x1c)
[ 555.244430] r10:00000001 r9:c0b5563c r8:c0042e48 r7:ef086000 r6:eea4372c r5:ef131b00
[ 555.252383] r4:00000000
[ 555.254970] [<c07df778>] (wait_for_completion) from [<c0043cb8>] (flush_work+0x19c/0x234)
[ 555.263177] [<c0043b1c>] (flush_work) from [<c0043fac>] (flush_delayed_work+0x48/0x4c)
[ 555.271106] r8:ed5b5000 r7:c0b38a3c r6:eea439cc r5:eea4372c r4:eea4372c
[ 555.277958] [<c0043f64>] (flush_delayed_work) from [<c00eae18>] (bdi_unregister+0x84/0xec)
[ 555.286236] r4:eea43520 r3:20000153
[ 555.289885] [<c00ead94>] (bdi_unregister) from [<c02c2154>] (blk_cleanup_queue+0x180/0x29c)
[ 555.298250] r5:eea43808 r4:eea43400
[ 555.301909] [<c02c1fd4>] (blk_cleanup_queue) from [<c0417914>] (__scsi_remove_device+0x48/0xb8)
[ 555.310623] r7:00000000 r6:20000153 r5:ededa950 r4:ededa800
[ 555.316403] [<c04178cc>] (__scsi_remove_device) from [<c0415e90>] (scsi_forget_host+0x64/0x68)
[ 555.325028] r5:ededa800 r4:ed5b5000
[ 555.328689] [<c0415e2c>] (scsi_forget_host) from [<c0409828>] (scsi_remove_host+0x78/0x104)
[ 555.337054] r5:ed5b5068 r4:ed5b5000
[ 555.340709] [<c04097b0>] (scsi_remove_host) from [<c04cdfcc>] (usb_stor_disconnect+0x50/0xb4)
[ 555.349247] r6:ed5b56e4 r5:ed5b5818 r4:ed5b5690 r3:00000008
[ 555.355025] [<c04cdf7c>] (usb_stor_disconnect) from [<c04b3bc8>] (usb_unbind_interface+0x78/0x25c)
[ 555.363997] r8:c13919b4 r7:edd3c000 r6:edd3c020 r5:ee551c68 r4:ee551c00 r3:c04cdf7c
[ 555.371892] [<c04b3b50>] (usb_unbind_interface) from [<c03dc248>] (__device_release_driver+0x8c/0x118)
[ 555.381213] r10:00000001 r9:edd90c00 r8:c13919b4 r7:ee551c68 r6:c0b546e0 r5:c0b5563c
[ 555.389167] r4:edd3c020
[ 555.391752] [<c03dc1bc>] (__device_release_driver) from [<c03dc2fc>] (device_release_driver+0x28/0x34)
[ 555.401071] r5:edd3c020 r4:edd3c054
[ 555.404721] [<c03dc2d4>] (device_release_driver) from [<c03db304>] (bus_remove_device+0xe0/0x110)
[ 555.413607] r5:edd3c020 r4:ef17f04c
[ 555.417253] [<c03db224>] (bus_remove_device) from [<c03d8128>] (device_del+0x114/0x21c)
[ 555.425270] r6:edd3c028 r5:edd3c020 r4:ee551c00 r3:00000000
[ 555.431045] [<c03d8014>] (device_del) from [<c04b1560>] (usb_disable_device+0xa4/0x1e8)
[ 555.439061] r8:edd3c000 r7:eded8000 r6:00000000 r5:00000001 r4:ee551c00
[ 555.445906] [<c04b14bc>] (usb_disable_device) from [<c04a8e54>] (usb_disconnect+0x74/0x224)
[ 555.454271] r9:edd90c00 r8:ee551000 r7:ee551c68 r6:ee551c9c r5:ee551c00 r4:00000001
[ 555.462156] [<c04a8de0>] (usb_disconnect) from [<c04a8fb8>] (usb_disconnect+0x1d8/0x224)
[ 555.470259] r10:00000001 r9:edd90000 r8:ee471e2c r7:ee551468 r6:ee55149c r5:ee551400
[ 555.478213] r4:00000001
[ 555.480797] [<c04a8de0>] (usb_disconnect) from [<c04ae5ec>] (usb_remove_hcd+0xa0/0x1ac)
[ 555.488813] r10:00000001 r9:ee471eb0 r8:00000000 r7:ef3d9500 r6:eded810c r5:eded80b0
[ 555.496765] r4:eded8000
[ 555.499351] [<c04ae54c>] (usb_remove_hcd) from [<c04d4158>] (host_stop+0x28/0x64)
[ 555.506847] r6:eeb50010 r5:eded8000 r4:eeb51010
[ 555.511563] [<c04d4130>] (host_stop) from [<c04d09b8>] (ci_otg_work+0xc4/0x124)
[ 555.518885] r6:00000001 r5:eeb50010 r4:eeb502a0 r3:c04d4130
[ 555.524665] [<c04d08f4>] (ci_otg_work) from [<c00454f0>] (process_one_work+0x194/0x420)
[ 555.532682] r6:ef086000 r5:eeb502a0 r4:edc44480
[ 555.537393] [<c004535c>] (process_one_work) from [<c00457b0>] (worker_thread+0x34/0x514)
[ 555.545496] r10:edc44480 r9:ef086000 r8:c0b1a100 r7:ef086034 r6:00000088 r5:edc44498
[ 555.553450] r4:ef086000
[ 555.556032] [<c004577c>] (worker_thread) from [<c004bab4>] (kthread+0xdc/0xf8)
[ 555.563268] r10:00000000 r9:00000000 r8:00000000 r7:c004577c r6:edc44480 r5:eddc15c0
[ 555.571221] r4:00000000
[ 555.573804] [<c004b9d8>] (kthread) from [<c000fef0>] (ret_from_fork+0x14/0x24)
[ 555.581040] r7:00000000 r6:00000000 r5:c004b9d8 r4:eddc15c0
[ 555.586803] kworker/u2:14 S c07de74c 0 827 2 0x00000000
[ 555.593232] Backtrace:
[ 555.595739] [<c07de4fc>] (__schedule) from [<c07dec6c>] (schedule+0x48/0xa0)
[ 555.602806] r10:edc44200 r9:00000000 r8:00000000 r7:ef086034 r6:edc44200 r5:ef086000
[ 555.610761] r4:ed72a000
[ 555.613345] [<c07dec24>] (schedule) from [<c0045834>] (worker_thread+0xb8/0x514)
[ 555.620755] r4:ef086000 r3:ed72bef0
[ 555.624399] [<c004577c>] (worker_thread) from [<c004bab4>] (kthread+0xdc/0xf8)
[ 555.631635] r10:00000000 r9:00000000 r8:00000000 r7:c004577c r6:edc44200 r5:eddc15c0
[ 555.639589] r4:00000000
[ 555.642170] [<c004b9d8>] (kthread) from [<c000fef0>] (ret_from_fork+0x14/0x24)
[ 555.649406] r7:00000000 r6:00000000 r5:c004b9d8 r4:eddc15c0
[ 555.655168]
[ 555.655168] Showing all locks held in the system:
[ 555.661424] 5 locks held by sh/694:
[ 555.664926] #0: (sb_writers#6){.+.+.+}, at: [<c0118f0c>] __sb_start_write+0xb0/0xbc
[ 555.672921] #1: (&of->mutex){+.+.+.}, at: [<c0187ad4>] kernfs_fop_write+0x60/0x1fc
[ 555.680805] #2: (s_active#211){.+.+.+}, at: [<c0187adc>] kernfs_fop_write+0x68/0x1fc
[ 555.688873] #3: (pm_mutex){+.+.+.}, at: [<c0079d10>] pm_suspend+0xd0/0x2cc
[ 555.696052] #4: (&dev->mutex){......}, at: [<c03e927c>] dpm_complete+0xc0/0x1b0
[ 555.703684] 7 locks held by kworker/u2:13/826:
[ 555.708140] #0: ("%s""ci_otg"){++++.+}, at: [<c0045484>] process_one_work+0x128/0x420
[ 555.716277] #1: ((&ci->work)){+.+.+.}, at: [<c0045484>] process_one_work+0x128/0x420
[ 555.724317] #2: (usb_bus_list_lock){+.+.+.}, at: [<c04ae5e4>] usb_remove_hcd+0x98/0x1ac
[ 555.732626] #3: (&dev->mutex){......}, at: [<c04a8e28>] usb_disconnect+0x48/0x224
[ 555.740403] #4: (&dev->mutex){......}, at: [<c04a8e28>] usb_disconnect+0x48/0x224
[ 555.748179] #5: (&dev->mutex){......}, at: [<c03dc2f4>] device_release_driver+0x20/0x34
[ 555.756487] #6: (&shost->scan_mutex){+.+.+.}, at: [<c04097d0>] scsi_remove_host+0x20/0x104
[ 555.765062]
[ 555.766567] =============================================
[ 555.766567]
[ 555.773467] Showing busy workqueues and worker pools:
[ 555.778544] workqueue events_freezable: flags=0x4
[ 555.783371] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=0/0
[ 555.789316] delayed: thermal_zone_device_check
[ 555.794191] workqueue pm: flags=0x4
[ 555.797700] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=0/0
[ 555.803636] delayed: pm_runtime_work
[ 555.807629] workqueue writeback: flags=0x4e
[ 555.811894] pwq 2: cpus=0 flags=0x4 nice=0 active=0/0
[ 555.817209] delayed: wb_workfn BAR(826)
[ 555.821493] workqueue usb_hub_wq: flags=0x4
[ 555.825693] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=0/0
[ 555.831627] delayed: hub_event
[ 555.835119] workqueue ci_otg: flags=0x2000a
[ 555.839320] pwq 2: cpus=0 flags=0x4 nice=0 active=1/1
[ 555.844623] in-flight: 826:ci_otg_work
[ 555.848810] pool 2: cpus=0 flags=0x4 nice=0 hung=0s workers=15 idle: 814 733 700 824 6 822 820 819 816 817 815 813 827 825

Thanks you!

--

Best Regards,
Peter Chen