Re: Report in ata_scsi_port_error_handler()

From: Damien Le Moal
Date: Wed Feb 16 2022 - 02:12:58 EST


On 2/16/22 13:16, Byungchul Park wrote:
> [ 2.051040] ===================================================
> [ 2.051406] DEPT: Circular dependency has been detected.
> [ 2.051730] 5.17.0-rc1-00014-gcf3441bb2012 #2 Tainted: G W
> [ 2.051991] ---------------------------------------------------
> [ 2.051991] summary
> [ 2.051991] ---------------------------------------------------
> [ 2.051991] *** DEADLOCK ***
> [ 2.051991]
> [ 2.051991] context A
> [ 2.051991] [S] (unknown)(&(&ap->eh_wait_q)->dmap:0)
> [ 2.051991] [W] __raw_spin_lock_irq(&host->lock:0)
> [ 2.051991] [E] event(&(&ap->eh_wait_q)->dmap:0)
> [ 2.051991]
> [ 2.051991] context B
> [ 2.051991] [S] __raw_spin_lock_irqsave(&host->lock:0)
> [ 2.051991] [W] wait(&(&ap->eh_wait_q)->dmap:0)
> [ 2.051991] [E] spin_unlock(&host->lock:0)

Sleeping with a spinlock held would be triggering warnings already, so
these reports seem bogus to me.

In any case, trying to make sense of these reports is not easy, to say
the least. If you think there is really a bug, then please send a clear
explanation of your findings rather than a raw tool output that did not
go through human analysis.

> [ 2.051991]
> [ 2.051991] [S]: start of the event context
> [ 2.051991] [W]: the wait blocked
> [ 2.051991] [E]: the event not reachable
> [ 2.051991] ---------------------------------------------------
> [ 2.051991] context A's detail
> [ 2.051991] ---------------------------------------------------
> [ 2.051991] context A
> [ 2.051991] [S] (unknown)(&(&ap->eh_wait_q)->dmap:0)
> [ 2.051991] [W] __raw_spin_lock_irq(&host->lock:0)
> [ 2.051991] [E] event(&(&ap->eh_wait_q)->dmap:0)
> [ 2.051991]
> [ 2.051991] [S] (unknown)(&(&ap->eh_wait_q)->dmap:0):
> [ 2.051991] (N/A)
> [ 2.051991]
> [ 2.051991] [W] __raw_spin_lock_irq(&host->lock:0):
> [ 2.051991] [<ffffffff8178d355>] ata_sff_flush_pio_task+0x45/0xc0
> [ 2.051991] stacktrace:
> [ 2.051991] _raw_spin_lock_irq+0x58/0x90
> [ 2.051991] ata_sff_flush_pio_task+0x45/0xc0
> [ 2.051991] ata_exec_internal_sg+0x422/0x690
> [ 2.051991] ata_do_set_mode+0x5ee/0xac0
> [ 2.051991] ata_set_mode+0xfc/0x110
> [ 2.051991] ata_eh_recover+0x1061/0x1360
> [ 2.051991] ata_do_eh+0x3f/0xa0
> [ 2.051991] ata_scsi_port_error_handler+0x432/0x740
> [ 2.051991] ata_scsi_error+0x94/0xc0
> [ 2.051991] scsi_error_handler+0x8d/0x3a0
> [ 2.051991] kthread+0xe3/0x110
> [ 2.051991] ret_from_fork+0x22/0x30
> [ 2.051991]
> [ 2.051991] [E] event(&(&ap->eh_wait_q)->dmap:0):
> [ 2.051991] [<ffffffff810baa53>] __wake_up_common+0x93/0x1a0
> [ 2.051991] stacktrace:
> [ 2.051991] dept_event+0x12b/0x1f0
> [ 2.051991] __wake_up_common+0xb0/0x1a0
> [ 2.051991] __wake_up_common_lock+0x65/0x90
> [ 2.051991] ata_scsi_port_error_handler+0x67a/0x740
> [ 2.051991] ata_scsi_error+0x94/0xc0
> [ 2.051991] scsi_error_handler+0x8d/0x3a0
> [ 2.051991] kthread+0xe3/0x110
> [ 2.051991] ret_from_fork+0x22/0x30
> [ 2.051991] ---------------------------------------------------
> [ 2.051991] context B's detail
> [ 2.051991] ---------------------------------------------------
> [ 2.051991] context B
> [ 2.051991] [S] __raw_spin_lock_irqsave(&host->lock:0)
> [ 2.051991] [W] wait(&(&ap->eh_wait_q)->dmap:0)
> [ 2.051991] [E] spin_unlock(&host->lock:0)
> [ 2.051991]
> [ 2.051991] [S] __raw_spin_lock_irqsave(&host->lock:0):
> [ 2.051991] [<ffffffff8178047c>] ata_port_wait_eh+0x6c/0xc0
> [ 2.051991] stacktrace:
> [ 2.051991] _raw_spin_lock_irqsave+0x82/0xa0
> [ 2.051991] ata_port_wait_eh+0x6c/0xc0
> [ 2.051991] ata_port_probe+0x1f/0x30
> [ 2.051991] async_port_probe+0x27/0x50
> [ 2.051991] async_run_entry_fn+0x21/0xa0
> [ 2.051991] process_one_work+0x317/0x640
> [ 2.051991] worker_thread+0x44/0x410
> [ 2.051991] kthread+0xe3/0x110
> [ 2.051991] ret_from_fork+0x22/0x30
> [ 2.051991]
> [ 2.051991] [W] wait(&(&ap->eh_wait_q)->dmap:0):
> [ 2.051991] [<ffffffff810bb017>] prepare_to_wait+0x47/0xd0
> [ 2.051991] stacktrace:
> [ 2.051991] ata_port_wait_eh+0x52/0xc0
> [ 2.051991] ata_port_probe+0x1f/0x30
> [ 2.051991] async_port_probe+0x27/0x50
> [ 2.051991] async_run_entry_fn+0x21/0xa0
> [ 2.051991] process_one_work+0x317/0x640
> [ 2.051991] worker_thread+0x44/0x410
> [ 2.051991] kthread+0xe3/0x110
> [ 2.051991] ret_from_fork+0x22/0x30
> [ 2.051991]
> [ 2.051991] [E] spin_unlock(&host->lock:0):
> [ 2.051991] [<ffffffff8178046e>] ata_port_wait_eh+0x5e/0xc0
> [ 2.051991] stacktrace:
> [ 2.051991] _raw_spin_unlock_irqrestore+0x35/0x80
> [ 2.051991] ata_port_wait_eh+0x5e/0xc0
> [ 2.051991] ata_port_probe+0x1f/0x30
> [ 2.051991] async_port_probe+0x27/0x50
> [ 2.051991] async_run_entry_fn+0x21/0xa0
> [ 2.051991] process_one_work+0x317/0x640
> [ 2.051991] worker_thread+0x44/0x410
> [ 2.051991] kthread+0xe3/0x110
> [ 2.051991] ret_from_fork+0x22/0x30
> [ 2.051991] ---------------------------------------------------
> [ 2.051991] information that might be helpful
> [ 2.051991] ---------------------------------------------------
> [ 2.051991] CPU: 1 PID: 53 Comm: scsi_eh_1 Tainted: G W 5.17.0-rc1-00014-gcf3441bb2012 #2
> [ 2.051991] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> [ 2.051991] Call Trace:
> [ 2.051991] <TASK>
> [ 2.051991] dump_stack_lvl+0x44/0x57
> [ 2.051991] print_circle+0x384/0x510
> [ 2.051991] ? print_circle+0x510/0x510
> [ 2.051991] cb_check_dl+0x58/0x60
> [ 2.051991] bfs+0xdc/0x1b0
> [ 2.051991] add_dep+0x94/0x120
> [ 2.051991] do_event.isra.22+0x284/0x300
> [ 2.051991] ? __wake_up_common+0x93/0x1a0
> [ 2.051991] dept_event+0x12b/0x1f0
> [ 2.051991] __wake_up_common+0xb0/0x1a0
> [ 2.051991] __wake_up_common_lock+0x65/0x90
> [ 2.051991] ata_scsi_port_error_handler+0x67a/0x740
> [ 2.051991] ? trace_hardirqs_on+0x38/0xe0
> [ 2.051991] ? scsi_eh_get_sense+0x150/0x150
> [ 2.051991] ata_scsi_error+0x94/0xc0
> [ 2.051991] scsi_error_handler+0x8d/0x3a0
> [ 2.051991] ? _raw_spin_unlock_irqrestore+0x63/0x80
> [ 2.051991] ? scsi_eh_get_sense+0x150/0x150
> [ 2.051991] kthread+0xe3/0x110
> [ 2.051991] ? kthread_complete_and_exit+0x20/0x20
> [ 2.051991] ret_from_fork+0x22/0x30
> [ 2.051991] </TASK>


--
Damien Le Moal
Western Digital Research