Re: [PATCH V2] scsi: libsas: Directly kick-off EH when ATA device fell off

From: John Garry
Date: Tue Dec 20 2022 - 03:44:23 EST


On 19/12/2022 23:00, Damien Le Moal wrote:
But it is expected that ata_qc_issue() is called with the host lock
held (and that the lock stays held).

I think that the reason libsas drops the lock is because some LLDD
queuecommand CBs call task_done() in some error paths. If we kept the
lock held, then we could have a deadlock, for example:

sas_ata_qc_issue (has lock) -> lldd_execute_task() =
pm8001_queue_command() -> task_done() = sas_ata_task_done() -> grab host
lock => deadlock.

That should be easily solvable using a workqueue for doing task_done(), no?
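
To make the two cases above concrete, here is a rough user-space sketch, not kernel code. Every name in it (model_lock, model_port, model_issue and so on) is made up for illustration and none of it is libsas or libata API. The lock is modelled as a flag so that a second acquisition is counted rather than spinning forever, and a plain counter stands in for the workqueue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are libsas APIs. */
struct model_lock {
	bool held;
};

struct model_port {
	struct model_lock lock;
	int deferred;	/* queued completions (stand-in for a workqueue) */
	int completed;	/* completions that ran successfully */
	int deadlocks;	/* attempts to take the lock while already held */
};

static bool model_trylock(struct model_lock *l)
{
	if (l->held)
		return false;	/* a real spinlock would spin forever here */
	l->held = true;
	return true;
}

static void model_unlock(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Completion handler: needs the lock, like task_done() in the chain above. */
static void model_task_done(struct model_port *p)
{
	if (!model_trylock(&p->lock)) {
		p->deadlocks++;
		return;
	}
	p->completed++;
	model_unlock(&p->lock);
}

/* Error path that completes inline while the issuer still holds the lock. */
void model_execute_inline(struct model_port *p)
{
	model_task_done(p);
}

/* Error path that only queues the completion for later. */
void model_execute_deferred(struct model_port *p)
{
	p->deferred++;
}

/* Runs queued completions; called only after the lock has been dropped. */
static void model_run_deferred(struct model_port *p)
{
	while (p->deferred > 0) {
		p->deferred--;
		model_task_done(p);
	}
}

/* Issue path: holds the lock across the execute callback. */
void model_issue(struct model_port *p, void (*execute)(struct model_port *))
{
	if (!model_trylock(&p->lock))
		return;
	execute(p);
	model_unlock(&p->lock);
	model_run_deferred(p);
}
```

With model_execute_inline the model counts one deadlock and no completion; with model_execute_deferred the completion runs once, after the lock is released. A real workqueue would run the completion in another context, which this model does not try to capture.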


I don't see why we cannot always just return an error code directly from the lldd_execute_task CB; we would then end up calling scsi_done() directly. But I am suspicious as to why it is not already done this way.
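
As a rough illustration of the error-return idea, here is a user-space sketch under the assumption that the execute callback never completes a task on its own failure path. The names (sketch_port, sketch_execute, sketch_issue) are hypothetical and are not the libsas or LLDD interfaces; the lock is again just a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space model; not the libsas or libata API. */
struct sketch_port {
	bool lock_held;
	int failed_issues;	/* failures handled by the issue path itself */
};

/*
 * Execute callback: on failure it only reports an error code and never
 * calls back into a completion handler that would want the lock.
 */
static int sketch_execute(struct sketch_port *p, bool device_gone)
{
	(void)p;
	return device_gone ? -ENODEV : 0;
}

/* Issue path: keeps the lock held across the callback. */
int sketch_issue(struct sketch_port *p, bool device_gone)
{
	int ret;

	assert(!p->lock_held);
	p->lock_held = true;

	ret = sketch_execute(p, device_gone);
	if (ret)
		p->failed_issues++;	/* caller finishes the command itself */

	p->lock_held = false;
	return ret;
}
```

Since nothing in the failure path re-enters a lock-taking handler, the issuer can hold the lock for the whole call in this model. Whether every LLDD error path can be converted this way is exactly the open question.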

Looking at the code history, this fiddling with the ap->lock actually looks related to commit 312d3e56119a4bc5c36a96818f87f650c069ddc2 ("[SCSI] libsas: remove ata_port.lock management duties from lldds"). I will check that further.

Thanks,
John