Re: [PATCH v2 2/6] scsi: libsas: Add sas_ata_device_link_abort()

From: John Garry
Date: Fri Sep 02 2022 - 12:19:58 EST


Hi Damien,


But the pm8001 manual and current driver indicate that the
OPC_INB_SATA_ABORT command should be sent after read log ext when
handling NCQ error, regardless of an autopsy. I send OPC_INB_SATA_ABORT
in ata_eh_reset() -> pm8001_I_T_nexus_reset() -> pm8001_send_abort_all()
You lost me: ata_eh_recover() will call ata_eh_reset() only if the ATA_EH_RESET
action flag is set. So are you saying that even though it is not needed, you
still need to set ATA_EH_RESET for pm8001 ?

As below, it was the only location I found suitable to call pm8001_send_abort_all().

However I am not really sure it is required now. For pm8001 NCQ error handling we require 2x steps:
- read log ext
- Send OPC_INB_SATA_ABORT - we do this in pm8001_send_abort_all()

pm8001_send_abort_all() sends OPC_INB_SATA_ABORT in "device abort all" mode, meaning any IO in the HBA is aborted for the device. But we are also earlier in EH sending OPC_INB_SATA_ABORT for individual IOs in sas_eh_handle_sas_errors() -> sas_scsi_find_task() -> pm8001_abort_task() -> sas_execute_internal_abort_single() -> ... send_abort_task()

So I don't think that the pm8001_send_abort_all() call has any effect, as we're already aborting any outstanding IO earlier.

Admittedly the order of the 2x steps is different, but OPC_INB_SATA_ABORT does not send any protocol message to the disk, so would not affect anything subsequently read with read log ext.

Having said all that, it may be wise to still send pm8001_send_abort_all()..

Have you had a chance to consider all this yet?

I am now starting to think that it is not necessary to call pm8001_send_abort_all(), and instead rely only on sas_eh_handle_sas_errors() -> sas_scsi_find_task() -> pm8001_abort_task() -> sas_execute_internal_abort_single() -> ... -> send_abort_task() to abort each outstanding IO. Then we would not have the issue of forcing the reset in $subject (to lead to calling pm8001_send_abort_all()).

This idea could simply be tested by stubbing pm8001_send_abort_all()
(and dropping the |= ATA_EH_RESET in $subject).

Thanks,
John