Re: [PATCH v1] scsi: ufs: core: Process abort completed command in MCQ mode

From: Bart Van Assche
Date: Thu Nov 02 2023 - 15:36:53 EST



On 11/1/23 21:07, hoyoung seo wrote:
When the UFS host receives any error, the UFS driver executes the error handler.
If the error handler attempts re-init, it must abort and organize unprocessed
requests.
The above operation is the same for both MCQ and legacy mode.
However, in MCQ mode, if case B or C of the following spec text applies,
the OCS is updated to ABORTED, which differs from legacy mode.

B. If the command is in the Submission Queue and not issued to the device yet,
the host controller will mark the command to be skipped in the Submission Queue.
The host controller will post to the Completion Queue to update the OCS field
with ‘ABORTED’.
C. If the command is issued to the device already but there is no response yet
from the device, the host software driver issue the Abort task management function
to the device for that command.
Then the host driver set SQRTCy.ICU as ‘1’ to initiate the clean up the hardware
resources. The host controller will post to the Completion Queue to update the OCS
field with ‘ABORTED’.

Unlike legacy mode, this phenomenon causes unintended behavior. (As shown in the log below)

[1: kworker/u20:2:23157] ufshcd_try_to_abort_task: cmd pending in the device. tag = 9
[3: kworker/u20:2:23157] Aborting tag 9 / CDB 0x2a succeeded
[4: swapper/4: 0] sd 0:0:0:0: [sda] tag#9 UNKNOWN(0x2003) Result: hostbyte=0x05 driverbyte=DRIVER_OK cmd_age=0s // DID_ABORT
[4: swapper/4: 0] sd 0:0:0:0: [sda] tag#9 CDB: opcode=0x2a 2a 00 00 d3 02 00 00 01 00 00
[4: swapper/4: 0] I/O error, dev sda, sector 110628864 op 0x1:(WRITE) flags 0x800 phys_seg 256 prio class 2


For commands whose abort operation has completed in MCQ mode, the OCS has
been updated to ABORTED, so it seems that the command will be retransmitted
only if its result is set to DID_REQUEUE.

Hi Hoyoung,

Thank you for having provided this clarification - this really helps.

Regarding (B): I would appreciate it if this patch would be reworked
such that no new 'if (is_mcq_enabled(hba))' statements are introduced.
Has it been considered to modify ufshcd_mcq_sqe_search() such that it
sets the SCSI result to DID_REQUEUE << 16 instead of modifying
ufshcd_transfer_rsp_status()?
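
To make the suggestion above concrete, here is a small user-space sketch of
how the host byte sits in the SCSI result word. The DID_* values and the
host byte extraction match include/scsi/scsi_status.h. struct model_cmd and
model_mark_skipped() are hypothetical stand-ins for illustration only; this
is not the actual body of ufshcd_mcq_sqe_search().

```c
#include <stdint.h>

/* Host byte values as defined in include/scsi/scsi_status.h. */
enum {
	DID_OK      = 0x00,
	DID_ABORT   = 0x05,	/* what the log above reports (hostbyte=0x05) */
	DID_REQUEUE = 0x0d,	/* requeue the command without failing it */
};

/* Same extraction as the kernel's host_byte() macro. */
static inline uint8_t host_byte(uint32_t result)
{
	return (result >> 16) & 0xff;
}

/* Hypothetical stand-in for the fields of interest in struct scsi_cmnd. */
struct model_cmd {
	uint32_t result;
	int in_sq_not_issued;	/* case (B): still in the SQ, not yet issued */
};

/*
 * Hypothetical helper: when the SQ entry search finds a command that has
 * not been issued yet, record DID_REQUEUE in the host byte at that point
 * instead of translating OCS ABORTED later in the completion path.
 */
static void model_mark_skipped(struct model_cmd *cmd)
{
	if (cmd->in_sq_not_issued)
		cmd->result = (uint32_t)DID_REQUEUE << 16;
}
```

With this encoding the result word for a skipped command would be
0x000d0000 rather than the 0x00050000 (DID_ABORT) that leads to the I/O
error shown in the log.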

Regarding (C): SQRTCy.ICU is only set by ufshcd_mcq_sq_cleanup() and the
only caller of that function is ufshcd_clear_cmd().
There is only one function that calls ufshcd_clear_cmd() for SCSI
commands, namely ufshcd_eh_device_reset_handler(). The latter function
should not set the SCSI result code. All it should do is to abort all
pending commands. The SCSI error handler will resubmit all aborted commands.
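
As a rough user-space model of that division of labor (not the real
ufshcd_eh_device_reset_handler(); struct model_lrb and
model_reset_clear_pending() are made-up names for illustration), the reset
path would only release the pending commands and leave the result word
alone:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an outstanding command; not struct ufshcd_lrb. */
struct model_lrb {
	unsigned int lun;
	bool pending;		/* still owned by the host controller */
	uint32_t result;	/* SCSI result word, never written below */
};

/*
 * Clear every pending command that targets @lun and return how many were
 * cleared. The result word is deliberately left untouched: in this model,
 * deciding what happens to the command next is left to the caller (the
 * SCSI error handler in the real driver).
 */
static size_t model_reset_clear_pending(struct model_lrb *lrbs, size_t n,
					unsigned int lun)
{
	size_t cleared = 0;

	for (size_t i = 0; i < n; i++) {
		if (lrbs[i].pending && lrbs[i].lun == lun) {
			lrbs[i].pending = false;
			cleared++;
		}
	}
	return cleared;
}
```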

Thanks,

Bart.