RE: [PATCH v1] scsi: ufs: core: Process abort completed command in MCQ mode

From: hoyoung seo
Date: Thu Nov 02 2023 - 00:07:51 EST


> -----Original Message-----
> From: Bart Van Assche <bvanassche@xxxxxxx>
> Sent: Thursday, November 2, 2023 1:39 AM
> To: SEO HOYOUNG <hy50.seo@xxxxxxxxxxx>; linux-scsi@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; alim.akhtar@xxxxxxxxxxx; avri.altman@xxxxxxx;
> jejb@xxxxxxxxxxxxx; martin.petersen@xxxxxxxxxx; beanhuo@xxxxxxxxxx;
> kwangwon.min@xxxxxxxxxxx; kwmad.kim@xxxxxxxxxxx; sh425.lee@xxxxxxxxxxx;
> sc.suh@xxxxxxxxxxx; quic_nguyenb@xxxxxxxxxxx; cpgs@xxxxxxxxxxx
> Subject: Re: [PATCH v1] scsi: ufs: core: Process abort completed command
> in MCQ mode
>
> On 11/1/23 01:45, SEO HOYOUNG wrote:
> > In MCQ mode, the case where OCS is updated to aborted is as follows
> > 1. when abort processing is completed
> > 2. When a duplicate command occurs
>
> What is a "duplicate command"? The UFSHCI driver guarantees that each SCSI
> command has a unique tag.
>
> > In case of 1 situation, cmd should be re-request.
>
> It should be resubmitted by the SCSI error handler. The UFSHCI driver does
> not have to request this explicitly. See also the code at the end of
> scmd_eh_abort_handler().
>
> > case OCS_ABORTED:
> > - result |= DID_ABORT << 16;
> > + if (cqe)
> > + eec = le32_to_cpu(cqe->status) & MASK_EEC;
> > +
> > + if (is_mcq_enabled(hba) && !eec)
> > + result |= DID_REQUEUE << 16;
> > + else
> > + result |= DID_ABORT << 16;
> > break;
>
> I don't think this change is necessary. Additionally, introducing
> different behavior for MCQ compared to legacy mode in this code path is
> suspicious. Why should how commands are queued affect how aborts are
> processed?
>
> Thanks,
>
> Bart.

when the ufs host receives any error, the ufs driver executes the error-hander.
If the error-hendler attempts re-init, it must abort and organize unprocessed
requests.
The above operation is the same for both MCQ/legacy mode.
However, in the MCQ mode, if b or c is included in the following specs,
the OCS is updated to aborted, which is different from the legacy mode.

B. If the command is in the Submission Queue and not issued to the device yet,
the host controller will mark the command to be skipped in the Submission Queue.
The host controller will post to the Completion Queue to update the OCS field
with ‘ABORTED’.
C. If the command is issued to the device already but there is no response yet
from the device, the host software driver issue the Abort task management function
to the device for that command.
Then the host driver set SQRTCy.ICU as ‘1’ to initiate the clean up the hardware
resources. The host controller will post to the Completion Queue to update the OCS
field with ‘ABORTED’.

Unlike legacy mode, this phenomenon causes unintended behavior. (As shown in the log below)

[1: kworker/u20:2:23157] ufshcd_try_to_abort_task: cmd pending in the device. tag = 9
[3: kworker/u20:2:23157] Aborting tag 9 / CDB 0x2a succeeded
[4: swapper/4: 0] sd 0:0:0:0: [sda] tag#9 UNKNOWN(0x2003) Result: hostbyte=0x05 driverbyte=DRIVER_OK cmd_age=0s // DID_ABORT
[4: swapper/4: 0] sd 0:0:0:0: [sda] tag#9 CDB: opcode=0x2a 2a 00 00 d3 02 00 00 01 00 00
[4: swapper/4: 0] I/O error, dev sda, sector 110628864 op 0x1:(WRITE) flags 0x800 phys_seg 256 prio class 2


For commands that have completed the abort operation in MCQ mode,
since OCS has been updated to aborted, it seems that it will be retransmitted only
when it is made to REQUEUE.

Thanks.

SEO.