Re: [PATCH RFC v2 03/18] scsi: core: Implement reserved command handling

From: Damien Le Moal
Date: Mon Jun 20 2022 - 05:02:43 EST


On 6/20/22 17:27, Hannes Reinecke wrote:
> On 6/16/22 10:41, Damien Le Moal wrote:
>> On 2022/06/16 17:24, John Garry wrote:
>>> On 16/06/2022 03:47, Damien Le Moal wrote:
>>>>>> so going backward several years... That internal tag for ATA does not
>>>>>> need to be reserved since this command is always used when the drive is
>>>>>> idle and no other NCQ commands are on-going.
>>>>>
>>>>> So do you mean that ATA_TAG_INTERNAL qc is used for other commands
>>>>> apart from internal commands?
>>>>
>>>> No. It is used only for internal commands. What I meant to say is that
>>>> currently, internal commands are issued only on device scan, device
>>>> revalidate and error handling. All of these phases are done with the
>>>> device under EH with the issuing path stopped and all commands
>>>> completed,
>>>
>>> If I want to allocate a request for an ATA internal command then could I
>>> use 1x from the regular tags? I didn't think that this was possible as I
>>> thought that all tags may be outstanding when EH kicks in. I need to
>>> double check it.
>>
>> When EH kicks in, the drive is in error mode and all commands are back to the
>> host. From there, you need to get the drive out of error mode with read log 10h
>> and then internal commands can be issued if needed. Then the aborted commands
>> that are not in error are restarted.
>>
>> For the non-error case (revalidate), ap->ops->qc_defer() will make sure that NCQ
>> and non-NCQ commands are never mixed. Since all internal commands are non-NCQ,
>> when an internal command is issued, there are necessarily no other commands
>> ongoing, but 32 NCQ commands may be waiting, without any free tag. The internal
>> command being non-NCQ can still proceed since it does not need a real device tag.
>>
>> The joy of ATA...
>>
>>> Even if it were true, not using a reserved tag for ATA internal command
>>> makes things more tricky as this command requires special handling for
>>> scsi blk_mq_ops and there is no easy way to identify the command as
>>> reserved (to know special handling is required).
>>
>> Yes. Having the ATA_TAG_INTERNAL tag as a reserved tag is fine. But from the
>> above, you can see that this is not really needed at all to make things work.
>> The management of ATA_TAG_INTERNAL as a reserved tag is really about getting your
>> API to simplify the code.
>>
>> What I am thinking is that with your patches as is, it seems that we can never
>> actually reserve a real tag for ATA to do internal NCQ commands... We do not
>> really need that for now though, apart maybe for speeding up device revalidate.
>> Every time that one runs, one can see a big spike in read/write IO latencies
>> because of the queue drain it causes.
>>
> Hmm. But doesn't that mean that we can reserve one tag, _and_ set the
> queue depth to '32'?
> We'll need to fiddle with the tag map on completion (cf my previous
> mail), but then we might need to do that anyway if we separate out
> ATA_QCFLAG_INTERNAL ...

Reserving a tag is not enough. As explained, even if I can get a tag for a
qc, I need a proper req to safely issue an NCQ command (because of the
potential need to delay and requeue even if we have a free tag!).
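
To illustrate why a free tag alone does not guarantee that a command can be
issued, here is a minimal sketch of the "never mix NCQ and non-NCQ" rule that
ap->ops->qc_defer() is described as enforcing earlier in this thread. This is
not the libata code: struct port_state, enum defer_decision and
should_defer() are hypothetical names made up for the example, and the model
ignores everything except the in-flight counts.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of what is in flight on one port. */
struct port_state {
	unsigned int ncq_inflight;	/* NCQ commands currently on the device */
	bool non_ncq_inflight;		/* one non-NCQ command is on the device */
};

enum defer_decision { ISSUE_NOW, DEFER_AND_REQUEUE };

/*
 * Decide whether a new command can go to the device right now.
 * Modeled rule: NCQ and non-NCQ commands are never in flight together,
 * and at most one non-NCQ command is in flight at a time.
 */
static enum defer_decision should_defer(const struct port_state *p, bool is_ncq)
{
	/* The two classes must never already be mixed. */
	assert(!(p->ncq_inflight > 0 && p->non_ncq_inflight));

	if (is_ncq)
		return p->non_ncq_inflight ? DEFER_AND_REQUEUE : ISSUE_NOW;

	/* A non-NCQ command needs the device to be completely idle. */
	if (p->ncq_inflight > 0 || p->non_ncq_inflight)
		return DEFER_AND_REQUEUE;

	return ISSUE_NOW;
}
```

In this model the DEFER_AND_REQUEUE outcome is the case that needs a real
request behind the qc: something has to hold the command until the device
can accept it, whether or not a tag was available.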

So reserving a tag/req to be able to do NCQ, at the cost of max qd being 31,
works for that. We could keep max qd at 32 by creating one more "fake" tag
and having a request for it, that is, having the fake tag visible to the
block layer as a reserved tag, as John's series is doing. But for the
reserved tags, we would still need an effective hardware tag (qc->hw_tag)
when issuing the commands. For that, we can reuse the tag of one of the
failed commands.
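
As a rough sketch of that idea, assuming 32 hardware tags plus one extra
block-layer-only tag: regular tags map to themselves, and the fake tag
borrows the hardware tag of a failed command. HW_QUEUE_DEPTH,
FAKE_RESERVED_TAG, effective_hw_tag() and the failed_mask bitmask are all
hypothetical names for illustration, not existing libata or block layer
symbols.

```c
#include <assert.h>

#define HW_QUEUE_DEPTH		32		/* hardware NCQ tags 0..31 */
#define FAKE_RESERVED_TAG	HW_QUEUE_DEPTH	/* hypothetical block-layer-only tag */

/*
 * Return the hardware tag to use for a command with block layer tag blk_tag.
 * failed_mask has bit N set if the command that used hardware tag N failed
 * and is back with the host, so its tag can be borrowed.
 * Returns -1 if the fake tag is used and there is no tag to borrow.
 */
static int effective_hw_tag(unsigned int blk_tag, unsigned int failed_mask)
{
	assert(blk_tag <= FAKE_RESERVED_TAG);

	/* Regular tags are used as is. */
	if (blk_tag < HW_QUEUE_DEPTH)
		return (int)blk_tag;

	/* Fake reserved tag: borrow the lowest tag of a failed command. */
	for (int t = 0; t < HW_QUEUE_DEPTH; t++) {
		if (failed_mask & (1u << t))
			return t;
	}

	return -1;
}
```

Picking the lowest failed tag is an arbitrary choice here; any tag whose
command is known to be back with the host would do in this model.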

>
> Cheers,
>
> Hannes


--
Damien Le Moal
Western Digital Research