Re: [PATCH v2 01/17] ibmvfc: add vhost fields and defaults for MQ enablement

From: Tyrel Datwyler
Date: Tue Dec 08 2020 - 17:38:33 EST


On 12/7/20 3:56 AM, Hannes Reinecke wrote:
> On 12/4/20 3:26 PM, Brian King wrote:
>> On 12/2/20 11:27 AM, Tyrel Datwyler wrote:
>>> On 12/2/20 7:14 AM, Brian King wrote:
>>>> On 12/1/20 6:53 PM, Tyrel Datwyler wrote:
>>>>> Introduce several new vhost fields for managing MQ state of the adapter
>>>>> as well as initial defaults for MQ enablement.
>>>>>
>>>>> Signed-off-by: Tyrel Datwyler <tyreld@xxxxxxxxxxxxx>
>>>>> ---
>>>>>   drivers/scsi/ibmvscsi/ibmvfc.c |  9 ++++++++-
>>>>>   drivers/scsi/ibmvscsi/ibmvfc.h | 13 +++++++++++--
>>>>>   2 files changed, 19 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
>>>>> index 42e4d35e0d35..f1d677a7423d 100644
>>>>> --- a/drivers/scsi/ibmvscsi/ibmvfc.c
>>>>> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c
>>>>> @@ -5161,12 +5161,13 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
>>>>>       }
>>>>>         shost->transportt = ibmvfc_transport_template;
>>>>> -    shost->can_queue = max_requests;
>>>>> +    shost->can_queue = (max_requests / IBMVFC_SCSI_HW_QUEUES);
>>>>
>>>> This doesn't look right. can_queue is the SCSI host queue depth, not the MQ
>>>> queue depth.
>>>
>>> Our max_requests is the total number of commands allowed across all queues.
>>> From what I understand, can_queue is the total number of commands in flight
>>> allowed for each hw queue.
>>>
>>>          /*
>>>           * In scsi-mq mode, the number of hardware queues supported by the LLD.
>>>           *
>>>           * Note: it is assumed that each hardware queue has a queue depth of
>>>           * can_queue. In other words, the total queue depth per host
>>>           * is nr_hw_queues * can_queue. However, for when host_tagset is set,
>>>           * the total queue depth is can_queue.
>>>           */
>>>
>>> We currently don't use the host wide shared tagset.
>>
>> Ok. I missed that bit... In that case, since we allocate only 100 event
>> structs by default, if we slice that across IBMVFC_SCSI_HW_QUEUES (16) queues,
>> we end up with only about 6 commands that can be outstanding per queue,
>> which is going to really hurt performance... I'd suggest bumping up
>> IBMVFC_MAX_REQUESTS_DEFAULT from 100 to 1000 as a starting point.
>>
> Before doing that I'd rather use the host-wide shared tagset.
> Increasing the number of requests will increase the memory footprint of the
> driver (as each request will be statically allocated).
>

In the case where we use host-wide tags, how do I determine the queue depth per
hardware queue? Is it hypothetically can_queue, or is it (can_queue /
nr_hw_queues)? We want to allocate an event pool per queue, which made sense
without host-wide tags since the queue depth per hw queue is exactly can_queue.
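
To make the arithmetic concrete, here is a rough sketch of how I read the
comment quoted above. The helper names are hypothetical; nothing below exists
in ibmvfc or the SCSI midlayer. It only restates the two cases from that
comment plus the division from the patch.

```c
#include <stdbool.h>

/*
 * Hypothetical helpers for illustration only; not ibmvfc or midlayer code.
 *
 * Total host queue depth as described by the quoted comment:
 *   - without host_tagset: nr_hw_queues * can_queue
 *   - with host_tagset:    can_queue
 */
static inline unsigned int total_host_depth(unsigned int can_queue,
					    unsigned int nr_hw_queues,
					    bool host_tagset)
{
	return host_tagset ? can_queue : can_queue * nr_hw_queues;
}

/*
 * The split done in the patch: can_queue = max_requests / nr_hw_queues.
 * Integer division, so any remainder is dropped.
 */
static inline unsigned int per_queue_can_queue(unsigned int max_requests,
					       unsigned int nr_hw_queues)
{
	return max_requests / nr_hw_queues;
}
```

With the numbers from this thread (max_requests = 100, 16 queues) the split
gives 6 per queue and 96 in total; at 1000 it gives 62 per queue and 992 in
total. The sketch deliberately says nothing about the per-queue depth in the
host-wide case, since that is the open question.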

-Tyrel