Re: [RFC RESEND 16/16] nvme-pci: use blk_rq_dma_map() for NVMe SGL

From: Chaitanya Kulkarni
Date: Tue Mar 05 2024 - 11:47:03 EST


On 3/5/24 08:39, Chaitanya Kulkarni wrote:
> On 3/5/24 08:08, Jens Axboe wrote:
>> On 3/5/24 8:51 AM, Keith Busch wrote:
>>> On Tue, Mar 05, 2024 at 01:18:47PM +0200, Leon Romanovsky wrote:
>>>> @@ -236,7 +236,9 @@ struct nvme_iod {
>>>> unsigned int dma_len; /* length of single DMA segment mapping */
>>>> dma_addr_t first_dma;
>>>> dma_addr_t meta_dma;
>>>> - struct sg_table sgt;
>>>> + struct dma_iova_attrs iova;
>>>> + dma_addr_t dma_link_address[128];
>>>> + u16 nr_dma_link_address;
>>>> union nvme_descriptor list[NVME_MAX_NR_ALLOCATIONS];
>>>> };
>>> That's quite a lot of space to add to the iod. We preallocate one for
>>> every request, and there could be millions of them.
>> Yeah, that's just a complete non-starter. As far as I can tell, this
>> ends up adding 1052 bytes per request. Doing the quick math on my test
>> box (24 drives), that's just a smidge over 3GB of extra memory. That's
>> not going to work, not even close.
>>
> I don't have any intent to use more space for the nvme_iod than what
> it is now. I'll trim down the iod structure and send out a patch soon with
> this fixed to continue the discussion here on this thread ...
>
> -ck
>
>

For the final version, once the DMA API discussion has concluded, I plan
to use the iod_mempool to allocate nvme_iod->dma_link_address. However, I
won't wait for that, and will send out an updated version with a trimmed
nvme_iod size.

If you guys have any other comments, please let me know, or we can
continue the discussion on this thread once I post the new version of
this patch ...

Thanks a lot, Keith and Jens, for looking into it ...

-ck