Re: [PATCH RFC net-next 2/4] net: page_pool: avoid calling no-op externals when possible

From: Alexander Lobakin
Date: Fri Jun 30 2023 - 08:30:49 EST


From: Alexander H Duyck <alexander.duyck@xxxxxxxxx>
Date: Thu, 29 Jun 2023 09:45:26 -0700

> On Thu, 2023-06-29 at 17:23 +0200, Alexander Lobakin wrote:
>> Turned out page_pool_put{,_full}_page() can burn quite a bunch of cycles
>> even when on DMA-coherent platforms (like x86) with no active IOMMU or
>> swiotlb, just for the call ladder.
>> Indeed, it's

[...]

>> @@ -341,6 +345,12 @@ static bool page_pool_dma_map(struct page_pool *pool, struct page *page)
>>
>> page_pool_set_dma_addr(page, dma);
>>
>> + if ((pool->p.flags & PP_FLAG_DMA_MAYBE_SYNC) &&
>> + dma_need_sync(pool->p.dev, dma)) {
>> + pool->p.flags |= PP_FLAG_DMA_SYNC_DEV;
>> + pool->p.flags &= ~PP_FLAG_DMA_MAYBE_SYNC;
>> + }
>> +
>> if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV)
>> page_pool_dma_sync_for_device(pool, page, pool->p.max_len);
>>
>
> I am pretty sure the logic is flawed here. The problem is
> dma_needs_sync depends on the DMA address being used. In the worst case
> scenario we could have a device that has something like a 32b DMA
> address space on a system with over 4GB of memory. In such a case the
> higher addresses would need to be synced because they will go off to a
> swiotlb bounce buffer while the lower addresses wouldn't.
>
> If you were to store a flag like this it would have to be generated per
> page.

I know when DMA might need syncing :D That's the point of this shortcut:
if at least one page needs syncing, I disable it for the whole pool.
It's a "better safe than sorry".
Using a per-page flag involves more changes and might hurt some
scenarios/setups. For example, non-coherent systems, where you always
need to do syncs. The idea was to give some improvement when possible,
otherwise just fallback to what we have today.

Thanks,
Olek