Re: [PATCH net-next v4 1/5] page_pool: frag API support for 32-bit arch with 64-bit DMA

From: Yunsheng Lin
Date: Thu Jun 15 2023 - 03:30:12 EST


On 2023/6/15 1:07, Jakub Kicinski wrote:
> On Wed, 14 Jun 2023 19:42:29 +0800 Yunsheng Lin wrote:
>> On 2023/6/14 12:09, Jakub Kicinski wrote:
>>> On Mon, 12 Jun 2023 21:02:52 +0800 Yunsheng Lin wrote:
>>>> Currently page_pool_alloc_frag() is not supported in 32-bit
>>>> arch with 64-bit DMA, which seems to be quite common, see
>>>> [1], which means driver may need to handle it when using
>>>> page_pool_alloc_frag() API.
>>>>
>>>> In order to simplify the driver's work for supporting page
>>>> frag, this patch allows page_pool_alloc_frag() to call
>>>> page_pool_alloc_pages() to return a big page frag without
>>>
>>> it returns an entire (potentially compound) page, not a frag.
>>> AFAICT
>>
>> As driver calls page_pool_alloc_frag(), and page_pool_alloc_frag()
>> calls page_pool_alloc_pages(), page_pool_alloc_pages() is hidden
>> inside page_pool_alloc_frag(), so it is a big page frag from driver's
>> point of view:)
>
> frag​ment : a part broken off, detached, or incomplete
> a small part broken or separated off something.
>
> "big fragment" is definitely not the whole thing.

Alexander susgested something as below:
non-fragmented - legacy page pool w/o page frags
mono-frag - after this page page pool w/o frags
fragmented - before/after this patch w/ frags

https://patchwork.kernel.org/project/netdevbpf/patch/20230529092840.40413-3-linyunsheng@xxxxxxxxxx/#25366535

Does 'mono-frag' make sense to you? or any better name in
mind?

>
>>>> page splitting because of overlap issue between pp_frag_count
>>>> and dma_addr_upper in 'struct page' for those arches.
>>>
>>> These two lines seem to belong in the first paragraph,
>>>
>>>> As page_pool_create() with PP_FLAG_PAGE_FRAG is supported in
>>>
>>> "is" -> "will now be"
>>>
>>>> 32-bit arch with 64-bit DMA now, mlx5 calls page_pool_create()
>>>> with PP_FLAG_PAGE_FRAG and manipulate the page->pp_frag_count
>>>> directly using the page_pool_defrag_page(), so add a checking
>>>> for it to aoivd writing to page->pp_frag_count that may not
>>>> exist in some arch.
>>>
>>> This paragraph needs some proof reading :(
>>
>> Perhaps something like below?
>> mlx5 calls page_pool_create() with PP_FLAG_PAGE_FRAG and is
>> not using the frag API, as PP_FLAG_PAGE_FRAG checking for arch
>> with PAGE_POOL_DMA_USE_PP_FRAG_COUNT being true will now be
>> removed in this patch, so add back the checking of
>> PAGE_POOL_DMA_USE_PP_FRAG_COUNT for mlx5 driver to retain the
>> old behavior, which is to avoid mlx5e_page_release_fragmented()
>> calling page_pool_defrag_page() to write to page->pp_frag_count.
>
> That's a 7-line long, single sentence. Not much better.

I had the same feeling when I was typing:(
Any suggestion for the adjustment?

>
>>>> Note that it may aggravate truesize underestimate problem for
>>>> skb as there is no page splitting for those pages, if driver
>>>> need a accuate truesize, it may calculate that according to
>>>
>>> accurate
>>>
>>>> frag size, page order and PAGE_POOL_DMA_USE_PP_FRAG_COUNT
>>>> being true or not. And we may provide a helper for that if it
>>>> turns out to be helpful.
>>>>
>>>> 1. https://lore.kernel.org/all/20211117075652.58299-1-linyunsheng@xxxxxxxxxx/
>>>
>>>> + /* Return error here to avoid writing to page->pp_frag_count in
>>>> + * mlx5e_page_release_fragmented() for page->pp_frag_count is
>>>
>>> I don't see any direct access to pp_frag_count anywhere outside of
>>> page_pool.h in net-next. PAGE_POOL_DMA_USE_PP_FRAG_COUNT sounds like
>>> an internal flag, drivers shouldn't be looking at it, IMO.
>>
>> mlx5e_page_release_fragmented() calls page_pool_defrag_page(), maybe
>> below is more correct:
>>
>> /* Return error here to avoid mlx5e_page_release_fragmented() calling
>> * page_pool_defrag_page() to write to page->pp_frag_count which is
>> * not usable for arch with PAGE_POOL_DMA_USE_PP_FRAG_COUNT being true.
>> */
>>
>> I am agree with you about that drivers shouldn't be looking at it. But
>> adding PAGE_POOL_DMA_USE_PP_FRAG_COUNT checking back to mlx5 seems to be
>> the simplest way I can think of because of the reason mentioned above.
>>
>> And it seems that it is hard to change mlx5 to use frag API according to
>> the below disscusion with Alexander:
>>
>> https://lore.kernel.org/all/CAKgT0UeD=sboWNUsP33_UsKEKyqTBfeOqNO5NCdFaxh9KXEG3w@xxxxxxxxxxxxxx/
>
> It's better to add a flag like PP_FLAG_PAGE_FRAG for this use case and
> have pool creation fail than poke at internals in the driver, IMO.

Jesper also had the similiar comment about that, but we decided to postpone
that because of the naming, any better name for that flag in mind?

maybe something as:
PP_FLAG_DRIVER_FRAG_HANDLING?

>
> .
>