Re: [RFC PATCH net-next v5 2/2] net: add netmem to skb_frag_t

From: Yunsheng Lin
Date: Wed Jan 17 2024 - 04:29:00 EST


On 2024/1/16 20:16, Jason Gunthorpe wrote:
> On Tue, Jan 16, 2024 at 07:04:13PM +0800, Yunsheng Lin wrote:
>> On 2024/1/16 8:01, Jason Gunthorpe wrote:
>>> On Mon, Jan 15, 2024 at 03:23:33PM -0800, Mina Almasry wrote:
>>>>>> You did not answer my question that I asked here, and ignoring this
>>>>>> question is preventing us from making any forward progress on this
>>>>>> discussion. What do you expect or want skb_frag_page() to do when
>>>>>> there is no page in the frag?
>>>>>
>>>>> I would expect it to do nothing.
>>>>
>>>> I don't understand. skb_frag_page() with an empty implementation just
>>>> results in a compiler error as the function needs to return a page
>>>> pointer. Do you actually expect skb_frag_page() to unconditionally
>>>> cast frag->netmem to a page pointer? That was explained as
>>>> unacceptable over and over again by Jason and Christian as it risks
>>>> casting devmem to page; completely unacceptable and will get nacked.
>>>> Do you have a suggestion of what skb_frag_page() should do that will
>>>> not get nacked by mm?
>>>
>>> WARN_ON and return NULL seems reasonable?
>>
>> While I am agreed that it may be a nightmare to debug the case of passing
>> a false page into the mm system, but I am not sure what's the point of
>> returning NULL to caller if the caller is not expecting or handling
>> the
>
> You have to return something and NULL will largely reliably crash the
> thread. The WARN_ON explains in detail why your thread just crashed.
>
>> NULL returning[for example, most of mm API called by the networking does not
>> seems to handling NULL as input page], isn't the NULL returning will make
>> the kernel panic anyway? Doesn't it make more sense to just add a BUG_ON()
>> depending on some configuration like CONFIG_DEBUG_NET or CONFIG_DEVMEM?
>> As returning NULL seems to be causing a confusion for the caller of
>> skb_frag_page() as whether to or how to handle the NULL returning case.
>
> Possibly, though Linus doesn't like BUG_ON on principle..

>From the below discussion, it seems the BUG_ON belonging to something
like below:
(a) development code where it replaces error handling that you just
haven't written yet, and you haven't really thought through all the
possibilities, so you're saying "this can't happen, I'll fix it
later".

https://linux.kernel.narkive.com/WPFYQ9d4/bug-on-in-workingset-node-shadows-dec-triggers#post6

Anyway, it would be fine for me if skb_frag_page() is not used to decide
if a normal page or a devmem is in a frag. As for the overhead of catching
the calling of skb_frag_page() for devmem frag, it would be better to avoid
it using some configuration like CONFIG_DEBUG_NET or CONFIG_DEVMEM.

>
> I think the bigger challenge is convincing people that this devmem
> stuff doesn't just open a bunch of holes in the kernel where userspace
> can crash it.
>
> The fact you all are debating what to do with skb_frag_page() suggests
> to me there isn't confidence...
>
> Jason
> .
>