Re: [PATCH net-next 9/9] net: skbuff: always try to recycle PP pages directly when in softirq

From: Jesper Dangaard Brouer
Date: Fri Jul 28 2023 - 05:33:41 EST




On 27/07/2023 16.43, Alexander Lobakin wrote:
Commit 8c48eea3adf3 ("page_pool: allow caching from safely localized
NAPI") allowed direct recycling of skb pages to their PP for some cases,
but unfortunately missed a couple of other majors.
For example, %XDP_DROP in skb mode. The netstack just calls kfree_skb(),
which unconditionally passes `false` as @napi_safe. Thus, all pages go
through ptr_ring and locks, although most of time we're actually inside
the NAPI polling this PP is linked with, so that it would be perfectly
safe to recycle pages directly.

The commit message is hard to read. It would help me as the reader if
you used an empty line between paragraphs, like in this location (same
goes for the other commit descs).

Let's address such. If @napi_safe is true, we're fine, don't change
anything for this path. But if it's false, check whether we are in the
softirq context. It will most likely be so and then if ->list_owner
is our current CPU, we're good to use direct recycling, even though
@napi_safe is false -- concurrent access is excluded. in_softirq()
protection is needed mostly due to we can hit this place in the
process context (not the hardirq though).

This patch makes me a little nervous, as it can create hard-to-debug bugs
if this isn't 100% correct. (Thanks for the previous patch that excludes
hardirq via lockdep).

For the mentioned xdp-drop-skb-mode case, the improvement I got is
3-4% in Mpps. As for page_pool stats, recycle_ring is now 0 and
alloc_slow counter doesn't change most of time, which means the
MM layer is not even called to allocate any new pages.

Suggested-by: Jakub Kicinski <kuba@xxxxxxxxxx> # in_softirq()
Signed-off-by: Alexander Lobakin <aleksander.lobakin@xxxxxxxxx>
---
net/core/skbuff.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index e701401092d7..5ba3948cceed 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -901,8 +901,10 @@ bool page_pool_return_skb_page(struct page *page, bool napi_safe)
 	/* Allow direct recycle if we have reasons to believe that we are
 	 * in the same context as the consumer would run, so there's
 	 * no possible race.
+	 * __page_pool_put_page() makes sure we're not in hardirq context
+	 * and interrupts are enabled prior to accessing the cache.
 	 */
-	if (napi_safe) {
+	if (napi_safe || in_softirq()) {

I used to have in_serving_softirq() in PP to keep process context that
had merely disabled BH from doing direct recycling (into a lockless array).
This changed in kernel v6.3 with commit 542bcea4be86 ("net: page_pool: use
in_softirq() instead") to help threaded NAPI. Nothing blew up, so I guess
it was okay to relax this.
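
For reference, a rough userspace sketch of how the two checks differ.
The MODEL_* constants mirror what I remember of the softirq bits in
include/linux/preempt.h on a non-RT kernel; the model_* helpers and the
plain counter are made up for illustration and are not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: the softirq count occupies bits 8..15 of preempt_count. */
#define MODEL_SOFTIRQ_SHIFT		8
#define MODEL_SOFTIRQ_OFFSET		(1UL << MODEL_SOFTIRQ_SHIFT)
#define MODEL_SOFTIRQ_MASK		(0xffUL << MODEL_SOFTIRQ_SHIFT)
#define MODEL_SOFTIRQ_DISABLE_OFFSET	(2 * MODEL_SOFTIRQ_OFFSET)

/* Hypothetical stand-in for the per-CPU preempt_count. */
static unsigned long model_preempt_count;

/* Process context disabling/enabling BH. */
static void model_local_bh_disable(void)
{
	model_preempt_count += MODEL_SOFTIRQ_DISABLE_OFFSET;
}

static void model_local_bh_enable(void)
{
	model_preempt_count -= MODEL_SOFTIRQ_DISABLE_OFFSET;
}

/* Entering/leaving actual softirq processing. */
static void model_softirq_enter(void)
{
	model_preempt_count += MODEL_SOFTIRQ_OFFSET;
}

static void model_softirq_exit(void)
{
	model_preempt_count -= MODEL_SOFTIRQ_OFFSET;
}

/* True while serving a softirq OR while BH is merely disabled. */
static bool model_in_softirq(void)
{
	return (model_preempt_count & MODEL_SOFTIRQ_MASK) != 0;
}

/* True only while actually serving a softirq. */
static bool model_in_serving_softirq(void)
{
	return (model_preempt_count & MODEL_SOFTIRQ_MASK &
		MODEL_SOFTIRQ_OFFSET) != 0;
}

/* Walk through the two cases discussed above. */
static void model_demo(void)
{
	model_preempt_count = 0;

	model_local_bh_disable();		/* process context, BH off */
	assert(model_in_softirq());
	assert(!model_in_serving_softirq());
	model_local_bh_enable();

	model_softirq_enter();			/* real softirq processing */
	assert(model_in_softirq());
	assert(model_in_serving_softirq());
	model_softirq_exit();
}
```

In this model, the BH-disabled process context is exactly the case that
in_serving_softirq() rejected and in_softirq() now accepts.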

 		const struct napi_struct *napi = READ_ONCE(pp->p.napi);
 		allow_direct = napi &&

AFAIK this in_softirq() will also allow process context with BH disabled
to recycle directly into the PP lockless array. I assume the additional
checks (just outside the diff context above) make sure the CPU
(smp_processor_id()) also matches. Is this safe?
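
To make the question concrete, here is a minimal userspace sketch of the
decision as I read it from the patch description (@napi_safe,
in_softirq(), ->list_owner, smp_processor_id()). All model_* types and
names are hypothetical stand-ins, and the exact kernel code may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the NAPI instance linked to the PP. */
struct model_napi {
	int list_owner;		/* CPU polling this NAPI, -1 if none */
};

/* Hypothetical snapshot of the caller's execution context. */
struct model_ctx {
	bool in_softirq;	/* models in_softirq(): softirq or BH off */
	int cpu;		/* models smp_processor_id() */
};

/* Models the allow_direct decision described in the commit message. */
static bool model_allow_direct(const struct model_ctx *ctx,
			       const struct model_napi *napi,
			       bool napi_safe)
{
	/* Neither the caller's promise nor the context check holds. */
	if (!(napi_safe || ctx->in_softirq))
		return false;

	/* Direct recycling only if this CPU owns the NAPI right now. */
	return napi != NULL && napi->list_owner == ctx->cpu;
}

static void model_allow_direct_demo(void)
{
	struct model_napi napi = { .list_owner = 0 };
	struct model_ctx process = { .in_softirq = false, .cpu = 0 };
	struct model_ctx bh_on_owner = { .in_softirq = true, .cpu = 0 };
	struct model_ctx bh_on_other = { .in_softirq = true, .cpu = 1 };

	/* Plain process context with napi_safe == false: ring path. */
	assert(!model_allow_direct(&process, &napi, false));
	/* Softirq or BH-disabled context on the owning CPU: direct. */
	assert(model_allow_direct(&bh_on_owner, &napi, false));
	/* Same context on a different CPU: ring path. */
	assert(!model_allow_direct(&bh_on_other, &napi, false));
	/* No NAPI linked to the pool: ring path even if napi_safe. */
	assert(!model_allow_direct(&bh_on_owner, NULL, true));
}
```

The model cannot tell a real softirq apart from process context with BH
disabled (both set in_softirq), which is the case I am asking about.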

--Jesper