[PATCH RFC net-next v2 0/7] net: page_pool: a couple of assorted optimizations

From: Alexander Lobakin
Date: Fri Jul 14 2023 - 13:10:26 EST


Here's a spin-off of the IAVF PP series[0], with 1 compile-time and several
runtime (hotpath) optimizations. They're based on and tested on top of the
hybrid PP allocation series[1], but don't require it to work and are
in general independent of it and of each other.

Per-patch breakdown:
#1: Was already on the lists, but this time it's done the other way,
the one that Alex Duyck proposed during the review of the previous
series. Slightly reduce the amount of C preprocessing by no longer
including <net/page_pool.h> in <linux/skbuff.h> (which is included
in half of the kernel sources). Especially useful with the
abovementioned series applied, as it makes page_pool.h heavier;
#2: New. Group frag_* fields of &page_pool together to reduce cache
misses;
#3-4: New, prereqs to #5. Free 4 bytes in &page_pool_params and combine them
with the already existing hole to get a free slot in the same cacheline
where the params are inside &page_pool. Use it to store the internal
PP flags, as opposed to the driver-set ones;
#5: Don't call the DMA sync externals when they won't do anything anyway,
by applying some heuristics a bit earlier (when allocating a new page).
Was also on the lists;
#6-7: New. In addition to recycling skb PP pages directly when @napi_safe
is set, check for the context we're in and always try to recycle
directly when in softirq (on the same CPU where the consumer runs).
This allows us to use direct recycling anytime we're inside a NAPI
polling loop or GRO stuff going right after it, covering way more
cases than it does right now.
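
To make the idea behind #3-5 a bit more tangible, here is a minimal
userspace sketch. It is not the code from the patches: every name in it
(demo_sync_pool, DEMO_FLAG_*, the page_needs_sync argument) is made up for
illustration, and the "heuristic" is reduced to a boolean passed by the
caller. It only shows the shape of the approach: the driver-set flags stay
untouched, an internal flags word sits right next to the params, and the
hot path consults that internal word before calling the out-of-line sync.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define DEMO_FLAG_DMA_SYNC_DEV   (1u << 0) /* driver-set: pool syncs for device */
#define DEMO_FLAG_DMA_MAYBE_SYNC (1u << 0) /* internal: a sync may do real work */

/* Driver-provided parameters; the pool never modifies them. */
struct demo_sync_pool_params {
	uint32_t flags;
	uint32_t pool_size;
};

struct demo_sync_pool {
	struct demo_sync_pool_params p;
	uint32_t int_flags;      /* internal flags, placed right after the params */
	unsigned int sync_calls; /* counts calls to the stand-in sync function */
};

/* Stand-in for an out-of-line ("external") DMA sync call. */
static void demo_dma_sync_for_device(struct demo_sync_pool *pool)
{
	pool->sync_calls++;
}

/*
 * Allocation path: decide early whether later syncs could do anything.
 * page_needs_sync is an assumed input standing in for whatever check
 * the real code performs on a freshly mapped page.
 */
static void demo_pool_page_allocated(struct demo_sync_pool *pool,
				     bool page_needs_sync)
{
	if ((pool->p.flags & DEMO_FLAG_DMA_SYNC_DEV) && page_needs_sync)
		pool->int_flags |= DEMO_FLAG_DMA_MAYBE_SYNC;
}

/* Hot path: skip the external call when it is known to be a no-op. */
static void demo_pool_recycle(struct demo_sync_pool *pool)
{
	if (pool->int_flags & DEMO_FLAG_DMA_MAYBE_SYNC)
		demo_dma_sync_for_device(pool);
}
```

Once no allocated page has ever needed a sync, demo_pool_recycle() boils
down to one test of a field that shares a cacheline with the params, and
the driver's own flags word is never written to.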

(complete tree with [1] + this + [0] is available here: [2])

[0] https://lore.kernel.org/netdev/20230530150035.1943669-1-aleksander.lobakin@xxxxxxxxx
[1] https://lore.kernel.org/netdev/20230629120226.14854-1-linyunsheng@xxxxxxxxxx
[2] https://github.com/alobakin/linux/commits/iavf-pp-frag

Alexander Lobakin (7):
net: skbuff: don't include <net/page_pool.h> to <linux/skbuff.h>
net: page_pool: place frag_* fields in one cacheline
net: page_pool: shrink &page_pool_params a tiny bit
net: page_pool: don't use driver-set flags field directly
net: page_pool: avoid calling no-op externals when possible
net: skbuff: avoid accessing page_pool if !napi_safe when returning page
net: skbuff: always try to recycle PP pages directly when in softirq

drivers/net/ethernet/engleder/tsnep_main.c | 1 +
drivers/net/ethernet/freescale/fec_main.c | 1 +
.../marvell/octeontx2/nic/otx2_common.c | 1 +
.../ethernet/marvell/octeontx2/nic/otx2_pf.c | 1 +
.../ethernet/mellanox/mlx5/core/en/params.c | 1 +
.../net/ethernet/mellanox/mlx5/core/en/xdp.c | 1 +
drivers/net/wireless/mediatek/mt76/mt76.h | 1 +
include/linux/skbuff.h | 3 +-
include/net/page_pool.h | 23 +++---
net/core/page_pool.c | 70 +++++--------------
net/core/skbuff.c | 41 +++++++++++
11 files changed, 83 insertions(+), 61 deletions(-)

---
From RFC v1[3]:
* #1: move the entire function to skbuff.c, don't try to split it (Alex);
* #2-4: new;
* #5: use internal flags field added in #4 and don't modify driver-defined
structure (Alex, Jakub);
* #6: new;
* drop "add new NAPI state" as a redundant complication;
* #7: replace the check for the new NAPI state with just in_softirq(), should
be fine (Jakub).
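
For reference, the decision described for #7 can be pictured roughly as
below. This is a userspace model, not the patch itself: struct
demo_rc_ctx stands in for the kernel's in_softirq() and current-CPU
queries, consumer_cpu is a hypothetical field naming the CPU the consumer
runs on, and applying the same-CPU check to the @napi_safe case as well
is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_recycle {
	DEMO_RECYCLE_DIRECT, /* put the page straight into the lockless cache */
	DEMO_RECYCLE_RING,   /* fall back to the slower shared ring */
};

/* Stand-in for the execution context the kernel would query itself. */
struct demo_rc_ctx {
	bool in_softirq; /* models in_softirq() */
	int cpu;         /* models the CPU we are currently running on */
};

struct demo_rc_pool {
	int consumer_cpu; /* hypothetical: CPU where the consumer runs */
};

static enum demo_recycle demo_pick_recycle(const struct demo_rc_pool *pool,
					    const struct demo_rc_ctx *ctx,
					    bool napi_safe)
{
	/* Either the caller vouches for NAPI context or we detect softirq. */
	bool may_be_direct = napi_safe || ctx->in_softirq;

	/* Direct recycling is only attempted on the consumer's own CPU. */
	if (may_be_direct && pool->consumer_cpu == ctx->cpu)
		return DEMO_RECYCLE_DIRECT;

	return DEMO_RECYCLE_RING;
}
```

The point of the model is that callers which cannot set @napi_safe still
end up on the direct path whenever they happen to run in softirq on the
right CPU, with no extra NAPI state involved.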

[3] https://lore.kernel.org/netdev/20230629152305.905962-1-aleksander.lobakin@xxxxxxxxx
--
2.41.0