[PATCH net-next 23/24] net: qlogic, socionext, stmmac, cpsw: Use nested-BH locking for XDP redirect.

From: Sebastian Andrzej Siewior
Date: Fri Dec 15 2023 - 12:18:29 EST


The per-CPU variables used during bpf_prog_run_xdp() invocation and
later during xdp_do_redirect() rely on disabled BH for their protection.
Without locking in local_bh_disable() on PREEMPT_RT these data structure
require explicit locking.

This is a follow-up on the previous change which introduced
bpf_run_lock.redirect_lock and uses it now within drivers.

The simple way is to acquire the lock before bpf_prog_run_xdp() is
invoked and hold it until the end of function.
This does not always work because some drivers (cpsw, atlantic) invoke
xdp_do_flush() in the same context.
Acquiring the lock in bpf_prog_run_xdp() and dropping in
xdp_do_redirect() (without touching drivers) does not work because not
all driver, which use bpf_prog_run_xdp(), do support XDP_REDIRECT (and
invoke xdp_do_redirect()).

Ideally the minimal locking scope would be bpf_prog_run_xdp() +
xdp_do_redirect() and everything else (error recovery, DMA unmapping,
free/ alloc of memory, …) would happen outside of the locked section.

Cc: Alexandre Torgue <alexandre.torgue@xxxxxxxxxxx>
Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
Cc: Ariel Elior <aelior@xxxxxxxxxxx>
Cc: Ilias Apalodimas <ilias.apalodimas@xxxxxxxxxx>
Cc: Jassi Brar <jaswinder.singh@xxxxxxxxxx>
Cc: Jesper Dangaard Brouer <hawk@xxxxxxxxxx>
Cc: John Fastabend <john.fastabend@xxxxxxxxx>
Cc: Jose Abreu <joabreu@xxxxxxxxxxxx>
Cc: Manish Chopra <manishc@xxxxxxxxxxx>
Cc: Maxime Coquelin <mcoquelin.stm32@xxxxxxxxx>
Cc: Ravi Gunasekaran <r-gunasekaran@xxxxxx>
Cc: Roger Quadros <rogerq@xxxxxxxxxx>
Cc: Siddharth Vadapalli <s-vadapalli@xxxxxx>
Cc: bpf@xxxxxxxxxxxxxxx
Cc: linux-omap@xxxxxxxxxxxxxxx
Cc: linux-stm32@xxxxxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
---
drivers/net/ethernet/qlogic/qede/qede_fp.c | 1 +
drivers/net/ethernet/socionext/netsec.c | 1 +
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 1 +
drivers/net/ethernet/ti/cpsw_priv.c | 15 +++++++++------
4 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qede/qede_fp.c b/drivers/net/ethernet/qlogic/qede/qede_fp.c
index cb1746bc0e0c5..ce5af094fb817 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_fp.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_fp.c
@@ -1091,6 +1091,7 @@ static bool qede_rx_xdp(struct qede_dev *edev,
xdp_prepare_buff(&xdp, page_address(bd->data), *data_offset,
*len, false);

+ guard(local_lock_nested_bh)(&bpf_run_lock.redirect_lock);
act = bpf_prog_run_xdp(prog, &xdp);

/* Recalculate, as XDP might have changed the headers */
diff --git a/drivers/net/ethernet/socionext/netsec.c b/drivers/net/ethernet/socionext/netsec.c
index 0891e9e49ecb5..47e314338f3f3 100644
--- a/drivers/net/ethernet/socionext/netsec.c
+++ b/drivers/net/ethernet/socionext/netsec.c
@@ -905,6 +905,7 @@ static u32 netsec_run_xdp(struct netsec_priv *priv, struct bpf_prog *prog,
int err;
u32 act;

+ guard(local_lock_nested_bh)(&bpf_run_lock.redirect_lock);
act = bpf_prog_run_xdp(prog, xdp);

/* Due xdp_adjust_tail: DMA sync for_device cover max len CPU touch */
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 37e64283f9107..9e92affc8c22c 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -4893,6 +4893,7 @@ static int __stmmac_xdp_run_prog(struct stmmac_priv *priv,
u32 act;
int res;

+ guard(local_lock_nested_bh)(&bpf_run_lock.redirect_lock);
act = bpf_prog_run_xdp(prog, xdp);
switch (act) {
case XDP_PASS:
diff --git a/drivers/net/ethernet/ti/cpsw_priv.c b/drivers/net/ethernet/ti/cpsw_priv.c
index 764ed298b5708..f38c49f9fab35 100644
--- a/drivers/net/ethernet/ti/cpsw_priv.c
+++ b/drivers/net/ethernet/ti/cpsw_priv.c
@@ -1335,9 +1335,15 @@ int cpsw_run_xdp(struct cpsw_priv *priv, int ch, struct xdp_buff *xdp,
if (!prog)
return CPSW_XDP_PASS;

- act = bpf_prog_run_xdp(prog, xdp);
- /* XDP prog might have changed packet data and boundaries */
- *len = xdp->data_end - xdp->data;
+ scoped_guard(local_lock_nested_bh, &bpf_run_lock.redirect_lock) {
+ act = bpf_prog_run_xdp(prog, xdp);
+ /* XDP prog might have changed packet data and boundaries */
+ *len = xdp->data_end - xdp->data;
+ if (act == XDP_REDIRECT) {
+ if (xdp_do_redirect(ndev, xdp, prog))
+ goto drop;
+ }
+ }

switch (act) {
case XDP_PASS:
@@ -1352,9 +1358,6 @@ int cpsw_run_xdp(struct cpsw_priv *priv, int ch, struct xdp_buff *xdp,
xdp_return_frame_rx_napi(xdpf);
break;
case XDP_REDIRECT:
- if (xdp_do_redirect(ndev, xdp, prog))
- goto drop;
-
/* Have to flush here, per packet, instead of doing it in bulk
* at the end of the napi handler. The RX devices on this
* particular hardware is sharing a common queue, so the
--
2.43.0