Re: [PATCH v5 bpf 1/4] lwt: fix return values of BPF ops

From: Yan Zhai
Date: Tue Aug 15 2023 - 23:06:08 EST


On Tue, Aug 15, 2023 at 9:54 PM Yan Zhai <yan@xxxxxxxxxxxxxx> wrote:
>
> BPF encap ops can return different types of positive values, such as
> NET_RX_DROP, NET_XMIT_CN, NETDEV_TX_BUSY, and so on, from function
> skb_do_redirect and bpf_lwt_xmit_reroute. At the xmit hook, such return
> values would be treated implicitly as LWTUNNEL_XMIT_CONTINUE in
> ip(6)_finish_output2. When this happens, skbs that have been freed would
> continue to the neighbor subsystem, causing use-after-free bugs and
> kernel crashes.
>
> To fix the incorrect behavior, skb_do_redirect return values can be
> simply discarded, the same as tc-egress behavior. On the other hand,
> bpf_lwt_xmit_reroute returns useful errors to local senders, e.g. PMTU
> information. Thus convert its return values to avoid the conflict with
> LWTUNNEL_XMIT_CONTINUE.
>
> Fixes: 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure")
> Suggested-by: Martin KaFai Lau <martin.lau@xxxxxxxxx>
> Suggested-by: Stanislav Fomichev <sdf@xxxxxxxxxx>
> Reported-by: Jordan Griege <jgriege@xxxxxxxxxxxxxx>
> Signed-off-by: Yan Zhai <yan@xxxxxxxxxxxxxx>
> ---
> * v5: discards skb_do_redirect return instead; convert
> bpf_lwt_xmit_reroute return;
> * v4: minor commit message changes
> * v3: converts skb_do_redirect statuses from both ingress and egress
> * v2: code style amend
> ---
> net/core/lwt_bpf.c | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
> index 8b6b5e72b217..4a0797f0a154 100644
> --- a/net/core/lwt_bpf.c
> +++ b/net/core/lwt_bpf.c
> @@ -60,9 +60,8 @@ static int run_lwt_bpf(struct sk_buff *skb, struct bpf_lwt_prog *lwt,
> ret = BPF_OK;
> } else {
> skb_reset_mac_header(skb);
> - ret = skb_do_redirect(skb);
> - if (ret == 0)
> - ret = BPF_REDIRECT;
> + skb_do_redirect(skb);
> + ret = BPF_REDIRECT;
> }
> break;
>
> @@ -255,7 +254,7 @@ static int bpf_lwt_xmit_reroute(struct sk_buff *skb)
>
> err = dst_output(dev_net(skb_dst(skb)->dev), skb->sk, skb);
> if (unlikely(err))
> - return err;
> + return net_xmit_errno(err);
>
> /* ip[6]_finish_output2 understand LWTUNNEL_XMIT_DONE */
> return LWTUNNEL_XMIT_DONE;
> --
> 2.30.2
>
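For anyone following the return-value reasoning in the commit message,
here is a minimal user-space sketch, not kernel code. The constant values
and the net_xmit_errno() macro are assumed to mirror
include/linux/netdevice.h and include/net/lwtunnel.h as of this patch;
please check them against your tree. caller_stops() and
reroute_ret_fixed() are hypothetical helpers made up for illustration.
They stand in for the check in ip(6)_finish_output2 and for the patched
tail of bpf_lwt_xmit_reroute.

```c
#include <errno.h>

/* Assumed to match the kernel headers at the time of this patch. */
#define NET_XMIT_SUCCESS 0x00
#define NET_XMIT_DROP    0x01
#define NET_XMIT_CN      0x02
#define NETDEV_TX_BUSY   0x10

enum {
	LWTUNNEL_XMIT_DONE,      /* 0: skb was consumed by the lwt xmit op */
	LWTUNNEL_XMIT_CONTINUE,  /* 1: caller goes on to the neighbor layer */
};

/* Assumed shape of the kernel macro: congestion notification is not
 * reported as an error, every other non-zero code becomes -ENOBUFS. */
#define net_xmit_errno(e) ((e) != NET_XMIT_CN ? -ENOBUFS : 0)

/* Hypothetical model of the xmit-hook check: the caller stops only on
 * a negative value or LWTUNNEL_XMIT_DONE. Any other positive value
 * falls through, i.e. is implicitly treated as "continue". */
static int caller_stops(int res)
{
	return res < 0 || res == LWTUNNEL_XMIT_DONE;
}

/* Hypothetical model of what bpf_lwt_xmit_reroute returns after the
 * patch, given the value that dst_output() handed back. */
static int reroute_ret_fixed(int err)
{
	if (err)
		return net_xmit_errno(err);
	return LWTUNNEL_XMIT_DONE;
}
```

Under these assumptions, caller_stops(NET_XMIT_DROP) is 0, so the caller
would carry on with an skb that may already be freed, while
caller_stops(reroute_ret_fixed(NET_XMIT_DROP)) is 1 because the value is
converted to -ENOBUFS first.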

No idea why this one appears nested and without a subject on the lore
link. Let me double-check what is wrong with my mutt settings :(

--
Yan