Re: [PATCH] neighbour: purge nf_bridged skb from foreign device neigh

From: Eric Dumazet
Date: Mon Jan 08 2024 - 04:11:15 EST


On Mon, Jan 8, 2024 at 9:52 AM Pavel Tikhomirov
<ptikhomirov@xxxxxxxxxxxxx> wrote:
>
> An skb can be added to a neigh->arp_queue while waiting for an ARP
> reply, and the original skb's skb->dev can differ from the neigh's
> neigh->dev. For instance, when bridging a DNATed skb from one veth to
> another, the skb is added to a neigh->arp_queue of the bridge.
>
> There is no explicit mechanism that prevents the original skb->dev link
> of such an skb from being freed under us. For instance, neigh_flush_dev
> does not clean up skbs from a different device's neigh queue. But that
> original link can still be used and lead to a crash, e.g. on this stack:
>
> arp_process
>   neigh_update
>     skb = __skb_dequeue(&neigh->arp_queue)
>       neigh_resolve_output(..., skb)
>         ...
>           br_nf_dev_xmit
>             br_nf_pre_routing_finish_bridge_slow
>               skb->dev = nf_bridge->physindev
>               br_handle_frame_finish
>
> So let's improve neigh_flush_dev to also purge skbs whose
> skb->nf_bridge->physindev is the device being destroyed.
>
> Signed-off-by: Pavel Tikhomirov <ptikhomirov@xxxxxxxxxxxxx>
> ---
> I'm not fully sure, but likely it is:
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> ---
> net/core/neighbour.c | 26 ++++++++++++++++++++++++++
> 1 file changed, 26 insertions(+)
>
> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index 552719c3bbc3d..47d2d52f17da3 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -39,6 +39,9 @@
> #include <linux/inetdevice.h>
> #include <net/addrconf.h>
>
> +#include <linux/skbuff.h>
> +#include <linux/netfilter_bridge.h>
> +
> #include <trace/events/neigh.h>
>
> #define NEIGH_DEBUG 1
> @@ -377,6 +380,28 @@ static void pneigh_queue_purge(struct sk_buff_head *list, struct net *net,
> }
> }
>
> +static void neigh_purge_nf_bridge_dev(struct neighbour *neigh, struct net_device *dev)
> +{
> +	struct sk_buff_head *list = &neigh->arp_queue;
> +	struct nf_bridge_info *nf_bridge;
> +	struct sk_buff *skb, *next;
> +
> +	write_lock(&neigh->lock);
> +	skb = skb_peek(list);
> +	while (skb) {
> +		nf_bridge = nf_bridge_info_get(skb);

This depends on CONFIG_BRIDGE_NETFILTER.

Can we solve this issue without adding another layer violation?
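
To illustrate the build-dependency half of this remark only, here is a
minimal userspace sketch of the usual stub pattern: an option-dependent
lookup is hidden behind a helper that collapses to "nothing attached" when
the option is off, so the caller carries no #ifdef. Every name below
(SKETCH_BRIDGE_NETFILTER, sketch_skb, sketch_skb_physindev, ...) is a
hypothetical stand-in, not a kernel symbol, and this does not address the
layering concern itself.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a device, a bridge extension and an skb. */
struct sketch_dev { int ifindex; };
struct sketch_bridge_info { struct sketch_dev *physindev; };
struct sketch_skb { struct sketch_bridge_info *bridge_ext; };

/*
 * SKETCH_BRIDGE_NETFILTER is a made-up macro playing the role of a
 * Kconfig option. When it is not defined, the helper is a stub that
 * reports no originating device.
 */
#ifdef SKETCH_BRIDGE_NETFILTER
static struct sketch_dev *sketch_skb_physindev(const struct sketch_skb *skb)
{
	return skb->bridge_ext ? skb->bridge_ext->physindev : NULL;
}
#else
static struct sketch_dev *sketch_skb_physindev(const struct sketch_skb *skb)
{
	(void)skb;
	return NULL;
}
#endif

/* Generic caller: compiles the same way with or without the option. */
static int sketch_skb_pins_dev(const struct sketch_skb *skb,
			       const struct sketch_dev *dev)
{
	return dev != NULL && sketch_skb_physindev(skb) == dev;
}
```

With the macro undefined, sketch_skb_pins_dev() always returns 0, so the
generic code builds and simply never matches.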

> +
> +		next = skb_peek_next(skb, list);
> +		if (nf_bridge && nf_bridge->physindev == dev) {
> +			__skb_unlink(skb, list);
> +			neigh->arp_queue_len_bytes -= skb->truesize;
> +			kfree_skb(skb);
> +		}
> +		skb = next;
> +	}
> +	write_unlock(&neigh->lock);
> +}
> +
> +
>  static void neigh_flush_dev(struct neigh_table *tbl, struct net_device *dev,
>  			    bool skip_perm)
>  {
> @@ -393,6 +418,7 @@ static void neigh_flush_dev(struct neigh_table *tbl, struct net_device *dev,
>  		while ((n = rcu_dereference_protected(*np,
>  					lockdep_is_held(&tbl->lock))) != NULL) {
>  			if (dev && n->dev != dev) {
> +				neigh_purge_nf_bridge_dev(n, dev);
>  				np = &n->next;
>  				continue;
>  			}
> --
> 2.43.0
>
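
For readers following the purge loop in the quoted patch, the traversal can
be modelled in plain userspace C: walk a queue, unlink every entry that
still points at the departing device, and keep the byte accounting in sync.
This is only a sketch under stated assumptions. It uses a singly linked
list instead of sk_buff_head, takes no lock, and all names (model_dev,
model_pkt, model_queue, model_purge_dev, ...) are hypothetical rather than
kernel APIs.

```c
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins; none of these are kernel types. */
struct model_dev { int ifindex; };

struct model_pkt {
	struct model_pkt *next;
	struct model_dev *physindev;	/* NULL models "no bridge info" */
	size_t truesize;
};

struct model_queue {
	struct model_pkt *head;
	size_t len_bytes;		/* models arp_queue_len_bytes */
};

/* Allocate a packet; returns NULL if malloc() fails. */
static struct model_pkt *model_pkt_new(struct model_dev *physindev,
				       size_t truesize)
{
	struct model_pkt *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	p->next = NULL;
	p->physindev = physindev;
	p->truesize = truesize;
	return p;
}

/* Append a packet to the tail and account for its size. */
static void model_enqueue(struct model_queue *q, struct model_pkt *p)
{
	struct model_pkt **pp = &q->head;

	while (*pp)
		pp = &(*pp)->next;
	p->next = NULL;
	*pp = p;
	q->len_bytes += p->truesize;
}

/*
 * Drop every queued packet whose physindev is dev.
 * Returns the number of packets dropped.
 */
static unsigned int model_purge_dev(struct model_queue *q,
				    const struct model_dev *dev)
{
	struct model_pkt **pp = &q->head;
	unsigned int dropped = 0;

	while (*pp) {
		struct model_pkt *p = *pp;

		if (p->physindev == dev) {
			*pp = p->next;		/* unlink before freeing */
			q->len_bytes -= p->truesize;
			free(p);
			dropped++;
		} else {
			pp = &p->next;		/* keep it, move on */
		}
	}
	return dropped;
}
```

The pointer-to-pointer walk plays the same role as saving "next" before
__skb_unlink() in the patch: the cursor never depends on an entry that has
already been freed.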