Re: [PATCH] neighbour: purge nf_bridged skb from foreign device neigh

From: Pavel Tikhomirov
Date: Wed Jan 10 2024 - 06:17:01 EST


Here is the new version https://lore.kernel.org/netdev/20240110110451.5473-1-ptikhomirov@xxxxxxxxxxxxx/

On 09/01/2024 19:12, Florian Westphal wrote:
Pavel Tikhomirov <ptikhomirov@xxxxxxxxxxxxx> wrote:
index f980edfdd2783..105fbdb029261 100644
--- a/include/linux/netfilter_bridge.h
+++ b/include/linux/netfilter_bridge.h
@@ -56,11 +56,15 @@ static inline int nf_bridge_get_physoutif(const struct
sk_buff *skb)
}

static inline struct net_device *
-nf_bridge_get_physindev(const struct sk_buff *skb)
+nf_bridge_get_physindev_rcu(const struct sk_buff *skb)
{
const struct nf_bridge_info *nf_bridge = nf_bridge_info_get(skb);
+ struct net_device *dev;

- return nf_bridge ? nf_bridge->physindev : NULL;
+ if (!nf_bridge || !skb->dev)
+ return 0;
+
+ return dev_get_by_index_rcu(skb->dev->net, nf_bridge->physindev_if);

You could use dev_net(skb->dev), yes.

In br_nf_pre_routing_finish_bridge_slow I had to use dev_net(skb->dev).


Or create a preparation patch that does:

-nf_bridge_get_physindev(const struct sk_buff *skb)
+nf_bridge_get_physindev(const struct sk_buff *skb, struct net *net)

(all callers have a struct net available).

For all other cases I did the prep-patch propagating net.


No need to rename the function, see below.

- br_indev = nf_bridge_get_physindev(oldskb);
+ rcu_read_lock_bh();
+ br_indev = nf_bridge_get_physindev_rcu(oldskb);

No need for rcu read lock, all netfilter hooks run inside
rcu_read_lock().

Thanks for this hint! I have checked all those tons of cases and actually proved to myself that all cases have rcu_read_lock =)


Does it sound good?

Yes, seems ok to me.

Or maybe instead we can have extra physindev_if field in addition to
existing physindev to only do dev_get_by_index_rcu inside
br_nf_pre_routing_finish_bridge_slow to doublecheck the ->physindev link?

Sorry in advance if I'm missing anything obvious.

Alternative would be to add a 'br_nf_unreg_serno' that gets incremented
from brnf_device_event(), then store that in nf_bridge_info struct and
compare to current value before net_device deref. If not equal, toss skb.

Problem is that we'd need some indirection to retrieve the current
value, otherwise places like nfnetlink_log() gain a module dependency on
br_netfilter :-(

We'd likely need
const atomic_t *br_nf_unreg_serno __read_mostly;
EXPORT_SYMBOL_GPL(br_nf_unreg_serno);

in net/netfilter/core.c for this, then set/clear the
pointer from br_netfilter_hooks.c.

I can't say/don't know which of the two options is better/worse.

s/struct net_device */int// has the benefit of shrinking nf_bridge_info,
so I'd try that first.

Ok, did s/struct net_device */int// variant.

--
Best regards, Tikhomirov Pavel
Senior Software Developer, Virtuozzo.