[PATCH 4.4 17/28] ipv6: make ECMP route replacement less greedy

From: Greg Kroah-Hartman
Date: Mon Mar 20 2017 - 13:51:17 EST


4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Sabrina Dubroca <sd@xxxxxxxxxxxxxxx>


[ Upstream commit 67e194007be08d071294456274dd53e0a04fdf90 ]

Commit 27596472473a ("ipv6: fix ECMP route replacement") introduced a
loop that removes all siblings of an ECMP route that is being
replaced. However, this loop doesn't stop when it has replaced
siblings, and keeps removing other routes with a higher metric.
We also end up triggering the WARN_ON after the loop, because after
this nsiblings < 0.

Instead, stop the loop when we have taken care of all routes with the
same metric as the route being replaced.

Reproducer:
===========
#!/bin/sh

ip netns add ns1
ip netns add ns2
ip -net ns1 link set lo up

for x in 0 1 2 ; do
ip link add veth$x netns ns2 type veth peer name eth$x netns ns1
ip -net ns1 link set eth$x up
ip -net ns2 link set veth$x up
done

ip -net ns1 -6 r a 2000::/64 nexthop via fe80::0 dev eth0 \
nexthop via fe80::1 dev eth1 nexthop via fe80::2 dev eth2
ip -net ns1 -6 r a 2000::/64 via fe80::42 dev eth0 metric 256
ip -net ns1 -6 r a 2000::/64 via fe80::43 dev eth0 metric 2048

echo "before replace, 3 routes"
ip -net ns1 -6 r | grep -v '^fe80\|^ff00'
echo

ip -net ns1 -6 r c 2000::/64 nexthop via fe80::4 dev eth0 \
nexthop via fe80::5 dev eth1 nexthop via fe80::6 dev eth2

echo "after replace, only 2 routes, metric 2048 is gone"
ip -net ns1 -6 r | grep -v '^fe80\|^ff00'

Fixes: 27596472473a ("ipv6: fix ECMP route replacement")
Signed-off-by: Sabrina Dubroca <sd@xxxxxxxxxxxxxxx>
Acked-by: Nicolas Dichtel <nicolas.dichtel@xxxxxxxxx>
Reviewed-by: Xin Long <lucien.xin@xxxxxxxxx>
Reviewed-by: Michal Kubecek <mkubecek@xxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
net/ipv6/ip6_fib.c | 2 ++
1 file changed, 2 insertions(+)

--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -903,6 +903,8 @@ add:
ins = &rt->dst.rt6_next;
iter = *ins;
while (iter) {
+ if (iter->rt6i_metric > rt->rt6i_metric)
+ break;
if (rt6_qualify_for_ecmp(iter)) {
*ins = iter->dst.rt6_next;
fib6_purge_rt(iter, fn, info->nl_net);