Re: POSSIBLE BUG: selftests/net/fcnal-test.sh: [FAIL] in vrf "bind - ns-B IPv6 LLA" test

From: Mirsad Goran Todorovac
Date: Tue Jun 06 2023 - 15:17:37 EST


On 6/6/23 20:50, Guillaume Nault wrote:
On Tue, Jun 06, 2023 at 04:28:02PM +0200, Mirsad Todorovac wrote:
On 6/6/23 16:11, Guillaume Nault wrote:
On Tue, Jun 06, 2023 at 03:57:35PM +0200, Mirsad Todorovac wrote:
+	if (oif) {
+		rcu_read_lock();
+		dev = dev_get_by_index_rcu(net, oif);
+		rcu_read_unlock();

You can't assume '*dev' is still valid after rcu_read_unlock() unless
you hold a reference on it.

+		rtnl_lock();
+		mdev = netdev_master_upper_dev_get(dev);
+		rtnl_unlock();

Because of that, 'dev' might have already disappeared at the time
netdev_master_upper_dev_get() is called. So it may dereference an
invalid pointer here.
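
To illustrate what "hold a reference" means here, below is a minimal userspace sketch. It is not kernel code: the stubs only mimic the kernel's dev_get_by_index(), dev_put(), rtnl_lock()/rtnl_unlock() and netdev_master_upper_dev_get(), and the device table and the helper name master_ifindex_of() are hypothetical. The point is only the ordering: the lookup takes a reference, and that reference is dropped after the last use of 'dev'. Later in the thread this approach is judged unnatural because it still needs the rtnl.

```c
#include <stddef.h>

struct net;				/* opaque, unused by the stubs */

struct net_device {
	int ifindex;
	int refcnt;			/* toy reference count */
	struct net_device *master;	/* upper device, or NULL */
};

/* Hypothetical devices: eth3 (ifindex 6) is enslaved to vrf_blue (7). */
static struct net_device vrf_blue = { .ifindex = 7, .refcnt = 1, .master = NULL };
static struct net_device eth3 = { .ifindex = 6, .refcnt = 1, .master = &vrf_blue };

static int rtnl_held;
static void rtnl_lock(void)   { rtnl_held = 1; }
static void rtnl_unlock(void) { rtnl_held = 0; }

/* Stand-in: like the kernel helper, returns with a reference taken. */
static struct net_device *dev_get_by_index(struct net *net, int ifindex)
{
	(void)net;
	if (ifindex == eth3.ifindex) {
		eth3.refcnt++;
		return &eth3;
	}
	if (ifindex == vrf_blue.ifindex) {
		vrf_blue.refcnt++;
		return &vrf_blue;
	}
	return NULL;
}

static void dev_put(struct net_device *dev)
{
	if (dev)
		dev->refcnt--;
}

/* Stand-in: only meaningful while the (toy) rtnl is held. */
static struct net_device *netdev_master_upper_dev_get(struct net_device *dev)
{
	return rtnl_held ? dev->master : NULL;
}

/* Returns the master's ifindex for 'oif', or 0 if there is none. */
static int master_ifindex_of(struct net *net, int oif)
{
	struct net_device *dev, *mdev;
	int ifindex = 0;

	dev = dev_get_by_index(net, oif);	/* reference taken here */
	if (!dev)
		return 0;

	rtnl_lock();
	mdev = netdev_master_upper_dev_get(dev);
	if (mdev)
		ifindex = mdev->ifindex;	/* copy the value, keep no pointer */
	rtnl_unlock();

	dev_put(dev);				/* reference dropped after last use */
	return ifindex;
}
```

With this toy table, master_ifindex_of(NULL, 6) yields 7, and an unknown ifindex yields 0 without touching any pointer.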

Good point, thanks. I didn't expect those to change.

This can be fixed, provided that RCU and RTNL locks can be nested:

Well, yes and no. You can call rcu_read_{lock,unlock}() while under the
rtnl protection, but not the other way around.

	rcu_read_lock();
	if (oif) {
		dev = dev_get_by_index_rcu(net, oif);
		rtnl_lock();
		mdev = netdev_master_upper_dev_get(dev);
		rtnl_unlock();
	}

This is invalid: rtnl_lock() uses a mutex, so it can sleep and that's
forbidden inside an RCU critical section.

Obviously, that's bad. Mea culpa.

	if (sk->sk_bound_dev_if) {
		bdev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if);
	}

	addr_type = ipv6_addr_type(daddr);
	if ((__ipv6_addr_needs_scope_id(addr_type) && !oif) ||
	    (addr_type & IPV6_ADDR_MAPPED) ||
	    (oif && sk->sk_bound_dev_if && oif != sk->sk_bound_dev_if &&
	     !(mdev && sk->sk_bound_dev_if && bdev && mdev == bdev))) {
		rcu_read_unlock();
		return -EINVAL;
	}
	rcu_read_unlock();

But again, this is probably still not race-free (bdev might also disappear before
the mdev == bdev test), even though it passed fcnal-test.sh. There is also much
duplicated code, so your one-line solution is obviously far better. :-)

The real problem is choosing the right function for getting the master
device. In particular netdev_master_upper_dev_get() was a bad choice.
It forces you to take the rtnl, which is unnatural here and obliges you
to add extra code, while all this shouldn't be necessary in the first
place.
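
As a rough illustration of that point, the sketch below keeps the whole lookup inside one RCU read-side section and uses an RCU-safe master lookup, so no rtnl is needed. It is a userspace model, not the actual fix: the stubs merely stand in for the kernel's rcu_read_lock()/rcu_read_unlock(), dev_get_by_index_rcu() and netdev_master_upper_dev_get_rcu(), and the device table and the helper name oif_matches_bound_dev() are hypothetical. Note that it also checks the lookup result for NULL and compares ifindex values rather than keeping pointers past the unlock.

```c
#include <stdbool.h>
#include <stddef.h>

struct net;				/* opaque, unused by the stubs */

struct net_device {
	int ifindex;
	struct net_device *master;	/* upper (e.g. VRF) device, or NULL */
};

/* Hypothetical devices: eth1 (ifindex 2) is enslaved to vrf_red (3). */
static struct net_device vrf_red = { .ifindex = 3, .master = NULL };
static struct net_device eth1 = { .ifindex = 2, .master = &vrf_red };
static struct net_device eth2 = { .ifindex = 4, .master = NULL };
static struct net_device *devices[] = { &vrf_red, &eth1, &eth2 };

static int rcu_depth;			/* toy read-side nesting counter */
static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { rcu_depth--; }

/* Stand-in: no reference is taken, valid only inside the RCU section. */
static struct net_device *dev_get_by_index_rcu(struct net *net, int ifindex)
{
	size_t i;

	(void)net;
	for (i = 0; i < sizeof(devices) / sizeof(devices[0]); i++)
		if (devices[i]->ifindex == ifindex)
			return devices[i];
	return NULL;
}

/* Stand-in for the RCU variant of the master lookup. */
static struct net_device *
netdev_master_upper_dev_get_rcu(struct net_device *dev)
{
	return dev->master;
}

/*
 * True if sending through 'oif' is acceptable for a socket bound to
 * 'bound_if': either the same device, or 'oif' is enslaved to it.
 */
static bool oif_matches_bound_dev(struct net *net, int bound_if, int oif)
{
	struct net_device *dev, *master;
	bool ok = false;

	if (oif == bound_if)
		return true;

	rcu_read_lock();
	dev = dev_get_by_index_rcu(net, oif);
	if (dev) {			/* the lookup can fail */
		master = netdev_master_upper_dev_get_rcu(dev);
		ok = master && master->ifindex == bound_if;
	}
	rcu_read_unlock();		/* no device pointer is used after this */

	return ok;
}
```

With this toy table, a socket bound to the VRF (ifindex 3) may send through eth1 (ifindex 2), while a socket bound to eth2 (ifindex 4) may not.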

Thank you for the additional insight. I had little luck Googling
these.

I made blunder after blunder, but it was an insightful brainstorming exercise.
Good exercise for my little grey cells.

However, learning without making any errors appears to be simply a lot
of blunt memorising. :-/

It's good to be in an environment where one can learn from errors.

:-)

Regards,
Mirsad