Re: [RFC PATCH net-next v2 0/5] netns: allow to identify peer netns

From: Nicolas Dichtel
Date: Wed Sep 24 2014 - 12:27:40 EST


On 24/09/2014 18:01, Cong Wang wrote:
On Wed, Sep 24, 2014 at 2:23 AM, Nicolas Dichtel
<nicolas.dichtel@xxxxxxxxx> wrote:
On 23/09/2014 21:22, Cong Wang wrote:

On Tue, Sep 23, 2014 at 6:20 AM, Nicolas Dichtel
<nicolas.dichtel@xxxxxxxxx> wrote:


Here is a small screenshot to show how it can be used by userland:
$ ip netns add foo
$ ip netns del foo
$ ip netns
$ touch /var/run/netns/init_net
$ mount --bind /proc/1/ns/net /var/run/netns/init_net
$ ip netns add foo
$ ip netns
foo (id: 3)
init_net (id: 1)
$ ip netns exec foo ip netns
foo (id: 3)
init_net (id: 1)
$ ip netns exec foo ip link add ipip1 link-netnsid 1 type ipip remote 10.16.0.121 local 10.16.0.249
$ ip netns exec foo ip l ls ipip1
6: ipip1@NONE: <POINTOPOINT,NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default
    link/ipip 10.16.0.249 peer 10.16.0.121 link-netnsid 1

The parameter link-netnsid shows the netns in which the interface sends
and receives packets (and thus where the encapsulation addresses are set).
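
To illustrate how a userland tool could consume this attribute, here is a minimal sketch that extracts link-netnsid from the text output shown above. The function name parse_link_netnsid is hypothetical, and the output format is assumed to match the example in this mail; a real tool would more likely read the netlink attribute directly.

```python
import re


def parse_link_netnsid(ip_link_output):
    """Map each interface name to its link-netnsid (or None if absent).

    Assumes the text layout shown in the example above: a header line
    "<ifindex>: <name>[@<link>]: ..." followed by detail lines.
    """
    result = {}
    current = None
    for line in ip_link_output.splitlines():
        header = re.match(r"^(\d+):\s+([^:@\s]+)(?:@\S+)?:", line)
        if header:
            current = header.group(2)
            result[current] = None
            continue
        found = re.search(r"\blink-netnsid\s+(\d+)", line)
        if found and current is not None:
            result[current] = int(found.group(1))
    return result


sample = (
    "6: ipip1@NONE: <POINTOPOINT,NOARP> mtu 1480 qdisc noop state DOWN "
    "mode DEFAULT group default\n"
    "    link/ipip 10.16.0.249 peer 10.16.0.121 link-netnsid 1\n"
)
print(parse_link_netnsid(sample))
```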


So ipip1 is shown in netns foo but functions in netns init_net? Getting
the id of init_net in foo depends on your mount namespace: /var/run/netns/
may not be visible inside foo, and in that case link-netnsid is meaningless.
It is not your fault; network namespaces already rely heavily on the mount
namespace (sysfs needs to be remounted, otherwise you cannot create a
device with the same name).

On the other hand, what's the problem you are trying to solve? AFAIK,
the ifindex issue is purely in the output, IOW, the device still functions
correctly even though its link ifindex is not correct after moving to
another namespace. If not, it is a bug we need to fix.

The problem is explained here:
http://thread.gmane.org/gmane.linux.network/315933/focus=316064
and here:
http://thread.gmane.org/gmane.linux.kernel.containers/28301/focus=4239


Please summarize the discussion in your changelog instead of pointing
to a long thread.

The thread is long, but the mail in focus contains the information. Here it is, copied and pasted:
What I'm trying to solve is to have full info in netlink messages sent by the
kernel, thus being able to identify a peer netns (and this is close to what
the audit guys are trying to have). Theoretically, messages sent by the kernel
can be reused as is to get the same configuration. This is not the case with
x-netns devices. Here is an example, with ip tunnels:

$ ip netns add 1
$ ip link add ipip1 type ipip remote 10.16.0.121 local 10.16.0.249 dev eth0
$ ip -d link ls ipip1
8: ipip1@eth0: <POINTOPOINT,NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default
link/ipip 10.16.0.249 peer 10.16.0.121 promiscuity 0
ipip remote 10.16.0.121 local 10.16.0.249 dev eth0 ttl inherit pmtudisc
$ ip link set ipip1 netns 1
$ ip netns exec 1 ip -d link ls ipip1
8: ipip1@tunl0: <POINTOPOINT,NOARP,M-DOWN> mtu 1480 qdisc noop state DOWN mode DEFAULT group default
link/ipip 10.16.0.249 peer 10.16.0.121 promiscuity 0
ipip remote 10.16.0.121 local 10.16.0.249 dev tunl0 ttl inherit pmtudisc

Now the information returned by 'ip link' is wrong and incomplete:
- the link dev is now tunl0 instead of eth0, because we only got an ifindex
from the kernel without any netns information.
- the encapsulation addresses are not part of this netns, but the user doesn't
know that (again because netns info is missing). These IPv4 addresses may
also exist in this netns.
- it's not possible to create the same netdevice from this information.
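
The first point above can be sketched with a toy model. This is only an illustration: the per-netns tables, the ifindex values, and the helper resolve_link are hypothetical, not kernel data structures. It shows that a bare ifindex is resolved in whichever netns the viewer happens to be in, while an (ifindex, netns) pair names the intended device.

```python
# Toy model: every netns has its own ifindex -> name table.
# All values here are made up for illustration.
NETNS_TABLES = {
    "init_net": {1: "lo", 2: "eth0", 4: "tunl0"},
    "1": {1: "lo", 2: "tunl0"},
}


def resolve_link(tables, viewer_netns, link_ifindex, link_netns=None):
    """Resolve a link ifindex to a device name.

    Without link_netns, the lookup falls back to the viewer's own netns,
    which is the ambiguity described above.
    """
    netns = link_netns if link_netns is not None else viewer_netns
    return tables[netns].get(link_ifindex, "NONE")


# The tunnel was created in init_net on top of eth0 (ifindex 2 there),
# then moved to netns "1".
print(resolve_link(NETNS_TABLES, "1", 2))                         # wrong device
print(resolve_link(NETNS_TABLES, "1", 2, link_netns="init_net"))  # intended device
```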

Hope it's clearer now.


And clearly you missed my question above: how do you get the netns id
without sharing /var/run/netns/?

You can get an id only if you already have a "pointer" to this netns.
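
As a rough userland illustration, assuming the "pointer" is a file reference such as /proc/<pid>/ns/net or a bind mount under /var/run/netns, two references can be compared by device and inode number. The helper names below are hypothetical, and the sketch only matches names to a namespace; it does not query the id introduced by this series.

```python
import os


def same_namespace(path_a, path_b):
    """Return True if both paths refer to the same underlying object.

    For namespace files this means the same namespace, since they share
    device and inode numbers.
    """
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def find_names(target, netns_dir="/var/run/netns"):
    """List the entries of netns_dir that point at the same object as target.

    Returns an empty list when netns_dir is not visible, which mirrors
    the mount namespace concern raised earlier in the thread.
    """
    try:
        entries = sorted(os.listdir(netns_dir))
    except OSError:
        return []
    names = []
    for entry in entries:
        try:
            if same_namespace(target, os.path.join(netns_dir, entry)):
                names.append(entry)
        except OSError:
            continue
    return names


# Only meaningful on a Linux system where /proc is mounted.
if os.path.exists("/proc/self/ns/net"):
    print(find_names("/proc/self/ns/net"))
```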