Re: [PATCH net] net: vrf: remove redundant vrf neigh entry

From: David Ahern
Date: Mon Apr 15 2019 - 15:25:52 EST


On 4/10/19 9:39 PM, linmiaohe wrote:
>
> On 2019/3/22 23:50, David Ahern wrote:
>> On 3/22/19 3:10 PM, linmiaohe wrote:
>>> From: Miaohe Lin <linmiaohe@xxxxxxxxxx>
>>>
>>> When vrf->rth is created, it wouldn't change in his lifetime.And in
>>> func vrf_finish_output, we always create a neigh with ip_hdr(skb)->daddr
>>> as key because rth->rt_gateway is equal to 0. But I think we only need
>>> one vrf neigh entry because we strip the ethernet header and reset the
>>> dst_entry in vrf_process_v4_outbound.
>>> So I set rth->rt_gateway to loopback addr(It's ok to set any other
>>> valid ip address, just choose one.). And we would only create one vrf
>>> neigh entry. This helps saving some memory and improving the hitting
>>> rate of neigh lookup.
>>> If there is something I misunderstand, it's very nice of you to
>>> let me know. Thanks a lot.
>>>
>>> Signed-off-by: linmiaohe <linmiaohe@xxxxxxxxxx>
>>> ---
>>> drivers/net/vrf.c | 1 +
>>> 1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
>>> index 7c1430ed0244..2b0227fb8f53 100644
>>> --- a/drivers/net/vrf.c
>>> +++ b/drivers/net/vrf.c
>>> @@ -738,6 +738,7 @@ static int vrf_rtable_create(struct net_device *dev)
>>> return -ENOMEM;
>>>
>>> rth->dst.output = vrf_output;
>>> + rth->rt_gateway = htonl(INADDR_LOOPBACK);
>>>
>>> rcu_assign_pointer(vrf->rth, rth);
>>>
>>
>> Did you investigate how neighbor entries are getting created? The vrf
>> device has IFF_NOARP set, so neigh entries should not be created.
>>
>> .
>>
> Hi,David A.,I investigate how neighbor entries are getting created recently.
> But I can't find where neigh entries is not created when vrf device has
> IFF_NOARP set.
> So I add some printk info,and I ping the different host, here is the output:
>
> [root@localhost ~]# ip vrf exec vrf1 ping 10.0.0.2
> PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
> 64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=1.78 ms
> ^C
> --- 10.0.0.2 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 1.776/1.776/1.776/0.000 ms
> [root@localhost ~]# ip vrf exec vrf1 ping 11.0.0.2
> PING 11.0.0.2 (11.0.0.2) 56(84) bytes of data.
> 64 bytes from 11.0.0.2: icmp_seq=1 ttl=64 time=1.59 ms
> ^C
> --- 11.0.0.2 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 1.591/1.591/1.591/0.000 ms
> [root@localhost ~]# ip vrf exec vrf1 ping 11.0.0.3
> PING 11.0.0.3 (11.0.0.3) 56(84) bytes of data.
> ^C
> --- 11.0.0.3 ping statistics ---
> 1 packets transmitted, 0 received, 100% packet loss, time 0ms
>
> Apr 11 11:01:48 localhost kernel: [ 337.311270] VRF: IFF_NOARP is set
> Apr 11 11:01:48 localhost kernel: [ 337.311279] VRF: nexthop = 200000a
> Apr 11 11:01:48 localhost kernel: [ 337.311284] VRF: neigh = (null) after lookup
> Apr 11 11:01:48 localhost kernel: [ 337.311294] VRF: we create a neigh 000000001e8acd79
> Apr 11 11:01:51 localhost kernel: [ 340.026623] VRF: IFF_NOARP is set
> Apr 11 11:01:51 localhost kernel: [ 340.026627] VRF: nexthop = 200000b
> Apr 11 11:01:51 localhost kernel: [ 340.026631] VRF: neigh = (null) after lookup
> Apr 11 11:01:51 localhost kernel: [ 340.026637] VRF: we create a neigh 00000000a0ad96da
> Apr 11 11:01:56 localhost kernel: [ 345.157529] VRF: IFF_NOARP is set
> Apr 11 11:01:56 localhost kernel: [ 345.157539] VRF: nexthop = 300000b
> Apr 11 11:01:56 localhost kernel: [ 345.157544] VRF: neigh = (null) after lookup
> Apr 11 11:01:56 localhost kernel: [ 345.157556] VRF: we create a neigh 00000000a5167b56
>
> And here is the printk code:
>
> if (vrf_dev->flags & IFF_NOARP) {
> printk(KERN_ERR "VRF: IFF_NOARP is set\n");
> rth = rcu_dereference(vrf->rth);
> nexthop = (__force u32)rt_nexthop(rth, ip_hdr(skb)->daddr);
> printk(KERN_ERR "VRF: nexthop = %x\n", nexthop);
> neigh = __ipv4_neigh_lookup_noref(vrf_dev, nexthop);
> printk(KERN_ERR "VRF: neigh = %p after lookup\n", (void *)neigh);
> if (unlikely(!neigh)) {
> neigh = __neigh_create(&arp_tbl, &nexthop, vrf_dev, false);
> printk(KERN_ERR "VRF: we create a neigh %p\n", (void *)neigh);
> }
> }
>
> Could you please tell me if I was misunderstanding something again? It's very nice
> of you if you can figure me out that. Thanks a lot.I am looking forward to your reply.
>

In the above statements you are passing vrf_dev to neigh_create, so of
course it is creating an entry against that device. That should not be
happening. When dst->dev is a vrf device it is for local traffic only
(locally originated to a local address) and there are no neighbor
entries for that. Otherwise, it is an enslaved device and the
neigh_create is always done with it.