Re: [RFC PATCH v2 0/3] l3mdev icmp error route lookup fixes

From: Mathieu Desnoyers
Date: Mon Sep 21 2020 - 15:33:44 EST


----- On Sep 21, 2020, at 3:11 PM, David Ahern dsahern@xxxxxxxxx wrote:

> On 9/21/20 12:44 PM, Mathieu Desnoyers wrote:
>> ----- On Sep 21, 2020, at 2:36 PM, David Ahern dsahern@xxxxxxxxx wrote:
>>
>>> On 9/18/20 12:17 PM, Mathieu Desnoyers wrote:
>>>> Hi,
>>>>
>>>> Here is an updated series of fixes for ipv4 and ipv6 which ensure
>>>> the route lookup is performed on the right routing table in VRF
>>>> configurations when sending TTL expired icmp errors (useful for
>>>> traceroute).
>>>>
>>>> It includes tests for both ipv4 and ipv6.
>>>>
>>>> These fixes specifically address the code paths involved in
>>>> sending TTL expired icmp errors. As detailed in the individual commit
>>>> messages, those fixes do not address similar issues related to network
>>>> namespaces and unreachable / fragmentation needed messages, which appear
>>>> to use different code paths.
>>>>
>>>
>>> New selftests are failing:
>>> TEST: Ping received ICMP frag needed [FAIL]
>>>
>>> Both IPv4 and IPv6 versions are failing.
>>
>> Indeed, this situation is discussed in each patch commit message:
>>
>> ipv4:
>>
>> [ It has also been pointed out that a similar issue exists with
>> unreachable / fragmentation needed messages, which can be triggered by
>> changing the MTU of eth1 in r1 to 1400 and running:
>>
>> ip netns exec h1 ping -s 1450 -Mdo -c1 172.16.2.2
>>
>> Some investigation points to raw_icmp_error() and raw_err() as being
>> involved in this last scenario. The focus of this patch is TTL expired
>> ICMP messages, which go through icmp_route_lookup.
>> Investigation of failure modes related to raw_icmp_error() is beyond
>> this investigation's scope. ]
>>
>> ipv6:
>>
>> [ Testing shows that similar issues exist with ipv6 unreachable /
>> fragmentation needed messages. However, investigation of this
>> additional failure mode is beyond this investigation's scope. ]
>>
>> I do not have the time to investigate further unfortunately, so I
>> thought it best to post what I have.
>>
>
> the test setup is bad. You have r1 dropping the MTU in VRF red, but not
> telling VRF red how to send back the ICMP. e.g., for IPv4 add:
>
> ip -netns r1 ro add vrf red 172.16.1.0/24 dev blue
>
> do the same for v6.
>
> Also, I do not see a reason for r2; I suggest dropping it. What you are
> testing is icmp crossing VRF with route leaking, so there should not be
> a need for r2 which leads to asymmetrical routing (172.16.1.0 via r1 and
> the return via r2).

CCing Michael Jeanson, author of the selftests patch.

Thanks for your feedback,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com