Re: [net-next] ipv6: fix routing cache overflow for raw sockets

From: Andrea Mayer
Date: Fri Dec 23 2022 - 15:43:47 EST


Hi Jon,
please see below, thanks.

On Wed, 21 Dec 2022 08:48:11 +1100
Jonathan Maxwell <jmaxwell37@xxxxxxxxx> wrote:

> On Tue, Dec 20, 2022 at 11:35 PM Paolo Abeni <pabeni@xxxxxxxxxx> wrote:
> >
> > On Mon, 2022-12-19 at 10:48 +1100, Jon Maxwell wrote:
> > > Sending Ipv6 packets in a loop via a raw socket triggers an issue where a
> > > route is cloned by ip6_rt_cache_alloc() for each packet sent. This quickly
> > > consumes the Ipv6 max_size threshold which defaults to 4096 resulting in
> > > these warnings:
> > >
> > > [1] 99.187805] dst_alloc: 7728 callbacks suppressed
> > > [2] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
> > > .
> > > .
> > > [300] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
> >
> > If I read correctly, the maximum number of dst that the raw socket can
> > use this way is limited by the number of packets it allows via the
> > sndbuf limit, right?
> >
>
> Yes, but in my test sndbuf limit is never hit so it clones a route for
> every packet.
>
> e.g:
>
> output from C program sending 5000000 packets via a raw socket.
>
> ip raw: total num pkts 5000000
>
> # bpftrace -e 'kprobe:dst_alloc {@count[comm] = count()}'
> Attaching 1 probe...
>
> @count[a.out]: 5000009
>
> > Are other FLOWI_FLAG_KNOWN_NH users affected, too? e.g. nf_dup_ipv6,
> > ipvs, seg6?
> >
>
> Any call to ip6_pol_route(s) where no res.nh->fib_nh_gw_family is 0 can do it.
> But we have only seen this for raw sockets so far.
>

In the SRv6 subsystem, the seg6_lookup_nexthop() is used by some
cross-connecting behaviors such as End.X and End.DX6 to forward traffic to a
specified nexthop. SRv6 End.X/DX6 can specify an IPv6 DA (i.e., a nexthop)
different from the one carried by the IPv6 header. For this purpose,
seg6_lookup_nexthop() sets the FLOWI_FLAG_KNOWN_NH.

> > > [1] 99.187805] dst_alloc: 7728 callbacks suppressed
> > > [2] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
> > > .
> > > .
> > > [300] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.

I can reproduce the same warning messages reported by you, by instantiating an
End.X behavior whose nexthop is handled by a route for which there is no "via".
In this configuration, the ip6_pol_route() (called by seg6_lookup_nexthop())
triggers ip6_rt_cache_alloc() because i) the FLOWI_FLAG_KNOWN_NH is present ii)
and the res.nh->fib_nh_gw_family is 0 (as already pointed out).

> Regards
>
> Jon

Ciao,
Andrea