Re: [PATCH net] ipv6: avoid atomic fragment on GSO packets

From: Florian Westphal
Date: Sat Sep 30 2023 - 07:09:15 EST


Yan Zhai <yan@xxxxxxxxxxxxxx> wrote:
> GSO packets can contain a trailing segment that is smaller than
> gso_size. When examining the dst MTU for such a packet, if its gso_size
> is too large, then all segments would be fragmented. However, there is a
> good chance the trailing segment is actually smaller than both
> gso_size and the MTU, which leads to an "atomic fragment".
> RFC 8021 explicitly recommends deprecating this use case. An existing
> report from APNIC also shows that atomic fragments can be dropped
> unexpectedly along the path [1].
>
> Add an extra check in ip6_fragment to catch every case that could
> generate an atomic fragment. Skip the atomic fragment header if it is
> called on a packet no larger than the MTU.
>
> Link: https://www.potaroo.net/presentations/2022-03-01-ipv6-frag.pdf [1]
> Fixes: b210de4f8c97 ("net: ipv6: Validate GSO SKB before finish IPv6 processing")
> Reported-by: David Wragg <dwragg@xxxxxxxxxxxxxx>
> Signed-off-by: Yan Zhai <yan@xxxxxxxxxxxxxx>
> ---
> net/ipv6/ip6_output.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 951ba8089b5b..42f5f68a6e24 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -854,6 +854,13 @@ int ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
> __be32 frag_id;
> u8 *prevhdr, nexthdr = 0;
>
> + /* RFC-8021 recommended atomic fragments to be deprecated. Double check
> + * the actual packet size before fragment it.
> + */
> + mtu = ip6_skb_dst_mtu(skb);
> + if (unlikely(skb->len <= mtu))
> + return output(net, sk, skb);
> +

This helper is also called for skbs where IP6CB(skb)->frag_max_size
exceeds the MTU, so this check looks wrong to me.

The same applies to the dst_allfrag() check in __ip6_finish_output():
after this patch, it would be ignored.
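
To make the concern concrete, here is a minimal userspace sketch, not
kernel code. It assumes the caller sends a packet to the fragmenter when
any of three conditions holds: a non-GSO packet longer than the MTU, an
allfrag route, or a packet longer than a recorded frag_max_size. The
struct and function names (pkt_model, caller_wants_frag,
patched_fragmenter_splits) are hypothetical stand-ins, and the policy is
a simplified reading of the checks discussed here, not a copy of the
kernel source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the skb/dst state the caller inspects. */
struct pkt_model {
	unsigned int len;           /* total packet length */
	unsigned int frag_max_size; /* 0 if unset; models IP6CB(skb)->frag_max_size */
	bool is_gso;                /* models skb_is_gso() */
	bool allfrag;               /* models dst_allfrag() */
};

/* Assumed caller policy: should this packet be handed to the fragmenter? */
static bool caller_wants_frag(const struct pkt_model *p, unsigned int mtu)
{
	return (p->len > mtu && !p->is_gso) ||
	       p->allfrag ||
	       (p->frag_max_size != 0 && p->len > p->frag_max_size);
}

/* Models the early return proposed in the patch: anything that fits in
 * the MTU is sent as-is, whatever the reason it was handed over.
 */
static bool patched_fragmenter_splits(const struct pkt_model *p, unsigned int mtu)
{
	return p->len > mtu;
}

static void check_examples(void)
{
	const unsigned int mtu = 1500;
	struct pkt_model allfrag = { .len = 1200, .allfrag = true };
	struct pkt_model small_max = { .len = 1400, .frag_max_size = 1280 };
	struct pkt_model big = { .len = 3000 };

	/* The caller asked for fragmentation, but the early return skips it. */
	assert(caller_wants_frag(&allfrag, mtu));
	assert(!patched_fragmenter_splits(&allfrag, mtu));
	assert(caller_wants_frag(&small_max, mtu));
	assert(!patched_fragmenter_splits(&small_max, mtu));

	/* The plain oversized case is unaffected. */
	assert(caller_wants_frag(&big, mtu));
	assert(patched_fragmenter_splits(&big, mtu));
}
```

In this model the two policies disagree exactly for the allfrag and
frag_max_size cases, which is the behaviour change the early return
would introduce.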

I think you should consider first refactoring __ip6_finish_output to make
the existing checks more readable (e.g. handle gso vs. non-gso in separate
branches) and then adding the check for the last segment in
ip6_finish_output_gso_slowpath_drop().
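
As a rough illustration of what a last-segment check would look at, the
following userspace sketch computes the length of the full-sized and
trailing segments of a GSO packet and decides per segment whether it
exceeds the MTU. It is only a model: gso_model, full_seg_len,
last_seg_len and seg_needs_frag are hypothetical names, and it assumes
gso_size counts payload bytes per segment with headers added on top.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a GSO packet; not a kernel structure. */
struct gso_model {
	unsigned int hdr_len;     /* headers replicated on every segment */
	unsigned int payload_len; /* total payload, assumed > 0 */
	unsigned int gso_size;    /* payload bytes per full segment, assumed > 0 */
};

/* Length of every segment except possibly the last one. */
static unsigned int full_seg_len(const struct gso_model *g)
{
	return g->hdr_len + g->gso_size;
}

/* Length of the trailing segment, which may carry less than gso_size. */
static unsigned int last_seg_len(const struct gso_model *g)
{
	unsigned int tail = g->payload_len % g->gso_size;

	if (tail == 0)
		tail = g->gso_size; /* payload divides evenly: last segment is full */
	return g->hdr_len + tail;
}

/* Per-segment decision: only a segment larger than the MTU needs splitting. */
static bool seg_needs_frag(unsigned int seg_len, unsigned int mtu)
{
	return seg_len > mtu;
}
```

For example, with hdr_len = 60, payload_len = 4000, gso_size = 1500 and
an MTU of 1500, the full segments are 1560 bytes and need fragmenting,
while the trailing segment is 1060 bytes and fits. Adding a fragment
header to that trailing segment is what would produce an atomic
fragment.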

Alternatively you might be able to pass more info down to
ip6_fragment and move decisions there.

In any case we should make the same frag-or-no-frag decisions,
regardless of this being the orig skb or a segmented one,