Re: [PATCH net] ipv6: avoid atomic fragment on GSO packets

From: Willem de Bruijn
Date: Mon Oct 02 2023 - 02:53:10 EST


On Sat, Sep 30, 2023 at 1:09 PM Florian Westphal <fw@xxxxxxxxx> wrote:
>
> Yan Zhai <yan@xxxxxxxxxxxxxx> wrote:
> > GSO packets can contain a trailing segment that is smaller than
> > gso_size. When examining the dst MTU for such a packet, if its gso_size
> > is too large, then all segments would be fragmented. However, there is a
> > good chance the trailing segment has a smaller actual size than both
> > gso_size and the MTU, which leads to an "atomic fragment".
> > RFC 8021 explicitly recommends deprecating this use case. An existing
> > report from APNIC also shows that atomic fragments can be dropped
> > unexpectedly along the path [1].
> >
> > Add an extra check in ip6_fragment to catch every case that could
> > generate an atomic fragment. Skip the atomic header if it is called on a
> > packet no larger than the MTU.
> >
> > Link: https://www.potaroo.net/presentations/2022-03-01-ipv6-frag.pdf [1]
> > Fixes: b210de4f8c97 ("net: ipv6: Validate GSO SKB before finish IPv6 processing")
> > Reported-by: David Wragg <dwragg@xxxxxxxxxxxxxx>
> > Signed-off-by: Yan Zhai <yan@xxxxxxxxxxxxxx>
> > ---
> > net/ipv6/ip6_output.c | 8 +++++++-
> > 1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> > index 951ba8089b5b..42f5f68a6e24 100644
> > --- a/net/ipv6/ip6_output.c
> > +++ b/net/ipv6/ip6_output.c
> > @@ -854,6 +854,13 @@ int ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
> > __be32 frag_id;
> > u8 *prevhdr, nexthdr = 0;
> >
> > + /* RFC-8021 recommended atomic fragments to be deprecated. Double check
> > + * the actual packet size before fragment it.
> > + */
> > + mtu = ip6_skb_dst_mtu(skb);
> > + if (unlikely(skb->len <= mtu))
> > + return output(net, sk, skb);
> > +
>
> This helper is also called for skbs where IP6CB(skb)->frag_max_size
> exceeds the MTU, so this check looks wrong to me.
>
> Same remark for dst_allfrag() check in __ip6_finish_output(),
> after this patch, it would be ignored.
>
> I think you should consider first refactoring __ip6_finish_output to make
> the existing checks more readable (e.g. handle gso vs. non-gso in separate
> branches) and then adding the check for the last seg in
> ip6_finish_output_gso_slowpath_drop().
>
> Alternatively you might be able to pass more info down to
> ip6_fragment and move decisions there.
>
> In any case we should make the same frag-or-no-frag decisions,
> regardless of this being the orig skb or a segmented one.

To add to that: if this is a suggestion to update the algorithm to
match RFC 8021, not a fix for a bug in the current implementation,
then I think this should target net-next.

That will also make it easier to include the kind of refactoring that
Florian suggests.
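
To make the trailing-segment case from the commit message concrete, here
is a minimal userspace sketch in C. It is not kernel code: struct
gso_model and the helpers nr_segs(), seg_len() and seg_needs_frag() are
hypothetical names used only for illustration, and the model assumes
every segment carries the same hdr_len bytes of headers.

```c
#include <stdbool.h>

/* Hypothetical model of a GSO packet; none of these names exist in the kernel. */
struct gso_model {
	unsigned int hdr_len;	/* header bytes repeated in every segment */
	unsigned int payload;	/* total payload bytes, assumed > 0 */
	unsigned int gso_size;	/* payload bytes per full segment, assumed > 0 */
};

/* Number of segments the packet splits into (ceiling division). */
static unsigned int nr_segs(const struct gso_model *g)
{
	return (g->payload + g->gso_size - 1) / g->gso_size;
}

/* Length of segment i, headers included. Only the last one may be short. */
static unsigned int seg_len(const struct gso_model *g, unsigned int i)
{
	unsigned int n = nr_segs(g);
	unsigned int data = g->gso_size;

	if (i == n - 1)
		data = g->payload - (n - 1) * g->gso_size;

	return g->hdr_len + data;
}

/* Per-segment decision: fragment only if this segment exceeds the MTU. */
static bool seg_needs_frag(const struct gso_model *g, unsigned int i,
			   unsigned int mtu)
{
	return seg_len(g, i) > mtu;
}
```

With hdr_len = 60, payload = 3000, gso_size = 1400 and an MTU of 1400,
the two full segments are 1460 bytes each and exceed the MTU, while the
trailing segment is 260 bytes and fits. Sending that last segment
through the fragmentation path anyway is what produces the atomic
fragment.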