RE: [PATCH net] udp: ipv4: manipulate network header of NATed UDP GRO fraglist

From: Dongseok Yi
Date: Fri Jan 15 2021 - 06:21:54 EST


On 2021-01-15 19:51, Alexander Lobakin wrote:
> From: Steffen Klassert <steffen.klassert@xxxxxxxxxxx>
> Date: Fri, 15 Jan 2021 10:27:52 +0100
>
> > On Fri, Jan 15, 2021 at 05:55:22PM +0900, Dongseok Yi wrote:
> >> On 2021-01-15 17:12, Steffen Klassert wrote:
> >>> On Fri, Jan 15, 2021 at 02:58:24PM +0900, Dongseok Yi wrote:
> >>>> UDP/IP header of UDP GROed frag_skbs are not updated even after NAT
> >>>> forwarding. Only the header of head_skb from ip_finish_output_gso ->
> >>>> skb_gso_segment is updated but following frag_skbs are not updated.
> >>>>
> >>>> A call path skb_mac_gso_segment -> inet_gso_segment ->
> >>>> udp4_ufo_fragment -> __udp_gso_segment -> __udp_gso_segment_list
> >>>> does not try to update UDP/IP header of the segment list.
> >>>
> >>> We still need to find out why it works for Alexander, but not for you.
> >>> Different usecases?
> >>
> >> This patch is not for
> >> https://lore.kernel.org/patchwork/patch/1364544/
> >> Alexander might want to call udp_gro_receive_segment even when
> >> !sk and ~NETIF_F_GRO_FRAGLIST.
> >
> > Yes, I know. But he said that fraglist GRO + NAT works for him.
> > I want to find out why it works for him, but not for you.
>
> I found that it worked for me because I advertised fraglist GSO
> support in my driver (and added actual support for xmitting
> fraglists). If so, kernel won't resegment GSO into a list of
> plain packets, so no __udp_gso_segment_list() will be called.
>
> I think it will break if I disable fraglist GSO feature through
> Ethtool, so I could test your patches.

Thanks for the reply. In my case I enabled NETIF_F_GRO_FRAGLIST on
my driver. It expected that NAT done on each skb of the forwarded
fraglist.

>
> >>>
> >>> I would not like to add this to a generic codepath. I think we can
> >>> relatively easy copy the full headers in skb_segment_list().
> >>
> >> I tried to copy the full headers with the similar approach, but it
> >> copies length too. Can we keep the length of each skb of the fraglist?
> >
> > Ah yes, good point.
> >
> > Then maybe you can move your approach into __udp_gso_segment_list()
> > so that we dont touch generic code.
> >

Okay, I will move it into __udp_gso_segment_list() on v3.

> >>
> >>>
> >>> I think about something like the (completely untested) patch below:
> >>>
> >>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> >>> index f62cae3f75d8..63ae7f79fad7 100644
> >>> --- a/net/core/skbuff.c
> >>> +++ b/net/core/skbuff.c
> >>> @@ -3651,13 +3651,14 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> >>> unsigned int offset)
> >>> {
> >>> struct sk_buff *list_skb = skb_shinfo(skb)->frag_list;
> >>> + unsigned int doffset = skb->data - skb_mac_header(skb);
> >>> unsigned int tnl_hlen = skb_tnl_header_len(skb);
> >>> unsigned int delta_truesize = 0;
> >>> unsigned int delta_len = 0;
> >>> struct sk_buff *tail = NULL;
> >>> struct sk_buff *nskb;
> >>>
> >>> - skb_push(skb, -skb_network_offset(skb) + offset);
> >>> + skb_push(skb, doffset);
> >>>
> >>> skb_shinfo(skb)->frag_list = NULL;
> >>>
> >>> @@ -3675,7 +3676,7 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> >>> delta_len += nskb->len;
> >>> delta_truesize += nskb->truesize;
> >>>
> >>> - skb_push(nskb, -skb_network_offset(nskb) + offset);
> >>> + skb_push(nskb, doffset);
> >>>
> >>> skb_release_head_state(nskb);
> >>> __copy_skb_header(nskb, skb);
> >>> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> >>> index ff39e94781bf..1181398378b8 100644
> >>> --- a/net/ipv4/udp_offload.c
> >>> +++ b/net/ipv4/udp_offload.c
> >>> @@ -190,9 +190,22 @@ EXPORT_SYMBOL(skb_udp_tunnel_segment);
> >>> static struct sk_buff *__udp_gso_segment_list(struct sk_buff *skb,
> >>> netdev_features_t features)
> >>> {
> >>> + struct sk_buff *list_skb = skb_shinfo(skb)->frag_list;
> >>> unsigned int mss = skb_shinfo(skb)->gso_size;
> >>> + unsigned int offset;
> >>>
> >>> - skb = skb_segment_list(skb, features, skb_mac_header_len(skb));
> >>> + skb_headers_offset_update(list_skb, skb_headroom(list_skb) - skb_headroom(skb));
> >>> +
> >>> + /* Check for header changes and copy the full header in that case. */
> >>> + if ((udp_hdr(skb)->dest == udp_hdr(list_skb)->dest) &&
> >>> + (udp_hdr(skb)->source == udp_hdr(list_skb)->source) &&
> >>> + (ip_hdr(skb)->daddr == ip_hdr(list_skb)->daddr) &&
> >>> + (ip_hdr(skb)->saddr == ip_hdr(list_skb)->saddr))
> >>> + offset = skb_mac_header_len(skb);
> >>> + else
> >>> + offset = skb->data - skb_mac_header(skb);
> >>> +
> >>> + skb = skb_segment_list(skb, features, offset);
> >>> if (IS_ERR(skb))
> >>> return skb;
> >>>
> >>>
> >>> After that you can apply the CSUM magic in __udp_gso_segment_list().
>
> I'll test and let you know if it works. If doesn't, I think I'll be
> able to get a working one based on this.
>
> >> Sorry, I don't know CSUM magic well. Is it used for checksum
> >> incremental update too?
> >
> > With that I meant the checksum updating you did in your patch.

Ah, I see.

>
> Thanks,
> Al