Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

From: Ian Kumlien
Date: Mon Jun 26 2023 - 13:25:10 EST


On Mon, Jun 26, 2023 at 7:15 PM Alexander Lobakin
<aleksander.lobakin@xxxxxxxxx> wrote:
>
> From: Ian Kumlien <ian.kumlien@xxxxxxxxx>
> Date: Mon, 26 Jun 2023 16:25:24 +0200
>
> > On Mon, Jun 26, 2023 at 4:18 PM Alexander Lobakin
> > <aleksander.lobakin@xxxxxxxxx> wrote:
> >>
> >> From: Ian Kumlien <ian.kumlien@xxxxxxxxx>
> >> Date: Sun, 25 Jun 2023 12:59:54 +0200
> >>
> >>> It could actually be that it's related to: rx-gro-list but
> >>> rx-udp-gro-forwarding makes it trigger quicker... I have yet to
> >>> trigger it on igb
> >>
> >> Hi, the rx-udp-gro-forwarding author here.
> >>
> >> (good thing this appeared on IWL, which I read from time to time, but
> >> please Cc netdev next time)
> >> (thus +Cc Jakub, Eric, and netdev)
> >
> > Well, two things, it seems like rx-udp-gro-forwarding accelerates it
> > but the issue is actually in: rx-gro-list
>
> Do you enable them simultaneously? I remember, when I was adding
> gro-fwd, it was working (and working well) as follows:
>
> 1. gro-fwd on, gro-list off: gro-fwd
> 2. gro-fwd off, gro-list on: gro-list
> 3. gro-fwd on, gro-list on: gro-list
>
> Note that their receive paths are independent[0]: skb_gro_receive_list()
> vs skb_gro_receive(), thus I'm still not really sure how gro-fwd can
> trigger gro-list's bug.

Neither am I... I have enabled SOL (serial over LAN) via ipmitool now
and will try to get a better capture.
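
For reference, the three cases listed above can be written down as a
small Python sketch, so the state of a box can be checked before and
after a crash. This is only an illustration of the precedence described
in the quoted mail, not kernel code: the helper names parse_features()
and udp_gro_path() are made up, the parser assumes the usual
"name: on|off" lines printed by `ethtool -k <dev>`, and the "both off"
result is my assumption since that case is not covered above.

```python
def parse_features(ethtool_k_output):
    """Parse the text printed by `ethtool -k <dev>` into {name: enabled}.

    Assumes the usual "feature-name: on|off [fixed]" line format;
    lines without an on/off state (e.g. the header) are skipped.
    """
    features = {}
    for line in ethtool_k_output.splitlines():
        name, sep, state = line.strip().partition(":")
        if not sep:
            continue
        words = state.split()
        if words and words[0] in ("on", "off"):
            features[name.strip()] = words[0] == "on"
    return features


def udp_gro_path(gro_fwd, gro_list):
    """Return which UDP GRO path is used, per the three cases above.

    gro-list takes precedence whenever it is enabled.
    """
    if gro_list:
        return "gro-list"
    if gro_fwd:
        return "gro-fwd"
    # Assumption: with both features off, neither path is taken.
    return "none"


if __name__ == "__main__":
    sample = (
        "Features for eth0:\n"
        "rx-gro-list: on\n"
        "rx-udp-gro-forwarding: on\n"
    )
    feats = parse_features(sample)
    print(udp_gro_path(feats["rx-udp-gro-forwarding"], feats["rx-gro-list"]))
```

Feeding it the saved `ethtool -k` output of each interface in the
forwarding path should show which of the two receive paths is active.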

> > And since I've only been able to trigger it on ixgbe, I thought it
> > might be a driver issue =)
>
> Your screenshot says "__udp_gso_segment", which means that the
> problematic UDP GRO packet hits the Tx path. Rx is in general
> driver-independent. Tx has separate netdev feature ("tx-gso-list"), but
> it's not supported by any driver, just software stack. It might be that
> your traffic goes through a bridge or tunnel or anything else that
> triggers GSO and software segmentation then booms for some reason.
> BTW, __udp_gso_segment() is a one-liner when the passed skb was
> gro-listed[1], so having it in the bug splat could mean the skb didn't
> take that route. But it's hard to say without a full stack trace.

I do have a UDP tunnel (WireGuard); I will disable it.

Beyond that there are some bridges and veth interfaces, but let's wait
for a full trace.

> [...]
>
> >>>> But correlating that with the source is beyond me. It could be generic,
> >>>> but I thought I'd send it to you first since it's part of the Red Hat
> >>>> guide to speeding up UDP traffic
> >> [0]
> >> https://lore.kernel.org/netdev/f83d79d6-f8d7-a229-941a-7d7427975160@xxxxxxxxxx
> >>
> >> Thanks,
> >> Olek
>
> [0]
> https://elixir.bootlin.com/linux/latest/source/net/ipv4/udp_offload.c#L518
> [1]
> https://elixir.bootlin.com/linux/latest/source/net/ipv4/udp_offload.c#L277
>
> Thanks,
> Olek