Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

From: Ian Kumlien
Date: Tue Jun 27 2023 - 08:31:33 EST


On Tue, Jun 27, 2023 at 11:19 AM Paolo Abeni <pabeni@xxxxxxxxxx> wrote:
>
> On Mon, 2023-06-26 at 20:59 +0200, Ian Kumlien wrote:
> > On Mon, Jun 26, 2023 at 8:20 PM Ian Kumlien <ian.kumlien@xxxxxxxxx> wrote:
> > >
> > > Never mind, I think I found it. I will loop this thing until I have a
> > > proper trace....
> >
> > Still some question marks, but much better
>
> Thanks!
> >
> > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > [ 62.624003] BUG: kernel NULL pointer dereference, address: 00000000000000c0
> > [ 62.631083] #PF: supervisor read access in kernel mode
> > [ 62.636312] #PF: error_code(0x0000) - not-present page
> > [ 62.641541] PGD 0 P4D 0
> > [ 62.644174] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [ 62.648629] CPU: 1 PID: 913 Comm: napi/eno2-79 Not tainted 6.4.0 #364
> > [ 62.655162] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F, BIOS 1.7a 10/13/2022
> > [ 62.663344] RIP: 0010:__udp_gso_segment (./include/linux/skbuff.h:2858 ./include/linux/udp.h:23 net/ipv4/udp_offload.c:228 net/ipv4/udp_offload.c:261 net/ipv4/udp_offload.c:277)
>
> So it's faulting here:
>
> static struct sk_buff *__udpv4_gso_segment_list_csum(struct sk_buff *segs)
> {
> 	struct sk_buff *seg;
> 	struct udphdr *uh, *uh2;
> 	struct iphdr *iph, *iph2;
>
> 	seg = segs;
> 	uh = udp_hdr(seg);
> 	iph = ip_hdr(seg);
>
> 	if ((udp_hdr(seg)->dest == udp_hdr(seg->next)->dest) &&
> // ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> The GSO segment has been assembled by skb_gro_receive_list().
> I guess seg->next is NULL, which is somewhat unexpected, as
> napi_gro_complete() clears the gso_size when it sends a single frame
> up the stack.
>
> On the flip side, AFAICS, nothing prevents the stack from changing the
> aggregated packet layout (e.g. pulling data and/or linearizing the
> skb).
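
To make sure I follow the theory: a minimal userspace sketch of that
failure mode, assuming a toy "struct seg" in place of struct sk_buff
(every name below is hypothetical, none of it is kernel API). The
comparison against seg->next is only safe when the list holds at least
two elements, so the sketch checks that first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct sk_buff; hypothetical, not the kernel layout. */
struct seg {
	struct seg *next;
	uint16_t dest;	/* stand-in for udp_hdr(seg)->dest */
};

/*
 * Same shape as the faulting comparison, but it refuses to touch
 * segs->next when the list holds a single element.
 * Returns true only if a second segment exists and the ports match.
 */
static bool first_two_dests_match(const struct seg *segs)
{
	if (!segs || !segs->next)
		return false;	/* an unguarded read of segs->next->dest would fault here */

	return segs->dest == segs->next->dest;
}
```

With a one-element list the guard returns false instead of reading
through a NULL next pointer, which is the read I assume the oops above
is hitting.
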
>
> In any case this looks more related to rx-gro-list than
> rx-udp-gro-forwarding. I understand you have both features enabled
> in your env?
>
> Side question: do you have any non-trivial nf/br filter rules?
>
> The following could possibly validate the above and avoid the issue,
> but it's a bit papering over it. Could you please try it in your env?

Will do as soon as I get home =)

> Thanks!
>
> Paolo
> ---
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 6c5915efbc17..75531686bfdf 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -4319,6 +4319,9 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
>
>  	skb->prev = tail;
>
> +	if (WARN_ON_ONCE(!skb->next))
> +		goto err_linearize;
> +
>  	if (skb_needs_linearize(skb, features) &&
>  	    __skb_linearize(skb))
>  		goto err_linearize;
>
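
As I read it, the patch bails out of skb_segment_list() when there is
no second segment and warns only the first time. A rough userspace
model of that control flow, purely as a sketch: warn_on_once() and
segment_list_model() are hypothetical helpers, and unlike the kernel
macro the "once" state here is shared by all callers rather than kept
per call site.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for struct sk_buff; hypothetical, not the kernel layout. */
struct seg {
	struct seg *next;
};

static int warn_count;	/* how many times the warning was actually printed */

/*
 * Userspace approximation of WARN_ON_ONCE(): print on the first hit
 * only, but always hand the condition back to the caller.
 */
static bool warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Returns the list head, or NULL when there is nothing to segment. */
static struct seg *segment_list_model(struct seg *skb)
{
	if (warn_on_once(!skb->next, "segment list has a single element"))
		return NULL;	/* models the "goto err_linearize" exit */

	return skb;
}
```

So a single-element list should take the error exit every time, while
the log only shows the warning once.
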