Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

From: Paolo Abeni
Date: Tue Jun 27 2023 - 05:20:04 EST


On Mon, 2023-06-26 at 20:59 +0200, Ian Kumlien wrote:
> On Mon, Jun 26, 2023 at 8:20 PM Ian Kumlien <ian.kumlien@xxxxxxxxx> wrote:
> >
> > Nevermind, I think I found it, I will loop this thing until I have a
> > proper trace....
>
> Still some question marks, but much better

Thanks!
>
> cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> [ 62.624003] BUG: kernel NULL pointer dereference, address: 00000000000000c0
> [ 62.631083] #PF: supervisor read access in kernel mode
> [ 62.636312] #PF: error_code(0x0000) - not-present page
> [ 62.641541] PGD 0 P4D 0
> [ 62.644174] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [ 62.648629] CPU: 1 PID: 913 Comm: napi/eno2-79 Not tainted 6.4.0 #364
> [ 62.655162] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F, BIOS 1.7a 10/13/2022
> [ 62.663344] RIP: 0010:__udp_gso_segment (./include/linux/skbuff.h:2858 ./include/linux/udp.h:23 net/ipv4/udp_offload.c:228 net/ipv4/udp_offload.c:261 net/ipv4/udp_offload.c:277)

So it's faulting here:

static struct sk_buff *__udpv4_gso_segment_list_csum(struct sk_buff *segs)
{
	struct sk_buff *seg;
	struct udphdr *uh, *uh2;
	struct iphdr *iph, *iph2;

	seg = segs;
	uh = udp_hdr(seg);
	iph = ip_hdr(seg);

	if ((udp_hdr(seg)->dest == udp_hdr(seg->next)->dest) &&
	// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The GSO packet has been assembled by skb_gro_receive_list().
I guess seg->next is NULL, which is somewhat unexpected, as
napi_gro_complete() clears gso_size when it sends a single frame
up the stack.
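
To make the suspected failure mode concrete, here is a minimal userspace
sketch of the comparison with a guard for a missing second segment. The
struct fake_skb type and the first_two_share_dest() helper are
hypothetical stand-ins made up for illustration; they are not kernel
code and not the proposed fix.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sk_buff: only the fields needed here. */
struct fake_skb {
	struct fake_skb *next;	/* next segment in the list, NULL if none */
	uint16_t dest;		/* stand-in for udp_hdr(skb)->dest */
};

/*
 * Returns 1 if the first two segments share a destination port, 0 if they
 * differ, and -1 if there is no second segment to compare against.
 * Without the NULL check, segs->next->dest would dereference a NULL
 * pointer, which is the kind of access suspected in the oops above.
 */
int first_two_share_dest(const struct fake_skb *segs)
{
	if (!segs || !segs->next)
		return -1;

	return segs->dest == segs->next->dest;
}
```

A single-element list takes the -1 path instead of faulting, which is
the case the unguarded comparison does not handle.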

On the flip side, AFAICS, nothing prevents the stack from changing the
aggregated packet layout (e.g. pulling data and/or linearizing the
skb).

In any case, this looks more related to rx-gro-list than to
rx-udp-gro-forwarding. I understand you have both features enabled in
your env?

Side question: do you have any non-trivial nf/br filter rules?

The following could possibly validate the above and avoid the issue,
though it only papers over the problem. Could you please try it in
your env?

Thanks!

Paolo
---
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6c5915efbc17..75531686bfdf 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4319,6 +4319,9 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,

 	skb->prev = tail;
 
+	if (WARN_ON_ONCE(!skb->next))
+		goto err_linearize;
+
 	if (skb_needs_linearize(skb, features) &&
 	    __skb_linearize(skb))
 		goto err_linearize;