Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

From: Paolo Abeni
Date: Mon Jun 26 2023 - 13:57:10 EST


On Mon, 2023-06-26 at 19:30 +0200, Ian Kumlien wrote:
> There, that didn't take long, even with wireguard disabled
>
> [14079.678380] BUG: kernel NULL pointer dereference, address: 00000000000000c0
> [14079.685456] #PF: supervisor read access in kernel mode
> [14079.690686] #PF: error_code(0x0000) - not-present page
> [14079.695915] PGD 0 P4D 0
> [14079.698540] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [14079.702996] CPU: 11 PID: 891 Comm: napi/eno2-80 Not tainted 6.4.0 #360
> [14079.709614] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
> BIOS 1.7a 10/13/2022
> [14079.717796] RIP: 0010:__udp_gso_segment+0x346/0x4f0
> [14079.722778] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff
> 49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2
> 00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5
> 48 8d
> [14079.741645] RSP: 0018:ffffa83643a4f818 EFLAGS: 00010246
> [14079.746966] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
> [14079.754195] RDX: ffffa2ad1403b000 RSI: 0000000000000028 RDI: ffffa2afc9d302d4
> [14079.761422] RBP: ffffa2ad1403b000 R08: 0000000000000022 R09: 00002000001558c9
> [14079.768650] R10: 0000000000000000 R11: ffffa2b02fcea888 R12: 00000000000000e2
> [14079.775879] R13: ffffa2afc9d30200 R14: ffffa2afc9d30200 R15: 00002000001558c9
> [14079.783106] FS: 0000000000000000(0000) GS:ffffa2b02fcc0000(0000)
> knlGS:0000000000000000
> [14079.791305] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [14079.797162] CR2: 00000000000000c0 CR3: 0000000151ff4000 CR4: 00000000003526e0
> [14079.804408] Call Trace:
> [14079.806961] <TASK>
> [14079.809170] ? __die+0x1a/0x60
> [14079.812340] ? page_fault_oops+0x158/0x440
> [14079.816551] ? ip6_route_output_flags+0xe3/0x160
> [14079.821284] ? exc_page_fault+0x3f4/0x820
> [14079.825408] ? update_load_avg+0x77/0x710
> [14079.829534] ? asm_exc_page_fault+0x22/0x30
> [14079.833836] ? __udp_gso_segment+0x346/0x4f0
> [14079.838218] ? __udp_gso_segment+0x2fa/0x4f0
> [14079.842600] ? _raw_spin_unlock_irqrestore+0x16/0x30
> [14079.847679] ? try_to_wake_up+0x8e/0x5a0
> [14079.851713] inet_gso_segment+0x150/0x3c0
> [14079.855827] ? vhost_poll_wakeup+0x31/0x40
> [14079.860032] skb_mac_gso_segment+0x9b/0x110
> [14079.864331] __skb_gso_segment+0xae/0x160
> [14079.868455] ? netif_skb_features+0x144/0x290
> [14079.872928] validate_xmit_skb+0x167/0x370
> [14079.877139] validate_xmit_skb_list+0x43/0x70
> [14079.881612] sch_direct_xmit+0x267/0x380
> [14079.885641] __qdisc_run+0x140/0x590
> [14079.889324] __dev_queue_xmit+0x44d/0xba0
> [14079.893450] ? nf_hook_slow+0x3c/0xb0
> [14079.897229] br_dev_queue_push_xmit+0xb2/0x1c0
> [14079.901788] maybe_deliver+0xa9/0x100
> [14079.905564] br_flood+0x8a/0x180
> [14079.908903] br_handle_frame_finish+0x31f/0x5b0
> [14079.913547] br_handle_frame+0x28f/0x3a0
> [14079.917585] ? ipv6_find_hdr+0x1f0/0x3e0
> [14079.921622] ? br_handle_local_finish+0x20/0x20
> [14079.926267] __netif_receive_skb_core.constprop.0+0x4c5/0xc90
> [14079.932125] ? br_handle_frame_finish+0x5b0/0x5b0
> [14079.936946] ? ___slab_alloc+0x4bf/0xaf0
> [14079.940986] __netif_receive_skb_list_core+0x107/0x250
> [14079.946240] netif_receive_skb_list_internal+0x194/0x2b0
> [14079.951660] ? napi_gro_flush+0x97/0xf0
> [14079.955604] napi_complete_done+0x69/0x180
> [14079.959808] ixgbe_poll+0xe10/0x12e0
> [14079.963506] __napi_poll+0x26/0x1b0
> [14079.967106] napi_threaded_poll+0x232/0x250
> [14079.971405] ? __napi_poll+0x1b0/0x1b0
> [14079.975260] kthread+0xee/0x120
> [14079.978510] ? kthread_complete_and_exit+0x20/0x20
> [14079.983415] ret_from_fork+0x22/0x30
> [14079.987102] </TASK>
> [14079.989395] Modules linked in: chaoskey
> [14079.993347] CR2: 00000000000000c0
> [14079.996773] ---[ end trace 0000000000000000 ]---
> [14080.018013] pstore: backend (erst) writing error (-28)
> [14080.023274] RIP: 0010:__udp_gso_segment+0x346/0x4f0
> [14080.028264] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff
> 49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2
> 00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5
> 48 8d
> [14080.047181] RSP: 0018:ffffa83643a4f818 EFLAGS: 00010246
> [14080.052522] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
> [14080.059765] RDX: ffffa2ad1403b000 RSI: 0000000000000028 RDI: ffffa2afc9d302d4
> [14080.067012] RBP: ffffa2ad1403b000 R08: 0000000000000022 R09: 00002000001558c9
> [14080.074257] R10: 0000000000000000 R11: ffffa2b02fcea888 R12: 00000000000000e2
> [14080.081502] R13: ffffa2afc9d30200 R14: ffffa2afc9d30200 R15: 00002000001558c9
> [14080.088746] FS: 0000000000000000(0000) GS:ffffa2b02fcc0000(0000)
> knlGS:0000000000000000
> [14080.096964] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [14080.102823] CR2: 00000000000000c0 CR3: 0000000151ff4000 CR4: 00000000003526e0
> [14080.110067] Kernel panic - not syncing: Fatal exception in interrupt
> [14080.325501] Kernel Offset: 0x12600000 from 0xffffffff81000000
> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [14080.353129] ---[ end Kernel panic - not syncing: Fatal exception in
> interrupt ]---

Could you please provide a decoded stack trace?

# in your git tree:
cat <stacktrace file> | ./scripts/decode_stacktrace.sh vmlinux
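
If the trace is copied back out of a mail like the one quoted above, each line may still carry a "> " quote prefix. A minimal sketch of stripping that prefix before decoding follows. It assumes the oops was saved to a hypothetical file oops.txt, that vmlinux was built with debug info, and that the commands run from the top of the kernel tree that produced the running kernel. The helper names strip_quote and decode_oops are illustrative only and not part of the kernel scripts.

```shell
#!/bin/sh
# Hypothetical helper: remove a leading ">" mail-quote marker (and the one
# space after it, if present) from every line read on stdin.
strip_quote() {
    sed 's/^> \{0,1\}//'
}

# Hypothetical wrapper: $1 is the saved oops text, $2 is the vmlinux image.
# decode_stacktrace.sh reads the trace on stdin, as in the command above.
decode_oops() {
    strip_quote < "$1" | ./scripts/decode_stacktrace.sh "$2"
}
```

Usage would be something like "decode_oops oops.txt vmlinux". To resolve only the faulting address, ./scripts/faddr2line vmlinux __udp_gso_segment+0x346/0x4f0 should also work, under the same debug-info assumption.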

Thanks!

Paolo