Re: Kernel panic in netif_rx_internal after v6 pings between netns

From: Jakub Kicinski
Date: Mon Jan 22 2024 - 13:06:48 EST


On Sat, 20 Jan 2024 18:53:50 +0100 Matthieu Baerts wrote:
> FYI, I managed to find a commit that seems to be causing the issue:
>
> 8e791f7eba4c ("x86/kprobes: Drop removed INT3 handling code")
>
> It is not clear why, but if I revert it, I can no longer reproduce the
> issue. I reported the issue to the patch's author and the x86's ML:
>
> https://lore.kernel.org/r/06cb540e-34ff-4dcd-b936-19d4d14378c9@xxxxxxxxxx
>
> Thank you again for your help.

Hi Matthieu!

Somewhat related: what do you currently do to ignore known crashes?
I was seeing a lot of:
https://netdev-2.bots.linux.dev/vmksft-net-mp/results/431181/vm-crash-thr0-2

So I hacked up this function to filter that crash out of the NIPA CI results:
https://github.com/kuba-moo/nipa/blob/master/contest/remote/lib/vm.py#L50
It tries to get the first 5 function names from the stack to form
a "fingerprint". But I seem to recall a discussion at LPC's testing
track saying that there are existing solutions for generating fingerprints.
Are you aware of any?
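
To illustrate the idea, here is a minimal sketch of such a fingerprint.
It is not the code from vm.py: the names FRAME_RE and crash_fingerprint,
the ":" separator, and the choice to skip "?" (unreliable) frames are all
assumptions made for this example.

```python
import re

# Matches a stack frame line such as
#   "[   12.345680]  netif_rx_internal+0x42/0x130"
# Group 1 is the optional "? " marker the kernel prints for unreliable
# frames, group 2 is the function name.
FRAME_RE = re.compile(
    r"^(?:\[\s*\d+\.\d+\]\s*)?\s*(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def crash_fingerprint(log_lines, depth=5):
    """Join the first `depth` function names found after "Call Trace:"."""
    names = []
    in_trace = False
    for line in log_lines:
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        m = FRAME_RE.match(line)
        if not m:
            continue  # not a frame line, e.g. "<TASK>"
        if m.group(1):
            continue  # assumption: ignore "?" frames, they are unreliable
        names.append(m.group(2))
        if len(names) == depth:
            break
    return ":".join(names)
```

A CI could then compare the result against a set of known fingerprints
and only flag crashes that are not in it.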

(FWIW the crash from above seems to be gone on the latest linux.git,
this night's CI runs are crash-free.)