On Wed, Dec 30, 2020 at 07:53:17PM +0100, Michael Walle wrote:
> The Intel i210 doesn't work if the Expansion ROM BAR overlaps with
> another BAR. Networking won't work at all and once a packet is sent the
> netdev watchdog will bite:
1) Is this a regression? It sounds like you don't know for sure
because earlier kernels don't support your platform.
2) Can you open a bugzilla at https://bugzilla.kernel.org and attach
the complete dmesg and "sudo lspci -vv" output? I want to see whether
Linux is assigning something incorrectly or this is a consequence of
some firmware initialization.
3) If the Intel i210 is defective in how it handles an Expansion ROM
that overlaps another BAR, a quirk might be the right fix. But my
guess is the device is working correctly per spec and there's
something wrong in how firmware/Linux is assigning things. That would
mean we need a more generic fix that's not a quirk and not tied to the
Intel i210.
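One rough way to look for such an overlap in the "sudo lspci -vv" output
is sketched below. It assumes the usual "Region N: Memory at ADDR ...
[size=...]" and "Expansion ROM at ADDR ... [size=...]" line shapes and
expects the text for a single device. parse_bars() and find_overlaps()
are hypothetical helper names, not part of any existing tool, and I/O
port BARs are deliberately skipped because they live in a separate
address space.

```python
import re

# Memory BAR and Expansion ROM lines as printed by `lspci -vv`, e.g.
#   Region 0: Memory at 8240000000 (32-bit, non-prefetchable) [size=1M]
#   Expansion ROM at 8240000000 [disabled] [size=1M]
# Unassigned or ignored BARs have no hex address and are not matched.
BAR_LINE = re.compile(
    r"(?P<name>Region \d+|Expansion ROM)(?:: Memory)? at (?P<addr>[0-9a-f]+)\b"
    r".*\[size=(?P<size>\d+)(?P<unit>[KMG]?)\]"
)
UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_bars(device_text):
    """Return (name, start, end) for each assigned memory BAR, end exclusive."""
    bars = []
    for line in device_text.splitlines():
        m = BAR_LINE.search(line)
        if m:
            start = int(m.group("addr"), 16)
            size = int(m.group("size")) * UNITS[m.group("unit")]
            bars.append((m.group("name"), start, start + size))
    return bars


def find_overlaps(bars):
    """Return the name pairs of BARs whose address ranges intersect."""
    hits = []
    for i, (name_a, start_a, end_a) in enumerate(bars):
        for name_b, start_b, end_b in bars[i + 1:]:
            # Two half-open ranges intersect when each starts before the other ends.
            if start_a < end_b and start_b < end_a:
                hits.append((name_a, name_b))
    return hits
```

Feeding it one device's section of the lspci output and printing
find_overlaps(parse_bars(text)) lists the conflicting pairs. An empty
list means no assigned memory ranges of that device intersect.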
> [ 89.059374] ------------[ cut here ]------------
> [ 89.064019] NETDEV WATCHDOG: enP2p1s0 (igb): transmit queue 0 timed out
> [ 89.070681] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:443 dev_watchdog+0x3a8/0x3b0
> [ 89.078989] Modules linked in:
> [ 89.082053] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W 5.11.0-rc1-00020-gc16f033804b #289
> [ 89.091574] Hardware name: Kontron SMARC-sAL28 (Single PHY) on SMARC Eval 2.0 carrier (DT)
> [ 89.099870] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
> [ 89.105900] pc : dev_watchdog+0x3a8/0x3b0
> [ 89.109923] lr : dev_watchdog+0x3a8/0x3b0
> [ 89.113945] sp : ffff80001000bd50
> [ 89.117268] x29: ffff80001000bd50 x28: 0000000000000008
> [ 89.122602] x27: 0000000000000004 x26: 0000000000000140
> [ 89.127935] x25: ffff002001c6c000 x24: ffff002001c2b940
> [ 89.133267] x23: ffff8000118c7000 x22: ffff002001c6c39c
> [ 89.138600] x21: ffff002001c6bfb8 x20: ffff002001c6c3b8
> [ 89.143932] x19: 0000000000000000 x18: 0000000000000010
> [ 89.149264] x17: 0000000000000000 x16: 0000000000000000
> [ 89.154596] x15: ffffffffffffffff x14: 0720072007200720
> [ 89.159928] x13: 0720072007740775 x12: ffff80001195b980
> [ 89.165260] x11: 0000000000000003 x10: ffff800011943940
> [ 89.170592] x9 : ffff800010100d44 x8 : 0000000000017fe8
> [ 89.175924] x7 : c0000000ffffefff x6 : 0000000000000001
> [ 89.181255] x5 : 0000000000000000 x4 : 0000000000000000
> [ 89.186587] x3 : 00000000ffffffff x2 : ffff8000118eb908
> [ 89.191919] x1 : 84d8200845006900 x0 : 0000000000000000
> [ 89.197251] Call trace:
> [ 89.199701] dev_watchdog+0x3a8/0x3b0
> [ 89.203374] call_timer_fn+0x38/0x208
> [ 89.207049] run_timer_softirq+0x290/0x540
> [ 89.211158] __do_softirq+0x138/0x404
> [ 89.214831] irq_exit+0xe8/0xf8
> [ 89.217981] __handle_domain_irq+0x70/0xc8
> [ 89.222091] gic_handle_irq+0xc8/0x2b0
> [ 89.225850] el1_irq+0xb8/0x180
> [ 89.228999] arch_cpu_idle+0x18/0x40
> [ 89.232587] default_idle_call+0x70/0x214
> [ 89.236610] do_idle+0x21c/0x290
> [ 89.239848] cpu_startup_entry+0x2c/0x70
> [ 89.243783] secondary_start_kernel+0x1a0/0x1f0
> [ 89.248332] ---[ end trace 1687af62576397bc ]---
> [ 89.253350] igb 0002:01:00.0 enP2p1s0: Reset adapter
This entire splat is overkill. The useful part is what somebody who
trips over this might google for. Strip out the "cut here", the
timestamps, the register dump, and the last 6-8 lines of the call
trace.
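
The trimming described above could be scripted along these lines. This
is only a sketch: trim_splat() is a hypothetical helper, the regular
expressions assume arm64-style register dump lines like the ones in this
report, and keep_frames=5 is an arbitrary default that drops the last 8
of the 13 frames here.

```python
import re

# Leading dmesg timestamp such as "[ 89.070681] ".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s?")
# arm64 register dump lines: "pstate:", "pc :", "lr :", "sp :", "x29: ...".
REGISTER = re.compile(r"^(pstate|pc|lr|sp|x\d+)\s*:")
# The "cut here" and "end trace" delimiter lines.
MARKER = re.compile(r"^(-+\[ cut here \]-+|---\[ end trace .*\]---)$")
# A call trace frame such as "dev_watchdog+0x3a8/0x3b0", optionally "[module]".
FRAME = re.compile(r"^\S+\+0x[0-9a-f]+/0x[0-9a-f]+(\s+\[\w+\])?$")


def trim_splat(lines, keep_frames=5):
    """Strip timestamps, delimiters, registers and the tail of the call trace."""
    out = []
    frames = None  # None outside a call trace, otherwise frames seen so far
    for raw in lines:
        line = TIMESTAMP.sub("", raw.strip())
        if not line or MARKER.match(line) or REGISTER.match(line):
            continue
        if line == "Call trace:":
            frames = 0
        elif frames is not None and FRAME.match(line):
            frames += 1
            if frames > keep_frames:
                continue  # drop the deep, uninformative frames
        else:
            frames = None  # any other line ends the call trace
        out.append(line)
    return out
```

Something like print("\n".join(trim_splat(open("splat.txt")))) would
then give text suitable for a commit log, with "splat.txt" standing in
for wherever the dmesg excerpt was saved.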