Re: [PATCH v2 2/2] mwifiex: pcie: add reset_d3cold quirk for Surface gen4+ devices

From: Maximilian Luz
Date: Fri Jul 09 2021 - 15:28:00 EST


On 7/9/21 8:44 PM, Pali Rohár wrote:

[...]

My (very) quick attempt ('echo 1 > /sys/bus/pci/.../reset') at
reproducing this didn't work, so I think at the very least a network
connection needs to be active.

That performs a PCIe function level reset. You may have more luck
with a PCIe Hot Reset. See the following link for how to trigger a PCIe
Hot Reset from userspace: https://alexforencich.com/wiki/en/pcie/hot-reset-linux
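
For readers following along, the difference between the two reset types
can be sketched roughly as below. This is only an illustration and not the
linked script. The path in the quoted command is elided, so the sketch
assumes the standard sysfs layout (devices/<BDF>/reset). The Bridge Control
offset and Secondary Bus Reset bit (0x3e / 0x40) are the values from the
kernel's pci_regs.h. The helper names are hypothetical, and the actual
config-space write on the upstream port (e.g. via setpci) is left out.

```python
from pathlib import Path

# Values as defined in the kernel's include/uapi/linux/pci_regs.h.
PCI_BRIDGE_CONTROL = 0x3E        # Bridge Control register offset (type 1 header)
PCI_BRIDGE_CTL_BUS_RESET = 0x40  # Secondary Bus Reset bit


def reset_attr(bdf, sysfs_root=Path("/sys")):
    """Path of the per-device sysfs 'reset' attribute (assumed standard layout)."""
    return sysfs_root / "bus" / "pci" / "devices" / bdf / "reset"


def function_reset(bdf, sysfs_root=Path("/sys")):
    """Equivalent of 'echo 1 > .../reset'; needs root on a real system."""
    reset_attr(bdf, sysfs_root).write_text("1")


def hot_reset_values(bridge_control):
    """Return (asserted, deasserted) Bridge Control values for a hot reset.

    A hot reset is requested on the upstream port (the bridge), not on the
    device itself: set the Secondary Bus Reset bit, wait, then clear it.
    """
    asserted = (bridge_control | PCI_BRIDGE_CTL_BUS_RESET) & 0xFFFF
    deasserted = bridge_control & ~PCI_BRIDGE_CTL_BUS_RESET & 0xFFFF
    return asserted, deasserted
```

With the addresses from the log further down, the function reset would
target 0000:01:00.0, while the hot reset would toggle the bit on the port
at 0000:00:1c.0.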

Thanks for that link! That does indeed do something which breaks the
adapter. Running the script produces

[ 178.388414] mwifiex_pcie 0000:01:00.0: PREP_CMD: card is removed
[ 178.389128] mwifiex_pcie 0000:01:00.0: PREP_CMD: card is removed
[ 178.461365] mwifiex_pcie 0000:01:00.0: performing cancel_work_sync()...
[ 178.461373] mwifiex_pcie 0000:01:00.0: cancel_work_sync() done
[ 178.984106] pci 0000:01:00.0: [11ab:2b38] type 00 class 0x020000
[ 178.984161] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x000fffff 64bit pref]
[ 178.984193] pci 0000:01:00.0: reg 0x18: [mem 0x00000000-0x000fffff 64bit pref]
[ 178.984430] pci 0000:01:00.0: supports D1 D2
[ 178.984434] pci 0000:01:00.0: PME# supported from D0 D1 D3hot D3cold
[ 178.984871] pcieport 0000:00:1c.0: ASPM: current common clock configuration is inconsistent, reconfiguring
[ 179.297919] pci 0000:01:00.0: BAR 0: assigned [mem 0xd4400000-0xd44fffff 64bit pref]
[ 179.297961] pci 0000:01:00.0: BAR 2: assigned [mem 0xd4500000-0xd45fffff 64bit pref]
[ 179.298316] mwifiex_pcie 0000:01:00.0: enabling device (0000 -> 0002)
[ 179.298752] mwifiex_pcie: PCI memory map Virt0: 00000000c4593df1 PCI memory map Virt2: 0000000039d67daf
[ 179.300522] mwifiex_pcie 0000:01:00.0: WLAN read winner status failed!
[ 179.300552] mwifiex_pcie 0000:01:00.0: info: _mwifiex_fw_dpc: unregister device
[ 179.300622] mwifiex_pcie 0000:01:00.0: Read register failed
[ 179.300912] mwifiex_pcie 0000:01:00.0: performing cancel_work_sync()...
[ 179.300928] mwifiex_pcie 0000:01:00.0: cancel_work_sync() done

after which the card is unusable (there is no WiFi interface available
any more). Reloading the driver module doesn't help and produces

[ 376.906833] mwifiex_pcie: PCI memory map Virt0: 0000000025149d28 PCI memory map Virt2: 00000000c4593df1
[ 376.907278] mwifiex_pcie 0000:01:00.0: WLAN read winner status failed!
[ 376.907281] mwifiex_pcie 0000:01:00.0: info: _mwifiex_fw_dpc: unregister device
[ 376.907293] mwifiex_pcie 0000:01:00.0: Read register failed
[ 376.907404] mwifiex_pcie 0000:01:00.0: performing cancel_work_sync()...
[ 376.907406] mwifiex_pcie 0000:01:00.0: cancel_work_sync() done

again. Performing a function level reset produces

[ 402.489572] mwifiex_pcie 0000:01:00.0: mwifiex_pcie_reset_prepare: adapter structure is not valid
[ 403.514219] mwifiex_pcie 0000:01:00.0: mwifiex_pcie_reset_done: adapter structure is not valid

and doesn't help either.
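
In case someone wants to check their own dmesg output for the same
failure pattern, here is a minimal sketch. The message strings are copied
from the logs above. The labels, the `classify` helper and the timestamp
regex are just illustrative assumptions, not anything the driver provides.

```python
import re

# Leading "[ 179.300522] " style timestamp, followed by the rest of the line.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")

# Message fragments taken from the logs in this mail; labels are made up.
SIGNATURES = {
    "card_removed": "PREP_CMD: card is removed",
    "probe_failed": "WLAN read winner status failed!",
    "no_adapter": "adapter structure is not valid",
}


def classify(lines):
    """Return (timestamp, label) for every line matching a known signature."""
    hits = []
    for line in lines:
        match = TIMESTAMP.match(line.strip())
        if not match:
            continue  # skip lines without a kernel timestamp
        stamp, rest = float(match.group(1)), match.group(2)
        for label, needle in SIGNATURES.items():
            if needle in rest:
                hits.append((stamp, label))
    return hits
```

A "probe_failed" hit followed later by "no_adapter" hits would match the
sequence seen here: the probe after the hot reset fails, and the later
function level reset then finds no valid adapter structure.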

Running the same command on a kernel with (among others) this patch
unfortunately also breaks the adapter in the same way. As far as I can
tell, though, it doesn't run through the reset code added by this patch
(as indicated by the log message when performing FLR). I assume that
in a non-forced scenario, e.g. firmware issues (which IIRC is why this
patch exists), it would?

Unfortunately I can't test that with a network connection (and without
compiling a custom kernel, for which I don't have the time right now)
because there's currently another bug deadlocking on device removal if
there's an active connection during removal (which also seems to
trigger on reset). That one will be fixed by

https://lore.kernel.org/linux-wireless/20210515024227.2159311-1-briannorris@xxxxxxxxxxxx/

Jonas might know more.

[...]