Re: [5.12 - 5.15] xHCI controller dead - not renesas but intel

From: Norbert Preining
Date: Wed Nov 17 2021 - 20:16:23 EST


Hi Mathias, hi all,

(please cc, thanks)

> Patch in link below resolved another case with similar log.
> Does it help in your case?

Unfortunately not. It happened again after redirecting a webcam to a
virtual machine, then removing the redirection and shutting down the VM.
After that, boom:
Nov 18 09:49:37 bulldog systemd-machined[1806]: Machine qemu-3-FujitsuWin10 terminated.
Nov 18 09:49:43 bulldog kernel: xhci_hcd 0000:00:14.0: Abort failed to stop command ring: -110
Nov 18 09:49:43 bulldog kernel: xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead
Nov 18 09:49:43 bulldog kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up
Nov 18 09:49:43 bulldog kernel: xhci_hcd 0000:00:14.0: Timeout while waiting for setup device command
Nov 18 09:49:43 bulldog kernel: xhci_hcd 0000:00:14.0: xHCI host not responding to stop endpoint command.
Nov 18 09:49:43 bulldog kernel: xhci_hcd 0000:00:14.0: USBSTS: HCHalted EINT

After that, I logged into the machine via ssh and sent unbind / bind
requests:
echo -n "0000:00:14.0" > /sys/bus/pci/drivers/xhci_hcd/unbind
gave
Nov 18 09:54:48 bulldog kernel: xhci_hcd 0000:00:14.0: remove, state 4
Nov 18 09:54:48 bulldog kernel: usb usb2: USB disconnect, device number 1
Nov 18 09:54:48 bulldog kernel: xhci_hcd 0000:00:14.0: USB bus 2 deregistered
Nov 18 09:54:48 bulldog kernel: xhci_hcd 0000:00:14.0: remove, state 1
Nov 18 09:54:48 bulldog kernel: usb usb1: USB disconnect, device number 1
Nov 18 09:54:48 bulldog kernel: xhci_hcd 0000:00:14.0: USB bus 1 deregistered

and then
echo -n "0000:00:14.0" > /sys/bus/pci/drivers/xhci_hcd/bind
gave
Nov 18 09:55:00 bulldog kernel: xhci_hcd 0000:00:14.0: xHCI Host Controller
Nov 18 09:55:00 bulldog kernel: xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 1
Nov 18 09:55:00 bulldog kernel: xhci_hcd 0000:00:14.0: hcc params 0x200077c1 hci version 0x100 quirks 0x0000000001109810
...
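
For anyone who wants to repeat the unbind / bind cycle, here is a minimal
sketch of it as a shell function. The PCI address is the one the kernel
log reports for the controller (0000:00:14.0). The SYSFS_ROOT and
REBIND_DELAY variables are hypothetical additions of this sketch (they
are not kernel or sysfs features); they only exist so the function can be
tried against a scratch directory. On a real system it needs root.

```shell
#!/bin/sh
# Sketch: unbind a PCI device from its driver and bind it again via sysfs.
# SYSFS_ROOT and REBIND_DELAY are hypothetical knobs for dry runs only.
rebind_pci_driver() {
    dev="$1"                        # PCI address, e.g. 0000:00:14.0
    drv="${2:-xhci_hcd}"            # driver name, defaults to xhci_hcd
    root="${SYSFS_ROOT:-/sys}"      # assumption: overridable for testing
    drvdir="$root/bus/pci/drivers/$drv"

    # Refuse to continue if the driver directory does not exist.
    if [ ! -d "$drvdir" ]; then
        echo "no such driver directory: $drvdir" >&2
        return 1
    fi

    # Detach the device from the driver ...
    printf '%s' "$dev" > "$drvdir/unbind" || return 1
    # ... give it a moment to settle ...
    sleep "${REBIND_DELAY:-1}"
    # ... and attach it again.
    printf '%s' "$dev" > "$drvdir/bind" || return 1
}
```

Usage as root would be: rebind_pci_driver 0000:00:14.0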

then it adds a few devices without any problem, until it comes to here:
Nov 18 09:55:01 bulldog kernel: usb 1-11: new high-speed USB device number 4 using xhci_hcd
Nov 18 09:55:01 bulldog kernel: usb 1-11: New USB device found, idVendor=2109, idProduct=2817, bcdDevice= 6.33
Nov 18 09:55:01 bulldog kernel: usb 1-11: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Nov 18 09:55:01 bulldog kernel: usb 1-11: Product: USB2.0 Hub
Nov 18 09:55:01 bulldog kernel: usb 1-11: Manufacturer: VIA Labs, Inc.
Nov 18 09:55:01 bulldog kernel: hub 1-11:1.0: USB hub found
Nov 18 09:55:01 bulldog kernel: hub 1-11:1.0: 4 ports detected

I think this is my monitor (Fujitsu) with its integrated USB hub.
Mouse and keyboard are connected there, but while it is still
enumerating devices the controller departs again into dead land:
Nov 18 09:55:01 bulldog kernel: usb 1-12: new high-speed USB device number 5 using xhci_hcd
Nov 18 09:55:07 bulldog kernel: usb 1-12: device descriptor read/64, error -110
Nov 18 09:55:22 bulldog kernel: usb 1-12: device descriptor read/64, error -110
Nov 18 09:55:22 bulldog kernel: usb 1-11.1: new high-speed USB device number 6 using xhci_hcd
Nov 18 09:55:23 bulldog kernel: usb 1-11.1: New USB device found, idVendor=2109, idProduct=2817, bcdDevice= 6.23
Nov 18 09:55:23 bulldog kernel: usb 1-11.1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Nov 18 09:55:23 bulldog kernel: usb 1-11.1: Product: USB2.0 Hub
Nov 18 09:55:23 bulldog kernel: usb 1-11.1: Manufacturer: VIA Labs, Inc.
Nov 18 09:55:23 bulldog kernel: hub 1-11.1:1.0: USB hub found
Nov 18 09:55:23 bulldog kernel: hub 1-11.1:1.0: 4 ports detected
Nov 18 09:55:23 bulldog kernel: usb 1-12: new high-speed USB device number 7 using xhci_hcd
Nov 18 09:55:28 bulldog kernel: usb 1-12: device descriptor read/64, error -110
Nov 18 09:55:44 bulldog kernel: usb 1-12: device descriptor read/64, error -110
Nov 18 09:55:44 bulldog kernel: usb usb1-port12: attempt power cycle
Nov 18 09:55:44 bulldog kernel: usb 1-11.2: new full-speed USB device number 8 using xhci_hcd
Nov 18 09:55:44 bulldog kernel: usb 1-11.2: device descriptor read/64, error -32
Nov 18 09:55:44 bulldog kernel: usb 1-11.2: New USB device found, idVendor=062a, idProduct=4102, bcdDevice= 1.03
Nov 18 09:55:44 bulldog kernel: usb 1-11.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Nov 18 09:55:44 bulldog kernel: usb 1-11.2: Product: 2.4G Wireless Mouse
Nov 18 09:55:44 bulldog kernel: usb 1-11.2: Manufacturer: MOSART Semi.
Nov 18 09:55:44 bulldog kernel: input: MOSART Semi. 2.4G Wireless Mouse as /devices/pci0000:00/0000:00:14.0/usb1/1-11/1-11.2/1-11.2:1.0/>
Nov 18 09:55:44 bulldog kernel: input: MOSART Semi. 2.4G Wireless Mouse Consumer Control as /devices/pci0000:00/0000:00:14.0/usb1/1-11/1>
Nov 18 09:55:44 bulldog kernel: input: MOSART Semi. 2.4G Wireless Mouse as /devices/pci0000:00/0000:00:14.0/usb1/1-11/1-11.2/1-11.2:1.0/>
Nov 18 09:55:44 bulldog kernel: hid-generic 0003:062A:4102.000A: input,hiddev1,hidraw2: USB HID v1.10 Mouse [MOSART Semi. 2.4G Wireless >
Nov 18 09:55:44 bulldog kernel: usb 1-11.1.2: new high-speed USB device number 9 using xhci_hcd
Nov 18 09:55:45 bulldog kernel: usb 1-11.1.2: New USB device found, idVendor=0bda, idProduct=8153, bcdDevice=31.01
Nov 18 09:55:45 bulldog kernel: usb 1-11.1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=6
Nov 18 09:55:45 bulldog kernel: usb 1-11.1.2: Product: USB 10/100/1000 LAN
Nov 18 09:55:45 bulldog kernel: usb 1-11.1.2: Manufacturer: Realtek
Nov 18 09:55:45 bulldog kernel: usb 1-11.1.2: SerialNumber: 101000001
Nov 18 09:55:45 bulldog kernel: usb 1-12: new high-speed USB device number 10 using xhci_hcd
Nov 18 09:55:55 bulldog kernel: xhci_hcd 0000:00:14.0: Abort failed to stop command ring: -110
Nov 18 09:55:55 bulldog kernel: xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead
Nov 18 09:55:55 bulldog kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up
Nov 18 09:55:55 bulldog kernel: xhci_hcd 0000:00:14.0: Timeout while waiting for setup device command
Nov 18 09:55:56 bulldog kernel: usb 1-12: device not accepting address 10, error -108
Nov 18 09:55:56 bulldog kernel: usb usb1-port12: couldn't allocate usb_device
Nov 18 09:55:56 bulldog kernel: usb 1-3: USB disconnect, device number 2
Nov 18 09:55:56 bulldog kernel: usb 1-11-port4: couldn't allocate usb_device
Nov 18 09:55:56 bulldog kernel: usb 1-11.1-port3: couldn't allocate usb_device
Nov 18 09:55:56 bulldog kernel: usb 1-7: USB disconnect, device number 3
Nov 18 09:55:56 bulldog kernel: usb 1-11: USB disconnect, device number 4
Nov 18 09:55:56 bulldog kernel: usb 1-11.1: USB disconnect, device number 6
Nov 18 09:55:56 bulldog kernel: usb 1-11.1.2: USB disconnect, device number 9
Nov 18 09:55:56 bulldog kernel: usb 1-11.2: USB disconnect, device number 8


More information: when shutting down the system (systemctl halt), it
failed to turn off the computer. The screen went off and all services
stopped, but at the end the machine was just brrmmming away and I had
no idea what it was doing.

Hard reset.

The first reboot gave no usable mouse/keyboard.

Shutdown/wait/boot - all back to normal.

Any suggestions? I compile all my kernels myself (currently 5.15.2), so
any patch can be tested.

(Please cc, thanks)

Thanks a lot

Norbert

--
PREINING Norbert https://www.preining.info
Fujitsu Research + IFMGA Guide + TU Wien + TeX Live + Debian Dev
GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13