Re: [PATCH] usb: core: hub: hub_port_init lock controller instead of bus

From: Mathias Nyman
Date: Wed Feb 10 2016 - 12:08:30 EST


On 08.02.2016 15:49, Chris Bainbridge wrote:
The XHCI controller presents two USB buses to the system - one for USB 2
and one for USB 3. When only one bus is locked there is a race condition
with two threads in hub_port_init:

[ 8.984500] Call Trace:
[ 8.985698] [<ffffffff81b9bab7>] schedule+0x37/0x90
[ 8.986918] [<ffffffff817da7cd>] usb_kill_urb+0x8d/0xd0
[ 8.988130] [<ffffffff8111e5e0>] ? wake_up_atomic_t+0x30/0x30
[ 8.989343] [<ffffffff817dafbe>] usb_start_wait_urb+0xbe/0x150
[ 8.990561] [<ffffffff817db10c>] usb_control_msg+0xbc/0xf0
[ 8.991766] [<ffffffff817d07de>] hub_port_init+0x51e/0xb70
[ 8.992964] [<ffffffff817d4697>] hub_event+0x817/0x1570
[ 8.994156] [<ffffffff810f3e6f>] process_one_work+0x1ff/0x620
[ 8.995342] [<ffffffff810f3dcf>] ? process_one_work+0x15f/0x620
[ 8.996528] [<ffffffff810f4684>] worker_thread+0x64/0x4b0
[ 8.997707] [<ffffffff810f4620>] ? rescuer_thread+0x390/0x390
[ 8.998883] [<ffffffff810fa7f5>] kthread+0x105/0x120
[ 9.000056] [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200
[ 9.001241] [<ffffffff81ba183f>] ret_from_fork+0x3f/0x70
[ 9.002420] [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200

[ 9.870837] Call Trace:
[ 9.875664] [<ffffffff817fd36d>] xhci_setup_device+0x53d/0xa40
[ 9.876871] [<ffffffff817fd87e>] xhci_address_device+0xe/0x10
[ 9.878068] [<ffffffff817d047f>] hub_port_init+0x1bf/0xb70
[ 9.879262] [<ffffffff811247ed>] ? trace_hardirqs_on+0xd/0x10
[ 9.880465] [<ffffffff817d4697>] hub_event+0x817/0x1570
[ 9.881668] [<ffffffff810f3e6f>] process_one_work+0x1ff/0x620
[ 9.882869] [<ffffffff810f3dcf>] ? process_one_work+0x15f/0x620
[ 9.884074] [<ffffffff810f4684>] worker_thread+0x64/0x4b0
[ 9.885268] [<ffffffff810f4620>] ? rescuer_thread+0x390/0x390
[ 9.886457] [<ffffffff810fa7f5>] kthread+0x105/0x120
[ 9.887634] [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200
[ 9.888817] [<ffffffff81ba183f>] ret_from_fork+0x3f/0x70
[ 9.889995] [<ffffffff810fa6f0>] ? kthread_create_on_node+0x200/0x200

Which results from the two call chains:

hub_port_init
usb_get_device_descriptor
usb_get_descriptor
usb_control_msg
usb_internal_control_msg
usb_start_wait_urb
usb_submit_urb / wait_for_completion_timeout / usb_kill_urb

hub_port_init
hub_set_address
xhci_address_device
xhci_setup_device

The logged kernel errors are:

[ 8.034843] xhci_hcd 0000:00:14.0: Timeout while waiting for setup device command
[ 13.183701] usb 3-3: device descriptor read/all, error -110

On a test system this failure occurred on 6% of all boots.

Hypothetically, it should perhaps be possible to set the address of the
hub on one bus while the hub on the other bus already has an outstanding
descriptor read request. But as this is not working reliably, fix it by
locking the controller in hub_port_init to prevent parallel
initialisation of both buses.


Most likely xhci is messed up after two device slots are in default state at the same time.
This happens when both threads are in hub_port_init() have called hub_port_reset()

The issue becomes visible when the the descriptor read and set address both fail after
the port resets.

xhci specs 4.5.3 has one tiny note about this:
"Note: Software shall not transition more than one Device Slot to the Default State at a time"

So to me, and from xhci pov this patch looks like the correct solution,
but I might be missing some usb core side details.

-Mathias