Re: [PATCH v2] usb:xhci fix panic in xhci_free_virt_devices_depth_first

From: Greg KH
Date: Mon Nov 06 2017 - 06:32:38 EST


On Mon, Nov 06, 2017 at 06:03:08PM +0800, Chen Yu wrote:
> Hi,
>
> On 2017/11/6 16:31, Greg KH wrote:
> > On Mon, Nov 06, 2017 at 04:20:23PM +0800, Yu Chen wrote:
> >> From: Yu Chen <chenyu56@xxxxxxxxxx>
> >>
> >> Check for vdev->real_port == 0 to avoid a panic:
> >> [ 9.261347] [<ffffff800884a390>] xhci_free_virt_devices_depth_first+0x58/0x108
> >> [ 9.261352] [<ffffff800884a814>] xhci_mem_cleanup+0x1bc/0x570
> >> [ 9.261355] [<ffffff8008842de8>] xhci_stop+0x140/0x1c8
> >> [ 9.261365] [<ffffff80087ed304>] usb_remove_hcd+0xfc/0x1d0
> >> [ 9.261369] [<ffffff80088551c4>] xhci_plat_remove+0x6c/0xa8
> >> [ 9.261377] [<ffffff80086e928c>] platform_drv_remove+0x2c/0x70
> >> [ 9.261384] [<ffffff80086e6ea0>] __device_release_driver+0x80/0x108
> >> [ 9.261387] [<ffffff80086e7a1c>] device_release_driver+0x2c/0x40
> >> [ 9.261392] [<ffffff80086e5f28>] bus_remove_device+0xe0/0x120
> >> [ 9.261396] [<ffffff80086e2e34>] device_del+0x114/0x210
> >> [ 9.261399] [<ffffff80086e9e00>] platform_device_del+0x30/0xa0
> >> [ 9.261403] [<ffffff8008810bdc>] dwc3_otg_work+0x204/0x488
> >> [ 9.261407] [<ffffff80088133fc>] event_work+0x304/0x5b8
> >> [ 9.261414] [<ffffff80080e31b0>] process_one_work+0x148/0x490
> >> [ 9.261417] [<ffffff80080e3548>] worker_thread+0x50/0x4a0
> >> [ 9.261421] [<ffffff80080e9ea0>] kthread+0xe8/0x100
> >> [ 9.261427] [<ffffff8008083680>] ret_from_fork+0x10/0x50
> >>
> >> The problem can occur if xhci_plat_remove() is called shortly after
> >> xhci_plat_probe(). In that case xhci_free_virt_devices_depth_first()
> >> is called before the device has been set up and real_port initialized.
> >> The problem occurred on Hikey960 and was reproduced by Guenter Roeck
> >> on Kevin with chromeos-4.4.
> >>
> >> Cc: Guenter Roeck <groeck@xxxxxxxxxx>
> >> Signed-off-by: Fan Ning <fanning4@xxxxxxxxxxxxx>
> >> Signed-off-by: Li Rui <lirui39@xxxxxxxxxxxxx>
> >> Signed-off-by: yangdi <yangdi10@xxxxxxxxxxxxx>
> >> Signed-off-by: Yu Chen <chenyu56@xxxxxxxxxx>
> >>
> >> ---
> >> drivers/usb/host/xhci-mem.c | 4 ++++
> >> 1 file changed, 4 insertions(+)
> >>
> >> diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
> >> index 2a82c927ded2..0361b4a58f59 100644
> >> --- a/drivers/usb/host/xhci-mem.c
> >> +++ b/drivers/usb/host/xhci-mem.c
> >> @@ -947,6 +947,9 @@ void xhci_free_virt_devices_depth_first(struct xhci_hcd *xhci, int slot_id)
> >>  	if (!vdev)
> >>  		return;
> >>
> >> +	if (WARN_ON(!vdev->real_port))
> >
> > Ok, now you are sending a lot of mess to the kernel log, so what can a
> > user do about it?
> >
> > How can this ever happen? Is it a hardware error, or a kernel driver
> > logic error?
> >
> > thanks,
> >
> > greg k-h
> >
> > .
> >
>
> The problem is a driver logic error. It can be reproduced if xhci_plat_remove()
> is called shortly after xhci_plat_probe(), while xhci_alloc_virt_device() has
> been called but real_port has not yet been initialized in
> xhci_setup_addressable_virt_dev().

Who is calling xhci_plat_remove() like this?

> A simple process is as below:
>
>   xhci_plat_probe()
>        |
>   usb_add_hcd()                             xhci_plat_remove()
>        |                                         |
>   find some device                          usb_remove_hcd()
>        |                                         |
>   hub_port_connect() -> usb_alloc_dev()     usb_disconnect()
>        |                                         |
>   before hub_enable_device()                xhci_stop()
>                                                  |
>                                             xhci_mem_cleanup()
>                                                  |
>                                             xhci_free_virt_devices_depth_first()
>                                                  |
>                                             real_port is 0 access xhci->rh_bw[vdev->real_port-1]
>
> The problem came from https://bugs.96boards.org/show_bug.cgi?id=535
> Also look at crbug.com/700041

Then the bug needs to be fixed. Throwing a huge kernel trace message
into the kernel log is not "fixing" the problem at all, right?
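
To make the failure mode concrete, here is a minimal user-space sketch of
the indexing problem described above. It is not the xhci code and not a
proposed patch. The structure names, MAX_PORTS, and port_bw() are
hypothetical stand-ins, and it assumes real_port is a 1-based port number
where 0 means "not set up yet". It only shows that an index of
real_port - 1 falls outside the array when real_port is 0, and that the
case can be skipped quietly instead of warned about.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PORTS 4	/* hypothetical root-hub port count */

/* Hypothetical stand-in for the virtual device discussed in the thread. */
struct virt_dev {
	unsigned int real_port;	/* assumed 1-based; 0 means "not set up yet" */
};

/* Hypothetical stand-in for one per-port bandwidth entry. */
struct root_hub_bw {
	int placeholder;
};

/*
 * Return the per-port entry for vdev, or NULL when real_port is outside
 * 1..MAX_PORTS. Without this check, real_port == 0 would make
 * real_port - 1 index outside rh_bw[].
 */
static struct root_hub_bw *port_bw(struct root_hub_bw *rh_bw,
				   const struct virt_dev *vdev)
{
	assert(rh_bw != NULL && vdev != NULL);

	if (vdev->real_port == 0 || vdev->real_port > MAX_PORTS)
		return NULL;	/* device never finished setup: skip it */

	return &rh_bw[vdev->real_port - 1];
}
```

A caller in this sketch would treat a NULL return as "nothing to clean up
for this device" and carry on. Whether a quiet skip is the right fix, or
whether the ordering between allocation and teardown should change
instead, is exactly the open question here.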

thanks,

greg k-h