Re: [PATCH] USB: Fix race condition during UVC webcam disconnect

From: Alan Stern
Date: Thu Jul 20 2023 - 10:56:04 EST


On Thu, Jul 20, 2023 at 05:01:42PM +0530, Aman Deep wrote:
> In the bug happened during uvc webcam disconect,there is race
> between stopping a video transfer and usb disconnect.This issue is
> reproduced in my system running Linux kernel when UVC webcam play is
> stopped and UVC webcam is disconnected at the same time. This causes the
> below backtrace:
>
> [2-3496.7275] PC is at 0xbf418000+0x2d8 [usbcore]
> [2-3496.7275] LR is at 0x00000005
> [2-3496.7275] pc : [<bf4182d8>]((usb_ifnum_to_if
> </drivers/usb/core/usb.c:283
> [usbcore.ko]>)) lr : [<00000005>]() psr: 20000013
> [2-3496.7275] Function entered at [<bf4182a4>]((usb_ifnum_to_if
> </drivers/usb/core/usb.c:275
> [usbcore.ko]>)) (0xbf418000+0x2a4 [usbcore]) from
> [<bf423974>]((usb_hcd_alloc_bandwidth
> </drivers/usb/core/hcd.c:1947
> [usbcore.ko]>)) (0xbf418000+0xb974 [usbcore])
> [2-3496.7275] Function entered at [<bf423738>]((usb_hcd_alloc_bandwidth
> </drivers/usb/core/hcd.c:1876
> [usbcore.ko]>)) (0xbf418000+0xb738 [usbcore]) from
> [<bf426ca0>]((usb_set_interface
> </drivers/usb/core/message.c:1461
> [usbcore.ko]>)) (0xbf418000+0xeca0 [usbcore])
> [2-3496.7275] Function entered at [<bf426b9c>]((usb_set_interface
> </drivers/usb/core/message.c:1385
> [usbcore.ko]>)) (0xbf418000+0xeb9c [usbcore]) from
> [<bf9c4dd4>]((uvc_video_clock_cleanup
> </drivers/media/usb/uvc/uvc_video.c:598
> uvc_video_stop_streaming
> </drivers/media/usb/uvc/uvc_video.c:2221
> [uvcvideo.ko]>)) (0xbf9bd000+0x7dd4 [uvcvideo])
> [2-3496.7275] Function entered at [<bf9c4d98>]((uvc_video_stop_streaming
> </drivers/media/usb/uvc/uvc_video.c:2200
> [uvcvideo.ko]>)) (0xbf9bd000+0x7d98 [uvcvideo]) from
> [<bf9bfab8>]((spin_lock_irq
> </include/linux/spinlock.h:363
> uvc_stop_streaming
> </drivers/media/usb/uvc/uvc_queue.c:194
> [uvcvideo.ko]>)) (0xbf9bd000+0x2ab8 [uvcvideo])
> [2-3496.7276] Function entered at [<bf9bfa94>]((uvc_stop_streaming
> </drivers/media/usb/uvc/uvc_queue.c:186
> [uvcvideo.ko]>)) (0xbf9bd000+0x2a94 [uvcvideo]) from
> [<be307150>]((__read_once_size
> </include/linux/compiler.h:290
> (discriminator 1) __vb2_queue_cancel
> </drivers/media/common/videobuf2/videobuf2-core.c:1893
> (discriminator 1) [videobuf2_common.ko]>)) (0xbe306000+0x1150
> [videobuf2_common])
> [2-3496.7276] Function entered at [<be307120>]((__vb2_queue_cancel
> </drivers/media/common/videobuf2/videobuf2-core.c:1877
> [videobuf2_common.ko]>)) (0xbe306000+0x1120 [videobuf2_common]) from
> [<be308894>]((vb2_core_streamoff
> </drivers/media/common/videobuf2/videobuf2-core.c:2053

This backtrace doesn't show what actually caused the bug. You should
have included several lines from the system log _preceding_ the
backtrace. Without those lines, we can only guess at what the problem
was.

> This below solution patch fixes this race condition at USB core level
> occurring during UVC webcam device disconnect.

How can a race in the UVC driver be fixed by changing the USB core? It
seems obvious that the only way to fix such a race is by changing the
UVC driver.

> Signed-off-by: Anuj Gupta <anuj01.gupta@xxxxxxxxxxx>
> Signed-off-by: Aman Deep <aman.deep@xxxxxxxxxxx>
> ---
> drivers/usb/core/hcd.c | 7 ++++++-
> drivers/usb/core/message.c | 4 ++++
> drivers/usb/core/usb.c | 9 ++++++---
> 3 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
> index 8300baedafd2..a06452cbbaa4 100644
> --- a/drivers/usb/core/hcd.c
> +++ b/drivers/usb/core/hcd.c
> @@ -1931,7 +1931,12 @@ int usb_hcd_alloc_bandwidth(struct usb_device *udev,
> }
> }
> if (cur_alt && new_alt) {
> - struct usb_interface *iface = usb_ifnum_to_if(udev,
> + struct usb_interface *iface;
> +
> + if (udev->state == USB_STATE_NOTATTACHED)
> + return -ENODEV;

What will happen if the state changes to USB_STATE_NOTATTACHED at this
point, after that test was made?

> +
> + iface = usb_ifnum_to_if(udev,
> cur_alt->desc.bInterfaceNumber);
>
> if (!iface)
> diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
> index b5811620f1de..f31c7287dc01 100644
> --- a/drivers/usb/core/message.c
> +++ b/drivers/usb/core/message.c
> @@ -1575,7 +1575,11 @@ int usb_set_interface(struct usb_device *dev, int interface, int alternate)
> for (i = 0; i < iface->cur_altsetting->desc.bNumEndpoints; i++)
> iface->cur_altsetting->endpoint[i].streams = 0;
>
> + if (dev->state == USB_STATE_NOTATTACHED)
> + return -ENODEV;

Same question: What happens if the state changes right here?

> +
> ret = usb_hcd_alloc_bandwidth(dev, NULL, iface->cur_altsetting, alt);
> +
> if (ret < 0) {
> dev_info(&dev->dev, "Not enough bandwidth for altsetting %d\n",
> alternate);
> diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
> index 901ec732321c..6fb8b14469ae 100644
> --- a/drivers/usb/core/usb.c
> +++ b/drivers/usb/core/usb.c
> @@ -352,10 +352,13 @@ struct usb_interface *usb_ifnum_to_if(const struct usb_device *dev,
>
> if (!config)
> return NULL;
> - for (i = 0; i < config->desc.bNumInterfaces; i++)
> - if (config->interface[i]->altsetting[0]
> - .desc.bInterfaceNumber == ifnum)
> + for (i = 0; i < config->desc.bNumInterfaces; i++) {
> + if (config->interface[i] &&

This new test can fail only if the routine is called after (or while)
the device is unconfigured or removed. But if a driver makes such a
call then the driver is buggy. The proper solution is to fix the
driver, not add this test here.

Besides, this test has the same problem as the others you added above.
What happens if config->interface[i] gets set to NULL right here?

Alan Stern

> + config->interface[i]->altsetting[0]
> + .desc.bInterfaceNumber == ifnum) {
> return config->interface[i];
> + }
> + }
>
> return NULL;
> }
> --
> 2.34.1
>