RE: Re: [PATCH] USB: Fix race condition during UVC webcam disconnect

From: AMAN DEEP
Date: Fri Jul 28 2023 - 01:01:55 EST




Thanks for your detailed analysis. 
Sorry for late reply and my answers are inline.
 
>On Thu, Jul 20, 2023 at 05:01:42PM +0530, Aman Deep wrote:
>> In the bug happened during uvc webcam disconect,there is race
>> between stopping a video transfer and usb disconnect. This issue is
>> reproduced in my system running Linux kernel when UVC webcam play is
>> stopped and UVC webcam is disconnected at the same time. This causes the
>> below backtrace:
>>
>> [2-3496.7275]  PC is at 0xbf418000+0x2d8 [usbcore]
>> [2-3496.7275]  LR is at 0x00000005
>> [2-3496.7275] pc : [<bf4182d8>]((usb_ifnum_to_if
>> </drivers/usb/core/usb.c:283
>> [usbcore.ko]>)) lr : [<00000005>]() psr: 20000013
>> [2-3496.7275] Function entered at [<bf4182a4>]((usb_ifnum_to_if
>> </drivers/usb/core/usb.c:275
>> [usbcore.ko]>)) (0xbf418000+0x2a4 [usbcore]) from
>> [<bf423974>]((usb_hcd_alloc_bandwidth
>> </drivers/usb/core/hcd.c:1947
>> [usbcore.ko]>)) (0xbf418000+0xb974 [usbcore])
>> [2-3496.7275] Function entered at [<bf423738>]((usb_hcd_alloc_bandwidth
>> </drivers/usb/core/hcd.c:1876
>> [usbcore.ko]>)) (0xbf418000+0xb738 [usbcore]) from
>> [<bf426ca0>]((usb_set_interface
>> </drivers/usb/core/message.c:1461
>> [usbcore.ko]>)) (0xbf418000+0xeca0 [usbcore])
>> [2-3496.7275] Function entered at [<bf426b9c>]((usb_set_interface
>> </drivers/usb/core/message.c:1385
>> [usbcore.ko]>)) (0xbf418000+0xeb9c [usbcore]) from
>> [<bf9c4dd4>]((uvc_video_clock_cleanup
>> </drivers/media/usb/uvc/uvc_video.c:598
>> uvc_video_stop_streaming
>> </drivers/media/usb/uvc/uvc_video.c:2221
>> [uvcvideo.ko]>)) (0xbf9bd000+0x7dd4 [uvcvideo])
>> [2-3496.7275] Function entered at [<bf9c4d98>]((uvc_video_stop_streaming
>> </drivers/media/usb/uvc/uvc_video.c:2200
>> [uvcvideo.ko]>)) (0xbf9bd000+0x7d98 [uvcvideo]) from
>> [<bf9bfab8>]((spin_lock_irq
>> </include/linux/spinlock.h:363
>> uvc_stop_streaming
>> </drivers/media/usb/uvc/uvc_queue.c:194
>> [uvcvideo.ko]>)) (0xbf9bd000+0x2ab8 [uvcvideo])
>> [2-3496.7276] Function entered at [<bf9bfa94>]((uvc_stop_streaming
>> </drivers/media/usb/uvc/uvc_queue.c:186
>> [uvcvideo.ko]>)) (0xbf9bd000+0x2a94 [uvcvideo]) from
>> [<be307150>]((__read_once_size
>> </include/linux/compiler.h:290
>> (discriminator 1) __vb2_queue_cancel
>> </drivers/media/common/videobuf2/videobuf2-core.c:1893
>> (discriminator 1) [videobuf2_common.ko]>)) (0xbe306000+0x1150
>> [videobuf2_common])
>> [2-3496.7276] Function entered at [<be307120>]((__vb2_queue_cancel
>> </drivers/media/common/videobuf2/videobuf2-core.c:1877
>> [videobuf2_common.ko]>)) (0xbe306000+0x1120 [videobuf2_common]) from
>> [<be308894>]((vb2_core_streamoff
>> </drivers/media/common/videobuf2/videobuf2-core.c:2053
>
>This backtrace doesn't show what actually caused the bug.  You should
>have included several lines from the system log _preceding_ the
>backtrace.  Without those lines, we can only guess at what the problem
>was.


This is below complete backtrace for crash problem:


[1-221.1821] [    msg: 4788] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[1-221.1821] [    msg: 4788] pgd = e136ded4
[1-221.1821] [    msg: 4788] [00000000] *pgd=26210003, *pmd=00000000
[1-221.1821] [    msg: 4788] 
[1-221.1821] [    msg: 4788] Die cpu info :
[1-221.1821] [    msg: 4788] Internal error: Oops: 206 [#1] PREEMPT SMP ARM
[1-221.1821] [    msg: 4788] CPU: 1 PID: 4788 Comm: msg(muse-server) Kdump: loaded Tainted: PO 5.4.77 #1 PPID: 1 PComm: systemd preempt_count: 0x0
[1-221.1821] [    msg: 4788] Hardware name: Novatek Cortex-A53
[1-221.1822] [    msg: 4788] PC is at usb_ifnum_to_if+0x30/0x74 [usbcore]
[1-221.1822] [    msg: 4788] LR is at 0x5
[1-221.1822] [    msg: 4788] pc : [<bede1300>]    lr : [<00000005>]    psr: 20000113
[1-221.1822] [    msg: 4788] sp : ca443c18  ip : ca443c28  fp : ca443c24
[1-221.1822] [    msg: 4788] r10: e668b6c8  r9 : 00000000  r8 : e668b7e0
[1-221.1822] [    msg: 4788] r7 : e7b78880  r6 : bf1d9db0  r5 : e668b6c8  r4 : e690c000
[1-221.1822] [    msg: 4788] r3 : 00002000  r2 : e696ac40  r1 : 00000001  r0 : 00000000
[1-221.1822] [    msg: 4788] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[1-221.1822] [    msg: 4788] Control: 30c5383d  Table: 261f8a80  DAC: e45d65d5
[1-221.1822] [    msg: 4788] Process msg (pid: 4788, stack limit = 0xa0153238)
[1-221.1822] [    msg: 4788] Stack: (0xca443c18 to 0xca444000)
[1-221.1822] [    msg: 4788] 3c00:                                                       ca443c64 ca443c28
[1-221.1822] [    msg: 4788] 3c20: bedee6e4 bede12dc 00000000 bee0ae78 ca443c54 ca443c40 c083c894 e7b78880
[1-221.1822] [    msg: 4788] 3c40: e6b88340 00000000 bee0ae78 00000001 e690c000 e668b6c8 ca443cb4 ca443c68
[1-221.1822] [    msg: 4788] 3c60: bedf22ac bedee64c e5cf1508 e5cf1508 e5cf0000 e5cf0330 00000001 e5cf0330
[1-221.1822] [    msg: 4788] 3c80: ca443ca4 ca443c90 c083c894 e5cf0000 e5cf0330 00000001 e5cf0330 00000000
[1-221.1822] [    msg: 4788] 3ca0: 00000001 c08d1b3c ca443ccc ca443cb8 bfb3f958 bedf1ff4 e5cf0330 e5cf0330
[1-221.1822] [    msg: 4788] 3cc0: ca443ce4 ca443cd0 bfb3a024 bfb3f8a8 e5cf0330 e5cf0330 ca443d14 ca443ce8
[1-221.1823] [    msg: 4788] 3ce0: be3661e0 bfb3a004 00000001 e5cf0330 e5cf0330 00000001 c05d6260 00000000
[1-221.1823] [    msg: 4788] 3d00: 00000001 c08d1b3c ca443d2c ca443d18 be367994 be3661b4 e5cf0484 e5cf0330
[1-221.1823] [    msg: 4788] 3d20: ca443d3c ca443d30 be37e3e4 be367978 ca443d5c ca443d40 bfb3a518 be37e3cc
[1-221.1823] [    msg: 4788] 3d40: e5cf030c e5cf0000 00000001 c05d6260 ca443d7c ca443d60 bfb3b628 bfb3a4f0
[1-221.1823] [    msg: 4788] 3d60: bfb3b5e8 40045613 00000000 c05d6260 ca443d94 ca443d80 c05d6288 bfb3b5f4
[1-221.1823] [    msg: 4788] 3d80: e5cf0010 40045613 ca443dfc ca443d98 c05d9b84 c05d626c 00000068 ca443deb
[1-221.1823] [    msg: 4788] 3da0: c08d1b3c 00000001 ca443e24 bfb44680 00000000 e2fa3780 c01a926c 031e1090
[1-221.1823] [    msg: 4788] 3dc0: ca443df4 ffffffff c01e0048 0000072c 000012b4 00000000 40045613 00000000
[1-221.1823] [    msg: 4788] 3de0: 00000000 00000001 00000004 ca443e24 ca443ed4 ca443e00 c05db320 c05d9a04
[1-221.1823] [    msg: 4788] 3e00: 00000000 00000000 c05d99f8 e77a6700 ab8fd26c 00000000 00000000 00000000
[1-221.1823] [    msg: 4788] 3e20: ca443f60 00000001 ca443ee0 00000000 ca443e9c ca443e40 c02390a8 be211e84
[1-221.1823] [    msg: 4788] 3e40: 00000000 00000001 e2861600 00000000 00000000 00000000 00000000 00000000
[1-221.1823] [    msg: 4788] 3e60: 00000000 00000000 00000000 c03681bc 00000008 00000000 ca443ee0 c0bbd748
[1-221.1823] [    msg: 4788] 3e80: 00000000 c0be9a14 ca443ef4 00000002 ca443ed4 ca443ea0 c03681bc c036790c
[1-221.1823] [    msg: 4788] 3ea0: ca443ef4 c0bbd748 e2861600 c05db7dc e6695448 40045613 ab8fd26c e77a6700
[1-221.1823] [    msg: 4788] 3ec0: 00000021 00000036 ca443ee4 ca443ed8 c05db7fc c05db0f8 ca443efc ca443ee8
[1-221.1823] [    msg: 4788] 3ee0: c05d4728 c05db7e8 ab8fd26c e6695448 ca443f6c ca443f00 c02506a0 c05d46e8
[1-221.1823] [    msg: 4788] 3f00: ca443f04 c08a7a00 00000000 00000000 00000000 00000000 00000000 00000000
[1-221.1823] [    msg: 4788] 3f20: 00000000 00000000 00000000 00000000 ab8fd26c c0abb6ec ab8fd26c e77a6700
[1-221.1823] [    msg: 4788] 3f40: ca443f6c e77a6701 00000000 40045613 ab8fd26c e77a6700 00000021 00000036
[1-221.1824] [    msg: 4788] 3f60: ca443f94 ca443f70 c0250b3c c02502fc 00000000 000006f7 00000000 00000036
[1-221.1824] [    msg: 4788] 3f80: c000924c ca442000 ca443fa4 ca443f98 c0250b78 c0250adc 00000000 ca443fa8
[1-221.1824] [    msg: 4788] 3fa0: c0009230 c0250b6c 00000000 000006f7 00000021 40045613 ab8fd26c 00000021
[1-221.1824] [    msg: 4788] 3fc0: 00000000 000006f7 00000000 00000036 abb79e30 00000000 00000001 abb79e28
[1-221.1824] [    msg: 4788] 3fe0: aeca607c ab8fd24c aec8e749 b5f1ed1c 20000010 00000021 00000000 00000000
[1-221.1824] [    msg: 4788] Backtrace: 
[1-221.1824] [    msg: 4788] [<bede12d0>] (usb_ifnum_to_if [usbcore]) from [<bedee6e4>] (usb_hcd_alloc_bandwidth+0xa4/0x564 [usbcore])
[1-221.1824] [    msg: 4788] [<bedee640>] (usb_hcd_alloc_bandwidth [usbcore]) from [<bedf22ac>] (usb_set_interface+0x2c4/0x61c [usbcore])
[1-221.1824] [    msg: 4788]  r10:e668b6c8 r9:e690c000 r8:00000001 r7:bee0ae78 r6:00000000 r5:e6b88340
[1-221.1824] [    msg: 4788]  r4:e7b78880
[1-221.1825] [    msg: 4788] [<bedf1fe8>] (usb_set_interface [usbcore]) from [<bfb3f958>] (uvc_video_stop_streaming+0xbc/0xc4 [uvcvideo])
[1-221.1825] [    msg: 4788]  r10:c08d1b3c r9:00000001 r8:00000000 r7:e5cf0330 r6:00000001 r5:e5cf0330
[1-221.1825] [    msg: 4788]  r4:e5cf0000
[1-221.1825] [    msg: 4788] [<bfb3f89c>] (uvc_video_stop_streaming [uvcvideo]) from [<bfb3a024>] (uvc_stop_streaming+0x2c/0x50 [uvcvideo])
[1-221.1825] [    msg: 4788]  r5:e5cf0330 r4:e5cf0330
[1-221.1825] [    msg: 4788] [<bfb39ff8>] (uvc_stop_streaming [uvcvideo]) from [<be3661e0>] (__vb2_queue_cancel+0x38/0x290 [videobuf2_common])
[1-221.1825] [    msg: 4788]  r5:e5cf0330 r4:e5cf0330
[1-221.1825] [    msg: 4788] [<be3661a8>] (__vb2_queue_cancel [videobuf2_common]) from [<be367994>] (vb2_core_streamoff+0x28/0xb4 [videobuf2_common])
[1-221.1825] [    msg: 4788]  r10:c08d1b3c r9:00000001 r8:00000000 r7:c05d6260 r6:00000001 r5:e5cf0330
[1-221.1825] [    msg: 4788]  r4:e5cf0330 r3:00000001
[1-221.1825] [    msg: 4788] [<be36796c>] (vb2_core_streamoff [videobuf2_common]) from [<be37e3e4>] (vb2_streamoff+0x24/0x60 [videobuf2_v4l2])
[1-221.1825] [    msg: 4788]  r5:e5cf0330 r4:e5cf0484
[1-221.1825] [    msg: 4788] [<be37e3c0>] (vb2_streamoff [videobuf2_v4l2]) from [<bfb3a518>] (uvc_queue_streamoff+0x34/0x48 [uvcvideo])
[1-221.1825] [    msg: 4788] [<bfb3a4e4>] (uvc_queue_streamoff [uvcvideo]) from [<bfb3b628>] (uvc_ioctl_streamoff+0x40/0x58 [uvcvideo])
[1-221.1826] [    msg: 4788]  r7:c05d6260 r6:00000001 r5:e5cf0000 r4:e5cf030c
[1-221.1826] [    msg: 4788] [<bfb3b5e8>] (uvc_ioctl_streamoff [uvcvideo]) from [<c05d6288>] (v4l_streamoff+0x28/0x2c)
[1-221.1826] [    msg: 4788]  r7:c05d6260 r6:00000000 r5:40045613 r4:bfb3b5e8
[1-221.1826] [    msg: 4788] [<c05d6260>] (v4l_streamoff) from [<c05d9b84>] (__video_do_ioctl+0x18c/0x400)
[1-221.1826] [    msg: 4788]  r5:40045613 r4:e5cf0010
[1-221.1826] [    msg: 4788] [<c05d99f8>] (__video_do_ioctl) from [<c05db320>] (video_usercopy+0x234/0x6f0)
[1-221.1826] [    msg: 4788]  r10:ca443e24 r9:00000004 r8:00000001 r7:00000000 r6:00000000 r5:40045613
[1-221.1826] [    msg: 4788]  r4:00000000
[1-221.1826] [    msg: 4788] [<c05db0ec>] (video_usercopy) from [<c05db7fc>] (video_ioctl2+0x20/0x24)
[1-221.1826] [    msg: 4788]  r10:00000036 r9:00000021 r8:e77a6700 r7:ab8fd26c r6:40045613 r5:e6695448
[1-221.1826] [    msg: 4788]  r4:c05db7dc
[1-221.1826] [    msg: 4788] [<c05db7dc>] (video_ioctl2) from [<c05d4728>] (v4l2_ioctl+0x4c/0x60)
[1-221.1826] [    msg: 4788] [<c05d46dc>] (v4l2_ioctl) from [<c02506a0>] (do_vfs_ioctl+0x3b0/0x7e0)
[1-221.1826] [    msg: 4788]  r5:e6695448 r4:ab8fd26c
[1-221.1826] [    msg: 4788] [<c02502f0>] (do_vfs_ioctl) from [<c0250b3c>] (ksys_ioctl+0x6c/0x90)
[1-221.1826] [    msg: 4788]  r10:00000036 r9:00000021 r8:e77a6700 r7:ab8fd26c r6:40045613 r5:00000000
[1-221.1826] [    msg: 4788]  r4:e77a6701
[1-221.1826] [    msg: 4788] [<c0250ad0>] (ksys_ioctl) from [<c0250b78>] (sys_ioctl+0x18/0x1c)
[1-221.1826] [    msg: 4788]  r9:ca442000 r8:c000924c r7:00000036 r6:00000000 r5:000006f7 r4:00000000
[1-221.1827] [    msg: 4788] [<c0250b60>] (sys_ioctl) from [<c0009230>] (__sys_trace_return+0x0/0x10)


>> This below solution patch fixes this race condition at USB core level
>> occurring during UVC webcam device disconnect.
>
>How can a race in the UVC driver be fixed by changing the USB core?  It
>seems obvious that the only way to fix such a race is by changing the
>UVC driver.


I think solution fixed at USB core level avoids race condition for all kind of devices and drivers.


>> Signed-off-by: Anuj Gupta <anuj01.gupta@xxxxxxxxxxx>
>> Signed-off-by: Aman Deep <aman.deep@xxxxxxxxxxx>
>> ---
>>  drivers/usb/core/hcd.c    | 7 ++++++-
>>  drivers/usb/core/message.c | 4 ++++
>>  drivers/usb/core/usb.c    | 9 ++++++---
>>  3 files changed, 16 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
>> index 8300baedafd2.. a06452cbbaa4 100644
>> --- a/drivers/usb/core/hcd.c
>> +++ b/drivers/usb/core/hcd.c
>> @@ -1931,7 +1931,12 @@ int usb_hcd_alloc_bandwidth(struct usb_device *udev,
>>                 }
>>         }
>>         if (cur_alt && new_alt) {
>> -                struct usb_interface *iface = usb_ifnum_to_if(udev,
>> +                struct usb_interface *iface;
>> +
>> +                if (udev->state == USB_STATE_NOTATTACHED)
>> +                        return -ENODEV;
>
>What will happen if the state changes to USB_STATE_NOTATTACHED at this
>point, after that test was made?


Please suggest if i should add some propr locking mechanism to avoid this.
I will add accordingly.


>> +
>> +                iface = usb_ifnum_to_if(udev,
>>                                 cur_alt->desc.bInterfaceNumber);
>
>>                 if (!iface)
>> diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
>> index b5811620f1de.. f31c7287dc01 100644
>> --- a/drivers/usb/core/message.c
>> +++ b/drivers/usb/core/message.c
>> @@ -1575,7 +1575,11 @@ int usb_set_interface(struct usb_device *dev, int interface, int alternate)
>>         for (i = 0; i < iface->cur_altsetting->desc.bNumEndpoints; i++)
>>                 iface->cur_altsetting->endpoint[i].streams = 0;
>
>> +        if (dev->state == USB_STATE_NOTATTACHED)
>> +                return -ENODEV;
>
>Same question: What happens if the state changes right here?


yes, please suggest more required changes for it.


>> +
>>         ret = usb_hcd_alloc_bandwidth(dev, NULL, iface->cur_altsetting, alt);
>> +
>>         if (ret < 0) {
>>                 dev_info(&dev->dev, "Not enough bandwidth for altsetting %d\n",
>>                                 alternate);
>> diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
>> index 901ec732321c.. 6fb8b14469ae 100644
>> --- a/drivers/usb/core/usb.c
>> +++ b/drivers/usb/core/usb.c
>> @@ -352,10 +352,13 @@ struct usb_interface *usb_ifnum_to_if(const struct usb_device *dev,
>
>>         if (!config)
>>                 return NULL;
>> -        for (i = 0; i < config->desc.bNumInterfaces; i++)
>> -                if (config->interface[i]->altsetting[0]
>> -                                .desc.bInterfaceNumber == ifnum)
>> +        for (i = 0; i < config->desc.bNumInterfaces; i++) {
>> +                if (config->interface[i] &&
>
>This new test can fail only if the routine is called after (or while)
>the device is unconfigured or removed.  But if a driver makes such a
>call then the driver is buggy.  The proper solution is to fix the
>driver, not add this test here.
>
>Besides, this test has the same problem as the others you added above. 
>What happens if config->interface[i] gets set to NULL right here?
>
>Alan Stern
>

All above changes resolve existing issue and at USB core level, it will avoid simillar race conditions for all others.


>> +                                config->interface[i]->altsetting[0]
>> +                                .desc.bInterfaceNumber == ifnum) {
>>                         return config->interface[i];
>> +                }
>> +        }
>
>>         return NULL;
>>  }
>> --
>> 2.34.1
>>