Re: [PATCH] xhci: Do not create endpoint debugfs while holding the bandwidth mutex

From: Ricardo Ribalda
Date: Thu Jun 01 2023 - 12:05:44 EST


Hi Mathias

On Thu, 1 Jun 2023 at 16:13, Mathias Nyman
<mathias.nyman@xxxxxxxxxxxxxxx> wrote:
>
> On 31.5.2023 15.40, Ricardo Ribalda Delgado wrote:
> > xhci_debugfs_create_endpoint needs to take the mm->mmap_sem, which is
> > not serialized with the hcd->bandwidth_mutex across the codebase.
> >
> > Without this patch a deadlock has been observed with the uvc driver at
> > the functions v4l2_mmap() and usb_set_interface().
> >
> > Cc: Stephen Boyd <swboyd@xxxxxxxxxxxx
> > Fixes: 167657a1bb5f ("xhci: don't create endpoint debugfs entry before ring buffer is set.")
> > Signed-off-by: Ricardo Ribalda Delgado <ribalda@xxxxxxxxxxxx>
> > ---
> > I do not have a proper reproducer for this and I am not used to this
> > subsystem, so please take a careful look at this patch :).
> >
> > Thanks!
>
> Do you still have the lockdep output showing the deadlock?

[ 459.731142] ======================================================
[ 459.731150] WARNING: possible circular locking dependency detected
[ 459.731161] 5.4.169-lockdep-17434-g505c8a10e6fe #1 Not tainted
[ 459.731168] ------------------------------------------------------
[ 459.731176] syz-executor.3/15308 is trying to acquire lock:
[ 459.731184] ffffff80c63e0ee0 (&queue->mutex){+.+.}, at:
uvc_queue_mmap+0x30/0xa0 [uvcvideo]
[ 459.731226]
but task is already holding lock:
[ 459.731232] ffffff80a748eea8 (&mm->mmap_sem){++++}, at:
vm_mmap_pgoff+0x10c/0x1f4
[ 459.731255]
which lock already depends on the new lock.

[ 459.731262]
the existing dependency chain (in reverse order) is:
[ 459.731269]
-> #3 (&mm->mmap_sem){++++}:
[ 459.731286] __might_fault+0xec/0x150
[ 459.731298] filldir64+0x2e0/0x15dc
[ 459.731310] dcache_readdir+0x134/0x660
[ 459.731320] iterate_dir+0x200/0x40c
[ 459.731331] ksys_getdents64+0x218/0x78c
[ 459.731342] __arm64_sys_getdents64+0x7c/0x90
[ 459.731353] el0_svc_common+0x1c0/0x3dc
[ 459.731363] el0_svc_compat_handler+0x88/0xd4
[ 459.731373] el0_svc_compat+0x8/0x2c
[ 459.731379]
-> #2 (&sb->s_type->i_mutex_key#4){++++}:
[ 459.731398] down_write+0x60/0x118
[ 459.731409] start_creating+0xf8/0x260
[ 459.731419] debugfs_create_dir+0x30/0x290
[ 459.731430] xhci_debugfs_create_endpoint+0x118/0x1c8
[ 459.731442] xhci_check_bandwidth+0x520/0x6c0
[ 459.731453] usb_hcd_alloc_bandwidth+0x800/0x900
[ 459.731464] usb_set_configuration+0x4c4/0x1258
[ 459.731475] usb_generic_driver_probe+0x80/0x140
[ 459.731486] usb_probe_device+0xc8/0x240
[ 459.731497] really_probe+0x2f0/0x9e4
[ 459.731507] driver_probe_device+0xf0/0x2e4
[ 459.731517] __device_attach_driver+0x180/0x254
[ 459.731528] bus_for_each_drv+0x114/0x184
[ 459.731537] __device_attach+0x224/0x398
[ 459.731547] device_initial_probe+0x24/0x30
[ 459.731557] bus_probe_device+0xa8/0x1b8
[ 459.731567] device_add+0x63c/0x940
[ 459.731577] usb_new_device+0x810/0xe68
[ 459.731587] hub_port_connect+0xaf0/0x16bc
[ 459.731597] hub_port_connect_change+0x32c/0x5c0
[ 459.731606] port_event+0x9e8/0xe4c
[ 459.731616] hub_event+0x3fc/0xaec
[ 459.731628] process_one_work+0x604/0xa2c
[ 459.731638] worker_thread+0x930/0xea8
[ 459.731648] kthread+0x2dc/0x350
[ 459.731658] ret_from_fork+0x10/0x18
[ 459.731664]
-> #1 (hcd->bandwidth_mutex){+.+.}:
[ 459.731680] __mutex_lock_common+0x140/0x18c8
[ 459.731690] mutex_lock_nested+0x48/0x58
[ 459.731701] usb_set_interface+0x108/0x778
[ 459.731724] uvc_video_start_transfer+0x824/0x12a4 [uvcvideo]
[ 459.731746] uvc_video_start_streaming+0x148/0x2c8 [uvcvideo]
[ 459.731767] uvc_start_streaming+0x108/0x214 [uvcvideo]
[ 459.731779] vb2_start_streaming+0x110/0x3f0
[ 459.731789] vb2_core_streamon+0x234/0x340
[ 459.731799] vb2_streamon+0x80/0xac
[ 459.731819] uvc_queue_streamon+0x3c/0x5c [uvcvideo]
[ 459.731840] uvc_ioctl_streamon+0xd0/0x118 [uvcvideo]
[ 459.731850] v4l_streamon+0x6c/0x9c
[ 459.731860] __video_do_ioctl+0x940/0xaa8
[ 459.731870] video_usercopy+0x528/0x920
[ 459.731880] video_ioctl2+0x3c/0x4c
[ 459.731889] v4l2_ioctl+0x120/0x158
[ 459.731900] do_video_ioctl+0xdec/0x1784
[ 459.731910] v4l2_compat_ioctl32+0xc0/0x198
[ 459.731921] __arm64_compat_sys_ioctl+0x314/0x778
[ 459.731931] el0_svc_common+0x1c0/0x3dc
[ 459.731941] el0_svc_compat_handler+0x88/0xd4
[ 459.731950] el0_svc_compat+0x8/0x2c
[ 459.731957]
-> #0 (&queue->mutex){+.+.}:
[ 459.731974] __lock_acquire+0x1b74/0x4f04
[ 459.731985] lock_acquire+0xd0/0x168
[ 459.731995] __mutex_lock_common+0x140/0x18c8
[ 459.732004] mutex_lock_nested+0x48/0x58
[ 459.732024] uvc_queue_mmap+0x30/0xa0 [uvcvideo]
[ 459.732045] uvc_v4l2_mmap+0xa4/0x138 [uvcvideo]
[ 459.732054] v4l2_mmap+0x114/0x1f8
[ 459.732065] mmap_region+0x8b8/0xc90
[ 459.732075] do_mmap+0x654/0xaec
[ 459.732084] vm_mmap_pgoff+0x15c/0x1f4
[ 459.732094] ksys_mmap_pgoff+0x124/0x194
[ 459.732105] __arm64_compat_sys_aarch32_mmap2+0xd8/0xf0
[ 459.732114] el0_svc_common+0x1c0/0x3dc
[ 459.732124] el0_svc_compat_handler+0x88/0xd4
[ 459.732134] el0_svc_compat+0x8/0x2c
[ 459.732141]
other info that might help us debug this:

[ 459.732148] Chain exists of:
&queue->mutex --> &sb->s_type->i_mutex_key#4 --> &mm->mmap_sem

[ 459.732165] Possible unsafe locking scenario:

[ 459.732172] CPU0 CPU1
[ 459.732178] ---- ----
[ 459.732184] lock(&mm->mmap_sem);
[ 459.732193] lock(&sb->s_type->i_mutex_key#4);
[ 459.732204] lock(&mm->mmap_sem);
[ 459.732212] lock(&queue->mutex);
[ 459.732221]
*** DEADLOCK ***

[ 459.732230] 1 lock held by syz-executor.3/15308:
[ 459.732237] #0: ffffff80a748eea8 (&mm->mmap_sem){++++}, at:
vm_mmap_pgoff+0x10c/0x1f4
[ 459.732256]
stack backtrace:
[ 459.732269] CPU: 6 PID: 15308 Comm: syz-executor.3 Not tainted
5.4.169-lockdep-17434-g505c8a10e6fe #1
[ 459.732277] Hardware name: Google Pazquel (TI,LTE) (DT)
[ 459.732284] Call trace:
[ 459.732294] dump_backtrace+0x0/0x2ec
[ 459.732304] show_stack+0x24/0x30
[ 459.732315] dump_stack+0x148/0x21c
[ 459.732324] print_circular_bug+0x18c/0x1b8
[ 459.732334] check_noncircular+0x2e4/0x3c4
[ 459.732344] __lock_acquire+0x1b74/0x4f04
[ 459.732355] lock_acquire+0xd0/0x168
[ 459.732364] __mutex_lock_common+0x140/0x18c8
[ 459.732374] mutex_lock_nested+0x48/0x58
[ 459.732395] uvc_queue_mmap+0x30/0xa0 [uvcvideo]
[ 459.732415] uvc_v4l2_mmap+0xa4/0x138 [uvcvideo]
[ 459.732425] v4l2_mmap+0x114/0x1f8
[ 459.732435] mmap_region+0x8b8/0xc90
[ 459.732444] do_mmap+0x654/0xaec
[ 459.732454] vm_mmap_pgoff+0x15c/0x1f4
[ 459.732463] ksys_mmap_pgoff+0x124/0x194
[ 459.732474] __arm64_compat_sys_aarch32_mmap2+0xd8/0xf0
[ 459.732483] el0_svc_common+0x1c0/0x3dc
[ 459.732493] el0_svc_compat_handler+0x88/0xd4
[ 459.732502] el0_svc_compat+0x8/0x2c


>
> I'm not sure how calling xhci_debugfs_create_endpoint() from
> xhci_add_endpoint() instead of xhci_check_bandwidth() helps.
>
> Both are called with hcd->bandwidth_mutex held:
>
> usb_set_interface()
> mutex_lock(hcd->bandwidth_mutex);
> usb_hcd_alloc_bandwidth()
> hcd->driver->add_endpoint() -> xhci_add_endpoint()
> hcd->driver->check_bandwidth() -> xhci_check_bandwidth()
> mutex_unlock(hcd->bandwidth_mutex);

Yep, I guess I was lucky not to be able to repro again :)

The locks involved are:

hcd->bandwidth_mutex
mm->mmap_sem
[uvc] queue->mutex



>
> Thanks
> Mathias
>


--
Ricardo Ribalda