RE: [PATCH] usb: gadget: uvc_video: unlock before submitting a request to ep

From: Pandey, Radhey Shyam
Date: Wed Jan 10 2024 - 04:14:58 EST


> -----Original Message-----
> From: Thinh Nguyen <Thinh.Nguyen@xxxxxxxxxxxx>
> Sent: Friday, November 17, 2023 8:59 AM
> To: Dan Scally <dan.scally@xxxxxxxxxxxxxxxx>
> Cc: Kuen-Han Tsai <khtsai@xxxxxxxxxx>; gregkh@xxxxxxxxxxxxxxxxxxx;
> laurent.pinchart@xxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> linux-usb@xxxxxxxxxxxxxxx; Simek, Michal <michal.simek@xxxxxxx>; Mehta,
> Piyush <piyush.mehta@xxxxxxx>; Pandey, Radhey Shyam
> <radhey.shyam.pandey@xxxxxxx>; Paladugu, Siva Durga Prasad
> <siva.durga.prasad.paladugu@xxxxxxx>
> Subject: Re: [PATCH] usb: gadget: uvc_video: unlock before submitting a
> request to ep
>
> On Fri, Nov 17, 2023, Thinh Nguyen wrote:
> > Hi,
> >
> > On Thu, Nov 16, 2023, Dan Scally wrote:
> > > > CC Thinh - sorry to bother you, just want to make sure we fix this in
> > > > the right place.
> > >
> > > On 08/11/2023 11:48, Kuen-Han Tsai wrote:
> > > > On 02/11/2023 07:11, Piyush Mehta wrote:
> > > > > > There could be chances where usb_ep_queue() fails and triggers the
> > > > > > complete() handler with an error status. In this case, if
> > > > > > usb_ep_queue() is called with the lock held and the triggered
> > > > > > complete() handler is waiting for the same lock to be released,
> > > > > > this could result in a deadlock and a system hang. To avoid this
> > > > > > scenario, call usb_ep_queue() with the lock released. This patch
> > > > > > does the same.
> > > > I would like to provide more background information on this problem.
> > > >
> > > > > We hit a deadlock issue on Android devices, and the following is the
> > > > > stack trace.
> > > >
> > > > > [35845.978435][T18021] Core - Debugging Information for Hardlockup core(8) - locked CPUs mask (0x100)
> > > > > [35845.978442][T18021] Call trace:
> > > > [*][T18021] queued_spin_lock_slowpath+0x84/0x388
> > > > [35845.978451][T18021] uvc_video_complete+0x180/0x24c
> > > > [35845.978458][T18021] usb_gadget_giveback_request+0x38/0x14c
> > > > [35845.978464][T18021] dwc3_gadget_giveback+0xe4/0x218
> > > > > [35845.978469][T18021] dwc3_gadget_ep_cleanup_cancelled_requests+0xc8/0x108
> > > > [35845.978474][T18021] __dwc3_gadget_kick_transfer+0x34c/0x368
> > > > [35845.978479][T18021] __dwc3_gadget_start_isoc+0x13c/0x3b8
> > > > [35845.978483][T18021] dwc3_gadget_ep_queue+0x150/0x2f0
> > > > [35845.978488][T18021] usb_ep_queue+0x58/0x16c
> > > > [35845.978493][T18021] uvcg_video_pump+0x22c/0x518
> > >
> > >
> > > I note in the kerneldoc comment for usb_ep_queue() that calling
> > > .complete() from within itself is specifically disallowed [1]:
> > >
> > > >     Note that @req's ->complete() callback must never be called from
> > > >     within usb_ep_queue() as that can create deadlock situations.
> > >
> > >
> > > And it looks like that's what's happening here - is this something
> > > that needs addressing in the dwc3 driver?
> > >
> >
> > Looks like it. The issue is in dwc3. It should only affect isoc
> > request queuing.
> >
> > Can we try with this patch:
> >
> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > index 858fe4c299b7..37e08eed49d9 100644
> > --- a/drivers/usb/dwc3/gadget.c
> > +++ b/drivers/usb/dwc3/gadget.c
> > @@ -1684,12 +1684,15 @@ static int __dwc3_gadget_kick_transfer(struct dwc3_ep *dep)
> > dwc3_gadget_move_cancelled_request(req,
> > DWC3_REQUEST_STATUS_DEQUEUED);
> >
> > /* If ep isn't started, then there's no end transfer pending */
> > - if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING))
> > + if (!(dep->flags & DWC3_EP_PENDING_REQUEST) &&
> > + !(dep->flags & DWC3_EP_END_TRANSFER_PENDING))
> > dwc3_gadget_ep_cleanup_cancelled_requests(dep);
> >
> > return ret;
> > }
> >
> > + dep->flags &= ~DWC3_EP_PENDING_REQUEST;
> > +
> > if (dep->stream_capable && req->request.is_last &&
> > !DWC3_MST_CAPABLE(&dep->dwc->hwparams))
> > dep->flags |= DWC3_EP_WAIT_TRANSFER_COMPLETE;
> >
> > ---
> >
>
> Actually, please ignore the above, that's not correct. I'll send out a proper
> patch later.

Thanks, Thinh. I came across this thread and wanted to check whether you
have a fix ready.
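
For anyone else landing on this thread, the locking problem discussed above
can be illustrated in miniature in userspace. This is only a rough sketch
under stated assumptions, not kernel code and not the fix: struct fake_video,
fake_video_init(), fake_complete(), fake_ep_queue(), pump_locked() and
pump_unlocked() are hypothetical names made up for the illustration, a pthread
mutex stands in for the spinlock, and fake_ep_queue() always fails and gives
the request back synchronously (the behaviour the usb_ep_queue() kerneldoc
forbids). pthread_mutex_trylock() is used in the completion handler so the
sketch records the would-be deadlock instead of hanging.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the video state discussed in this thread. */
struct fake_video {
	pthread_mutex_t req_lock;	/* plays the role of the request lock */
	int completed;			/* completions handled successfully */
	bool would_deadlock;		/* set when complete() finds the lock held */
};

static void fake_video_init(struct fake_video *v)
{
	int err = pthread_mutex_init(&v->req_lock, NULL);

	assert(err == 0);
	(void)err;
	v->completed = 0;
	v->would_deadlock = false;
}

/* Completion handler: needs req_lock, as uvc_video_complete() does. */
static void fake_complete(struct fake_video *v)
{
	/* A real handler would block here; trylock only reports the problem. */
	if (pthread_mutex_trylock(&v->req_lock) != 0) {
		v->would_deadlock = true;
		return;
	}
	v->completed++;
	pthread_mutex_unlock(&v->req_lock);
}

/* Assumed misbehaving queue call: fails and completes from within itself. */
static int fake_ep_queue(struct fake_video *v)
{
	fake_complete(v);
	return -ECANCELED;
}

/* Queue with the lock held: the synchronous completion hits the held lock. */
static int pump_locked(struct fake_video *v)
{
	int ret;

	pthread_mutex_lock(&v->req_lock);
	ret = fake_ep_queue(v);
	pthread_mutex_unlock(&v->req_lock);
	return ret;
}

/* Drop the lock before queuing: the completion can take the lock. */
static int pump_unlocked(struct fake_video *v)
{
	pthread_mutex_lock(&v->req_lock);
	/* ... pick the next request under the lock ... */
	pthread_mutex_unlock(&v->req_lock);
	return fake_ep_queue(v);
}
```

Calling pump_locked() on a freshly initialised struct fake_video sets
would_deadlock, while pump_unlocked() lets the completion run. That only shows
why unlocking in the caller hides the symptom; per the discussion above, the
synchronous giveback from the queue path is the part that looks wrong.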