Re: [RFC] usb: host: xhci: Remove the watchdog timer and use command timer to watch stop endpoint command

From: Lu Baolu
Date: Thu Dec 01 2016 - 00:45:29 EST


Hi,

On 11/30/2016 05:02 PM, Baolin Wang wrote:
> If the hardware never responds to the stop endpoint command, the
> URBs will never be completed, and we might hang the USB subsystem.
> The original watchdog timer is used to watch if one stop endpoint
> command is timeout, if timeout, then the watchdog timer will set
> XHCI_STATE_DYING, try to halt the xHCI host, and give back all
> pending URBs.
>
> But now we already have one command timer to control command timeout,
> thus we can also use the command timer to watch the stop endpoint
> command, instead of one duplicate watchdog timer which need to be
> removed.
>
> Meanwhile we don't need the 'stop_cmds_pending' flag to identy if
> this is the last stop endpoint command of one endpoint. Since we
> can make sure we only set one stop endpoint command for one endpoint
> by 'EP_HALT_PENDING' flag in xhci_urb_dequeue() function. Thus remove
> this flag.

I am afraid you can't do this. "stop_cmds_pending" was added
to fix the problem described in the comments that you want to
remove. But I didn't find any fix of this problem in your patch.

- * The timer may also fire if the host takes a very long time to respond to the
- * command, and the stop endpoint command completion handler cannot delete the
- * timer before the timer function is called. Another endpoint cancellation may
- * sneak in before the timer function can grab the lock, and that may queue
- * another stop endpoint command and add the timer back. So we cannot use a
- * simple flag to say whether there is a pending stop endpoint command for a
- * particular endpoint.
- *
- * Instead we use a combination of that flag and a counter for the number of
- * pending stop endpoint commands. If the timer is the tail end of the last
- * stop endpoint command, and the endpoint's command is still pending, we assume
- * the host is dying.

Best regards,
Lu Baolu

>
> We also need to clean up the command queue before trying to halt the
> xHCI host in xhci_stop_endpoint_command_timeout() function.
>
> Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxx>