Re: [PATCH] usb: xhci-ring: set all cancelled_td's cancel_status to TD_CLEARING_CACHE

From: wat
Date: Fri Aug 13 2021 - 06:02:56 EST


On 2021-08-13 17:09, Mathias Nyman wrote:
On 13.8.2021 11.44, wat@xxxxxxxxxxxxxx wrote:
On 2021-08-13 15:25, Ikjoon Jang wrote:
Hi,

On Fri, Aug 13, 2021 at 10:44 AM Tao Wang <wat@xxxxxxxxxxxxxx> wrote:

A USB SSD may fail to unmount if it is disconnected during a data transfer.

The unmount gets stuck in usb_kill_urb() because the urb's use_count never
drops to zero, which means the urb is never given back.
xhci_handle_cmd_set_deq() gives back an urb only if its td's cancel_status
equals TD_CLEARING_CACHE,
but xhci_invalidate_cancelled_tds() sets only the last cancelled td's
cancel_status to TD_CLEARING_CACHE,
so only the last urb is given back.

This change sets the cancel_status of every cancelled_td to TD_CLEARING_CACHE,
rather than only the last one, so that all urbs can be given back.

Signed-off-by: Tao Wang <wat@xxxxxxxxxxxxxx>
---
 drivers/usb/host/xhci-ring.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 8fea44b..c7dd7c0 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -960,19 +960,19 @@ static int xhci_invalidate_cancelled_tds(struct xhci_virt_ep *ep)
                        td_to_noop(xhci, ring, td, false);
                        td->cancel_status = TD_CLEARED;
                }
-       }
-       if (cached_td) {
-               cached_td->cancel_status = TD_CLEARING_CACHE;
-
-               err = xhci_move_dequeue_past_td(xhci, slot_id, ep->ep_index,
-                                               cached_td->urb->stream_id,
-                                               cached_td);
-               /* Failed to move past cached td, try just setting it noop */
-               if (err) {
-                       td_to_noop(xhci, ring, cached_td, false);
-                       cached_td->cancel_status = TD_CLEARED;
+               if (cached_td) {
+                       cached_td->cancel_status = TD_CLEARING_CACHE;
+
+                       err = xhci_move_dequeue_past_td(xhci, slot_id, ep->ep_index,
+                                                       cached_td->urb->stream_id,
+                                                       cached_td);
+                       /* Failed to move past cached td, try just setting it noop */
+                       if (err) {
+                               td_to_noop(xhci, ring, cached_td, false);
+                               cached_td->cancel_status = TD_CLEARED;
+                       }
+                       cached_td = NULL;
                }
-               cached_td = NULL;

I think we can call xhci_move_dequeue_past_td() just once, for
the last halted && cancelled TD in a ring.

But that might require comparing two TDs to see which one is
the later one, and I'm not sure how to do this well. :-/

if (!cached_td || cached_td < td)
  cached_td = td;


Thanks. I think you are correct that we can call xhci_move_dequeue_past_td()
just once, for the last halted && cancelled TD in a ring,
but "cached_td->cancel_status = TD_CLEARING_CACHE;" should be set for every
cancelled TD.
I am not very familiar with TDs and rings, so I have a question: why do we
need to compare two TDs to see which one is the later one?

I'm debugging the exact same issue.
For normal endpoints (no streams) it should be enough to set
cancel_td->cancel_status = TD_CLEARING_CACHE
in the TD_DIRTY and TD_HALTED case.

We don't need to move the dq past the last cancelled TD, as the other
cancelled TDs are set to no-op, and
the command to move the dq will flush the xHC's TD cache so that it
reads the no-ops.
(just make sure we call xhci_move_dequeue_past_td() _after_
overwriting the cancelled TDs with no-ops)
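
To make the no-streams case above concrete, here is a rough userspace model, not driver code. Every name in it (model_td, maybe_cached, model_invalidate(), model_set_deq_done()) is made up for illustration, and the enum only mirrors the cancel states mentioned in this thread. It assumes the completion handler gives back exactly those TDs whose status is TD_CLEARING_CACHE, as described in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the driver's cancel states; not the kernel definitions. */
enum model_cancel_status {
	MODEL_TD_DIRTY,
	MODEL_TD_HALTED,
	MODEL_TD_CLEARING_CACHE,
	MODEL_TD_CLEARED,
};

struct model_td {
	enum model_cancel_status cancel_status;
	bool maybe_cached; /* stand-in for the driver's "xHC may have cached it" check */
	bool is_noop;      /* TRBs were overwritten with no-ops */
	bool given_back;   /* urb was returned to its submitter */
};

/*
 * Walk the cancelled TDs. TDs that are not cached are turned into no-ops
 * right away. With mark_all == false only the last cached TD ends up in
 * TD_CLEARING_CACHE (the behaviour the patch describes); with
 * mark_all == true every DIRTY/HALTED cached TD does.
 * Returns the TD to move the dequeue pointer past, or NULL if none.
 */
static struct model_td *model_invalidate(struct model_td *tds, size_t n,
					 bool mark_all)
{
	struct model_td *cached_td = NULL;

	for (size_t i = 0; i < n; i++) {
		struct model_td *td = &tds[i];

		if (!td->maybe_cached) {
			td->is_noop = true;
			td->cancel_status = MODEL_TD_CLEARED;
			continue;
		}
		if (td->cancel_status == MODEL_TD_DIRTY ||
		    td->cancel_status == MODEL_TD_HALTED) {
			if (mark_all)
				td->cancel_status = MODEL_TD_CLEARING_CACHE;
			cached_td = td;
		}
	}
	if (cached_td)
		cached_td->cancel_status = MODEL_TD_CLEARING_CACHE;
	return cached_td;
}

/* Completion of the single "move dq" command: give back the marked TDs. */
static size_t model_set_deq_done(struct model_td *tds, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		if (tds[i].cancel_status == MODEL_TD_CLEARING_CACHE) {
			tds[i].cancel_status = MODEL_TD_CLEARED;
			tds[i].given_back = true;
			count++;
		}
	}
	return count;
}
```

With two cached TDs in the list, model_invalidate(tds, n, false) followed by model_set_deq_done() gives back only one of them, while mark_all == true gives back both. That is the stuck-urb symptom and the proposed marking in miniature.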

Streams get trickier as each endpoint has several rings, and we might
need to move the dq pointer for
many stream rings on that endpoint. This needs more work as we
shouldn't start the endpoint before
all the move dq commands complete, i.e. the current ep->ep_state &=
~SET_DEQ_PENDING isn't enough.
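
One possible shape for the streams case, purely as a sketch: count the outstanding move dq commands and clear the pending flag only when the last one completes. The struct, the pending_set_deq counter and both helpers below are hypothetical; nothing like this is claimed to exist in the driver, and a real version would also need locking and error handling.

```c
#include <stdbool.h>

/* Illustrative flag value; not the driver's SET_DEQ_PENDING definition. */
#define MODEL_SET_DEQ_PENDING (1u << 0)

struct model_ep {
	unsigned int ep_state;
	unsigned int pending_set_deq; /* hypothetical count of queued commands */
};

/* Called once per stream ring that needs its dq pointer moved. */
static void model_ep_queue_set_deq(struct model_ep *ep)
{
	ep->pending_set_deq++;
	ep->ep_state |= MODEL_SET_DEQ_PENDING;
}

/*
 * Called when one move dq command completes.
 * Returns true only when no commands remain, i.e. when it would be
 * safe to restart the endpoint.
 */
static bool model_ep_set_deq_complete(struct model_ep *ep)
{
	if (ep->pending_set_deq > 0)
		ep->pending_set_deq--;
	if (ep->pending_set_deq == 0) {
		ep->ep_state &= ~MODEL_SET_DEQ_PENDING;
		return true;
	}
	return false;
}
```

With three stream rings queued, the first two completions return false and leave the flag set; only the third clears it, so the endpoint is not restarted early.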

-Mathias
OK, thanks. Please let me know if you find a good solution after debugging; I still have a lot to learn from you.