Re: [PATCH] usb: dwc3: Disable AutoRetry controller feature for dwc_usb3 v2.00a

From: Jakub Vaněk
Date: Wed Jul 12 2023 - 18:52:00 EST


Hi Thinh,

On Tue, 2023-07-11 at 21:48 +0000, Thinh Nguyen wrote:
> Hi Jakub,
>
> On Thu, Jul 06, 2023, Jakub Vaněk wrote:
> > On Wed, 2023-07-05 at 22:47 +0000, Thinh Nguyen wrote:
> > > Hi,
> > >
> > > On Sat, Jul 01, 2023, Jakub Vanek wrote:
> > > > AutoRetry has been found to cause issues in this controller
> > > > revision.
> > > > This feature allows the controller to send non-
> > > > terminating/burst retry
> > > > ACKs (Retry=1 and Nump!=0) as opposed to terminating retry ACKs
> > > > (Retry=1 and Nump=0) to devices when a transaction error
> > > > occurs.
> > > >
> > > > Unfortunately, some USB devices do not retry transfers when
> > > > the controller sends them a non-terminating retry ACK. After
> > > > the transfer
> > > > times out, the xHCI driver tries to resume normal operation of
> > > > the
> > > > controller by sending a Stop Endpoint command to it. However,
> > > > this
> > > > revision of the controller fails to respond to that command in
> > > > this
> > > > state and the xHCI driver therefore gives up. This is reported
> > > > via dmesg:
> > > >
> > > > [sda] tag#29 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN
> > > > [sda] tag#29 CDB: opcode=0x28 28 00 00 69 42 80 00 00 48 00
> > > > xhci-hcd: xHCI host not responding to stop endpoint command
> > > > xhci-hcd: xHCI host controller not responding, assume dead
> > > > xhci-hcd: HC died; cleaning up
> > > >
> > > > This problem has been observed on Odroid HC2. This board has
> > > > an integrated JMS578 USB3-to-SATA bridge. The above problem
> > > > could be
> > > > triggered by starting a read-heavy workload on an attached SSD.
> > > > After a while, the host controller would die and the SSD would
> > > > disappear
> > > > from the system.
> > > >
> > > > Reverting b138e23d3dff ("usb: dwc3: core: Enable AutoRetry
> > > > feature in
> > > > the controller") stopped the issue from occurring on Odroid
> > > > HC2.
> > > > As this problem hasn't been reported on other devices, disable
> > > > AutoRetry just for the dwc_usb3 revision v2.00a that the board
> > > > SoC
> > > > (Exynos 5422) uses.
> > > >
> > > > Fixes: b138e23d3dff ("usb: dwc3: core: Enable AutoRetry feature
> > > > in the controller")
> > > > Link:
> > > > https://urldefense.com/v3/__https://lore.kernel.org/r/a21f34c04632d250cd0a78c7c6f4a1c9c7a43142.camel@xxxxxxxxx/__;!!A4F2R9G_pg!YWgF6oqfuVY6xstZmroukjlrwAFEYEQE8uj_iUu4fd10mnJxEPG345k75dMLLdNFP8rT1leok-aPNkz_FuAJ1zxnmw$
> > > >  
> > > > Link:
> > > > https://urldefense.com/v3/__https://forum.odroid.com/viewtopic.php?t=42630__;!!A4F2R9G_pg!YWgF6oqfuVY6xstZmroukjlrwAFEYEQE8uj_iUu4fd10mnJxEPG345k75dMLLdNFP8rT1leok-aPNkz_FuCzOGIVPA$
> > > >  
> > > > Cc: stable@xxxxxxxxxxxxxxx
> > > > Cc: Mauro Ribeiro <mauro.ribeiro@xxxxxxxxxxxxxx>
> > > > Cc: Krzysztof Kozlowski <krzysztof.kozlowski@xxxxxxxxxx>
> > > > Suggested-by: Thinh Nguyen <Thinh.Nguyen@xxxxxxxxxxxx>
> > > > Signed-off-by: Jakub Vanek <linuxtardis@xxxxxxxxx>
> > > > ---
> > > >  drivers/usb/dwc3/core.c | 10 ++++++++--
> > > >  1 file changed, 8 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
> > > > index d68958e151a7..d742fc882d7e 100644
> > > > --- a/drivers/usb/dwc3/core.c
> > > > +++ b/drivers/usb/dwc3/core.c
> > > > @@ -1218,9 +1218,15 @@ static int dwc3_core_init(struct dwc3
> > > > *dwc)
> > > >                  * Host mode on seeing transaction errors(CRC
> > > > errors or internal
> > > >                  * overrun scenerios) on IN transfers to reply
> > > > to the device
> > > >                  * with a non-terminating retry ACK (i.e, an
> > > > ACK transcation
> > > > -                * packet with Retry=1 & Nump != 0)
> > > > +                * packet with Retry=1 & Nump != 0).
> > > > +                *
> > > > +                * Do not enable this for DWC_usb3 v2.00a. This
> > > > controller
> > > > +                * revision stops responding to Stop Endpoint
> > > > commands if
> > > > +                * a USB device does not retry after a non-
> > > > terminating retry
> > > > +                * ACK is sent to it.
> > > >                  */
> > > > -               reg |= DWC3_GUCTL_HSTINAUTORETRY;
> > > > +               if (!DWC3_VER_IS(DWC3, 200A))
> > > > +                       reg |= DWC3_GUCTL_HSTINAUTORETRY;
> > > >  
> > > >                 dwc3_writel(dwc->regs, DWC3_GUCTL, reg);
> > > >         }
> > > > --
> > > > 2.25.1
> > > >
> > >
> > > I brought up this inquiry internally. Please wait as I need to
> > > review
> > > this further. The handling for this may be different depending on
> > > the
> > > team's feedback.
> > >
> >
> > OK; feel free to contact me if I could be of any help.
> >
>
> Sorry for the delay. After discussion, this behavior is the same
> across
> different controller versions. The consensus is to disable this
> feature
> altoghether. There will not be any noticeable performance impact.
>
> Would you mind revise the patch to disable this feature for all?

Not at all, I have sent a new revision of the patch:
https://lore.kernel.org/linux-usb/20230712224037.24948-1-linuxtardis@xxxxxxxxx/

Reverting the commit altogether seemed preferable to me to updating the
revision-specific patch. However, if the other way would be better, I
can send an updated revision.

> Many thanks,
> Thinh

Thank you,
Jakub