Re: [PATCH] usb: dwc3: Disable AutoRetry controller feature for dwc_usb3 v2.00a

From: Thinh Nguyen
Date: Wed Jul 05 2023 - 18:48:23 EST


Hi,

On Sat, Jul 01, 2023, Jakub Vanek wrote:
> AutoRetry has been found to cause issues in this controller revision.
> This feature allows the controller to send non-terminating/burst retry
> ACKs (Retry=1 and Nump!=0) as opposed to terminating retry ACKs
> (Retry=1 and Nump=0) to devices when a transaction error occurs.
>
> Unfortunately, some USB devices do not retry transfers when
> the controller sends them a non-terminating retry ACK. After the transfer
> times out, the xHCI driver tries to resume normal operation of the
> controller by sending a Stop Endpoint command to it. However, this
> revision of the controller fails to respond to that command in this
> state and the xHCI driver therefore gives up. This is reported via dmesg:
>
> [sda] tag#29 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN
> [sda] tag#29 CDB: opcode=0x28 28 00 00 69 42 80 00 00 48 00
> xhci-hcd: xHCI host not responding to stop endpoint command
> xhci-hcd: xHCI host controller not responding, assume dead
> xhci-hcd: HC died; cleaning up
>
> This problem has been observed on Odroid HC2. This board has
> an integrated JMS578 USB3-to-SATA bridge. The above problem could be
> triggered by starting a read-heavy workload on an attached SSD.
> After a while, the host controller would die and the SSD would disappear
> from the system.
>
> Reverting b138e23d3dff ("usb: dwc3: core: Enable AutoRetry feature in
> the controller") stopped the issue from occurring on Odroid HC2.
> As this problem hasn't been reported on other devices, disable
> AutoRetry just for the dwc_usb3 revision v2.00a that the board SoC
> (Exynos 5422) uses.
>
> Fixes: b138e23d3dff ("usb: dwc3: core: Enable AutoRetry feature in the controller")
> Link: https://urldefense.com/v3/__https://lore.kernel.org/r/a21f34c04632d250cd0a78c7c6f4a1c9c7a43142.camel@xxxxxxxxx/__;!!A4F2R9G_pg!YWgF6oqfuVY6xstZmroukjlrwAFEYEQE8uj_iUu4fd10mnJxEPG345k75dMLLdNFP8rT1leok-aPNkz_FuAJ1zxnmw$
> Link: https://urldefense.com/v3/__https://forum.odroid.com/viewtopic.php?t=42630__;!!A4F2R9G_pg!YWgF6oqfuVY6xstZmroukjlrwAFEYEQE8uj_iUu4fd10mnJxEPG345k75dMLLdNFP8rT1leok-aPNkz_FuCzOGIVPA$
> Cc: stable@xxxxxxxxxxxxxxx
> Cc: Mauro Ribeiro <mauro.ribeiro@xxxxxxxxxxxxxx>
> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@xxxxxxxxxx>
> Suggested-by: Thinh Nguyen <Thinh.Nguyen@xxxxxxxxxxxx>
> Signed-off-by: Jakub Vanek <linuxtardis@xxxxxxxxx>
> ---
> drivers/usb/dwc3/core.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
> index d68958e151a7..d742fc882d7e 100644
> --- a/drivers/usb/dwc3/core.c
> +++ b/drivers/usb/dwc3/core.c
> @@ -1218,9 +1218,15 @@ static int dwc3_core_init(struct dwc3 *dwc)
> * Host mode on seeing transaction errors(CRC errors or internal
> * overrun scenerios) on IN transfers to reply to the device
> * with a non-terminating retry ACK (i.e, an ACK transcation
> - * packet with Retry=1 & Nump != 0)
> + * packet with Retry=1 & Nump != 0).
> + *
> + * Do not enable this for DWC_usb3 v2.00a. This controller
> + * revision stops responding to Stop Endpoint commands if
> + * a USB device does not retry after a non-terminating retry
> + * ACK is sent to it.
> */
> - reg |= DWC3_GUCTL_HSTINAUTORETRY;
> + if (!DWC3_VER_IS(DWC3, 200A))
> + reg |= DWC3_GUCTL_HSTINAUTORETRY;
>
> dwc3_writel(dwc->regs, DWC3_GUCTL, reg);
> }
> --
> 2.25.1
>

I brought up this inquiry internally. Please wait as I need to review
this further. The handling for this may be different depending on the
team's feedback.

Thanks,
Thinh