Re: [PATCH v2 1/2] usb: dwc3: host: Set XHCI_SG_TRB_CACHE_SIZE_QUIRK

From: Prashanth K
Date: Tue Dec 26 2023 - 00:25:16 EST




On 22-12-23 11:40 am, Greg Kroah-Hartman wrote:
On Fri, Dec 22, 2023 at 11:29:01AM +0530, Prashanth K wrote:
On 15-12-23 06:12 pm, Greg Kroah-Hartman wrote:
On Tue, Dec 12, 2023 at 04:55:20PM +0530, Prashanth K wrote:
Upstream commit bac1ec551434 ("usb: xhci: Set quirk for
XHCI_SG_TRB_CACHE_SIZE_QUIRK") introduced a new quirk in XHCI
that fixes an XHC timeout seen on Synopsys XHCs while using
SG buffers. But support for this quirk isn't present in the
DWC3 layer.

We will encounter this XHCI timeout/hang if we run iperf
loopback tests using an RTL8156 Ethernet adapter on DWC3 targets
with scatter-gather enabled. Enabling XHCI_SG_TRB_CACHE_SIZE_QUIRK
resolves it. This patch enables the quirk through an xhci device
property, since it is needed for the DWC3 controller.

In Synopsys DWC3 databook,
Table 9-3: xHCI Debug Capability Limitations
Chained TRBs greater than TRB cache size: The debug capability
driver must not create a multi-TRB TD that describes smaller
than a 1K packet that spreads across 8 or more TRBs on either
the IN TR or the OUT TR.

Cc: <stable@xxxxxxxxxxxxxxx>
Signed-off-by: Prashanth K <quic_prashk@xxxxxxxxxxx>

What commit id does this fix?

This doesn't fix any commit as such, but adds the support for
XHCI_SG_TRB_CACHE_SIZE_QUIRK (which is present in XHCI layer) to DWC3 layer.

So this is a new feature?

How does this fit into the stable kernel rules?

This isn't a new feature. To give some background, upstream commit
bac1ec551434 ("usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK")
added an XHCI quirk that converts SG lists to CMA buffers/URBs when
certain conditions aren't met. But the quirk was never enabled for any
controller, since no issues had been hit at that time. So support for
the quirk has been in the kernel since 5.11, but nothing sets it.

From commit bac1ec551434: "We discovered this issue with devices on
other platforms but have not yet come across any device that triggers
this on Linux. But it could be a real problem now or in the future. All
it takes is N number of small chained TRBs. And other instances of the
Synopsys IP may have smaller values for the TRB_CACHE_SIZE which would
exacerbate the problem."

For more info: https://lore.kernel.org/all/20201208092912.1773650-3-mathias.nyman@xxxxxxxxxxxxxxx/
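
To make the "certain conditions" above more concrete, here is a minimal
userspace sketch of the kind of check involved: it looks for a run of
consecutive SG entries that together carry less than one max-size packet.
This is a simplified model and not the code from the xhci driver. The
function name, the parameters, and the plain array of lengths are all
assumptions made for illustration, and the window size is left to the
caller rather than taken from any real TRB cache size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model only, hypothetical names throughout.
 *
 * Returns true when some run of `window` consecutive SG entries adds up
 * to fewer than `max_pkt` bytes, i.e. when a single packet would have to
 * be gathered from more entries than the (assumed) cache can hold.
 */
static bool sg_needs_bounce_buffer(const unsigned int *sg_len, size_t num_sgs,
				   size_t window, unsigned int max_pkt)
{
	unsigned int sum = 0;
	size_t i;

	assert(window >= 1);

	for (i = 0; i < num_sgs; i++) {
		sum += sg_len[i];
		if (i + 1 < window)
			continue;	/* first full window not reached yet */
		if (sum < max_pkt)
			return true;	/* window too small for one packet */
		sum -= sg_len[i + 1 - window];	/* slide the window forward */
	}
	return false;
}
```

With a window of 8 and a 1024-byte max packet (the numbers from the
databook excerpt in the commit text), eight 64-byte entries would be
flagged, while eight 128-byte entries would not.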


I have Cc'ed stable so that this gets backported to older kernels
(# 5.11).

Why that specific kernel version and newer? Why not list it as
documented?

I mentioned 5.11 because commit bac1ec551434 ("usb: xhci: Set quirk for
XHCI_SG_TRB_CACHE_SIZE_QUIRK") has been present since 5.11.


---
drivers/usb/dwc3/host.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/usb/dwc3/host.c b/drivers/usb/dwc3/host.c
index 61f57fe5bb78..31a496233d87 100644
--- a/drivers/usb/dwc3/host.c
+++ b/drivers/usb/dwc3/host.c
@@ -89,6 +89,8 @@ int dwc3_host_init(struct dwc3 *dwc)
 	memset(props, 0, sizeof(struct property_entry) * ARRAY_SIZE(props));
+	props[prop_idx++] = PROPERTY_ENTRY_BOOL("xhci-sg-trb-cache-size-quirk");

And this is ok if the entry is not present?

We intend to enable this quirk for all DWC3-based devices, since the
DWC3 XHC needs it.

So you do not have this quirk yet in the kernel tree? We can't take
code without any in-tree users.

This is a two-patch series. Patch 1/2 sets a property from the DWC3
layer, and patch 2/2 enables the XHCI quirk based on that property.

If the entry is not present, we will hit a stall whenever certain
conditions aren't met (the conditions are described in the commit text).

When will the quirk be added? To what platforms?

I guess there is some sort of confusion here, sorry for that.

Earlier, Tejas Joglekar from Synopsys pushed a patch to the XHCI layer
that converts certain SG lists to CMA buffers if some prerequisites
aren't met. This conversion is done only if a quirk bit is set in
xhci->quirks (XHCI_SG_TRB_CACHE_SIZE_QUIRK, bit 39).

- https://lore.kernel.org/all/20201208092912.1773650-2-mathias.nyman@xxxxxxxxxxxxxxx/

- https://lore.kernel.org/all/20201208092912.1773650-3-mathias.nyman@xxxxxxxxxxxxxxx/

But in that series, the only way to enable the quirk was through the
XHCI platform private data:

diff --git a/drivers/usb/host/xhci-plat.c b/drivers/usb/host/xhci-plat.c
index aa2d35f98200..4d34f6005381 100644
--- a/drivers/usb/host/xhci-plat.c
+++ b/drivers/usb/host/xhci-plat.c
@@ -333,6 +333,9 @@ static int xhci_plat_probe(struct platform_device *pdev)
 	if (priv && (priv->quirks & XHCI_SKIP_PHY_INIT))
 		hcd->skip_phy_initialization = 1;

+	if (priv && (priv->quirks & XHCI_SG_TRB_CACHE_SIZE_QUIRK))
+		xhci->quirks |= XHCI_SG_TRB_CACHE_SIZE_QUIRK;
+
 	ret = usb_add_hcd(hcd, irq, IRQF_SHARED);
 	if (ret)
 		goto disable_usb_phy;


This XHCI quirk (XHCI_SG_TRB_CACHE_SIZE_QUIRK) needs to be enabled for
DWC3 controllers, and there are two ways to do that. One is to access the
XHCI private data directly from the DWC3 layer (dwc3/host.c), which is
not a clean approach.

The other way, which this series takes, is to reuse the
device_create_managed_software_node() call that is already present in
dwc3/host.c to add a property to the XHCI node, and then enable
XHCI_SG_TRB_CACHE_SIZE_QUIRK in XHCI based on that property.
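
As a rough illustration of that hand-off, here is a small userspace model
of the pattern: one side publishes a named boolean property, the other
side looks it up by name and turns it into a quirk bit. Everything
prefixed with model_/MODEL_ is hypothetical and exists only for this
sketch. It does not use the kernel's property API and is not the content
of either patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the quirk bit (bit 39); hypothetical name. */
#define MODEL_SG_TRB_CACHE_SIZE_QUIRK	(UINT64_C(1) << 39)

/* A boolean property is modelled as "present by name". */
struct model_prop {
	const char *name;	/* a NULL name terminates the array */
};

/* Consumer-side lookup: is a property with this name present? */
static bool model_prop_present(const struct model_prop *props,
			       const char *name)
{
	assert(props != NULL);

	for (; props->name; props++)
		if (strcmp(props->name, name) == 0)
			return true;
	return false;
}

/* Translate the published property into a quirk bit on the host side. */
static uint64_t model_quirks_from_props(const struct model_prop *props)
{
	uint64_t quirks = 0;

	if (model_prop_present(props, "xhci-sg-trb-cache-size-quirk"))
		quirks |= MODEL_SG_TRB_CACHE_SIZE_QUIRK;
	return quirks;
}
```

The point of the design is that the producer never touches the
consumer's private data. It only publishes a name, and the consumer
decides what that name means.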

Thanks,
Prashanth K