Re: [PATCH v2 0/3] uvc gadget performance issues

From: Michael Grzeschik
Date: Tue Oct 11 2022 - 15:48:29 EST


Hi Dan!

Thanks for the patches.

On Tue, Oct 11, 2022 at 01:34:32PM -0500, Dan Vacura wrote:
Hello uvc gadget developers,

Please find my V2 series with added patches to disable these performance
features at the userspace level for devices that don't work well with
the UDC hw, i.e. dwc3 in this case. Also included are updates to
comments for the v1 patch.

Original note:

I'm working on a 5.15.41 based kernel on a qcom chipset with the dwc3
controller and I'm encountering two problems related to the recent performance
improvement changes:

https://patchwork.kernel.org/project/linux-usb/patch/20210628155311.16762-5-m.grzeschik@xxxxxxxxxxxxxx/ and
https://patchwork.kernel.org/project/linux-usb/patch/20210628155311.16762-6-m.grzeschik@xxxxxxxxxxxxxx/

If I revert these two changes, then I have much improved stability and a
transmission problem I'm seeing is gone. Has there been any success from
others on 5.15 with this uvc improvement and any recommendations for my
current problems? Those being:

1) a smmu panic, snippet here: 

<3>[  718.314900][  T803] arm-smmu 15000000.apps-smmu: Unhandled arm-smmu context fault from a600000.dwc3!
<3>[  718.314994][  T803] arm-smmu 15000000.apps-smmu: FAR    = 0x00000000efe60800
<3>[  718.315023][  T803] arm-smmu 15000000.apps-smmu: PAR    = 0x0000000000000000
<3>[  718.315048][  T803] arm-smmu 15000000.apps-smmu: FSR    = 0x40000402 [TF R SS ]
<3>[  718.315074][  T803] arm-smmu 15000000.apps-smmu: FSYNR0    = 0x5f0003
<3>[  718.315096][  T803] arm-smmu 15000000.apps-smmu: FSYNR1    = 0xaa02
<3>[  718.315117][  T803] arm-smmu 15000000.apps-smmu: context bank#    = 0x1b
<3>[  718.315141][  T803] arm-smmu 15000000.apps-smmu: TTBR0  = 0x001b0000c2a92000
<3>[  718.315165][  T803] arm-smmu 15000000.apps-smmu: TTBR1  = 0x001b000000000000
<3>[  718.315192][  T803] arm-smmu 15000000.apps-smmu: SCTLR  = 0x0a5f00e7 ACTLR  = 0x00000003
<3>[  718.315245][  T803] arm-smmu 15000000.apps-smmu: CBAR  = 0x0001f300
<3>[  718.315274][  T803] arm-smmu 15000000.apps-smmu: MAIR0   = 0xf404ff44 MAIR1   = 0x0000efe4
<3>[  718.315297][  T803] arm-smmu 15000000.apps-smmu: SID = 0x40
<3>[  718.315318][  T803] arm-smmu 15000000.apps-smmu: Client info: BID=0x5, PID=0xa, MID=0x2
<3>[  718.315377][  T803] arm-smmu 15000000.apps-smmu: soft iova-to-phys=0x0000000000000000

I can reduce this panic with the proposed patch, but it still happens until I
disable the "req->no_interrupt = 1" logic.

This actually smells very much like a race between hardware and
software, with both probably working on the same memory. My guess is
that in the no-interrupt case the hardware is still processing queued
memory while the software stack is already updating that same memory
with new data.
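
To make the suspected window concrete, here is a minimal model of
interrupt moderation. It is not the uvc gadget code: the function names,
the "interval" parameter and the rule that the last request of a frame
always interrupts are assumptions made for illustration only. Requests
for which wants_interrupt() returns false correspond to
"req->no_interrupt = 1", and those are the requests whose memory must
not be refilled before a later completion has been seen.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model, not kernel code: decide whether the request with
 * 0-based position `index` should raise a completion interrupt when only
 * every `interval`-th request is allowed to. The last request of a frame
 * always interrupts so that the frame can be completed.
 */
static bool wants_interrupt(unsigned int index, unsigned int interval,
			    bool last_of_frame)
{
	assert(interval > 0);

	if (last_of_frame)
		return true;

	return (index % interval) == interval - 1;
}

/* Count how many of `nreq` requests in one frame raise an interrupt. */
static unsigned int count_interrupts(unsigned int nreq, unsigned int interval)
{
	unsigned int i, n = 0;

	for (i = 0; i < nreq; i++)
		if (wants_interrupt(i, interval, i == nreq - 1))
			n++;

	return n;
}
```

With an interval of 4, up to three consecutive requests in this model
are queued without any completion notification of their own.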

In my opinion this should be fixed, rather than making the interrupt
load optional. We could also discuss whether an option to adjust the
interrupt load adds some extra value, but that is outside the scope of
the issue you describe here.

Also, is this issue more likely to happen when streaming YUYV?

2) The frame is not fully transmitted in dwc3 with sg support enabled.

There seems to be a mapping limit I'm seeing where only roughly the first
70% of the total frame is sent. Interestingly, if I allocate a larger
size for the buffer upfront, in uvc_queue_setup(), like sizes[0] =
video->imagesize * 3, then the issue rarely happens. For example, when I
do YUYV I see green, uninitialized data, at the bottom part of the
frame. If I do MJPG with smaller filled sizes, the transmission is fine.

+-------------------------+
|                         |
|                         |
|                         |
|        Good data        |
|                         |
|                         |
|                         |
+-------------------------+
|xxxxxxxxxxxxxxxxxxxxxxxxx|
|xxxx  Bad data  xxxxxxxxx|
|xxxxxxxxxxxxxxxxxxxxxxxxx|
+-------------------------+


I did not stream with YUYV for some time. I will do that and try to
reproduce the issues you describe.

I also have a patch in the queue that will limit sg support to
devices with speed > HIGH_SPEED. Because of the overhead of the limited
transfer payload of 1024*3 bytes, it is possible that a simple memcpy
will actually be fast enough. But for that patch I still have to make
proper measurements. Btw., which USB speed are you transferring with?
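
For a sense of scale, a small sketch of the arithmetic behind the
1024*3 byte payload: how many requests one uncompressed frame needs.
YUYV uses 2 bytes per pixel. The 1280x720 resolution in the usage note
is an assumption (no resolution is given in this thread), per-request
header bytes are ignored, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Payload per request as mentioned above: 1024 * 3 bytes. */
#define PAYLOAD_BYTES ((size_t)1024 * 3)

/* YUYV packs two pixels into four bytes, i.e. 2 bytes per pixel. */
static size_t yuyv_frame_bytes(size_t width, size_t height)
{
	return width * height * 2;
}

/*
 * Requests needed to carry `frame_bytes`, rounding up for a partially
 * filled last request. Header overhead is deliberately ignored.
 */
static size_t requests_per_frame(size_t frame_bytes, size_t payload_bytes)
{
	assert(payload_bytes > 0);

	return (frame_bytes + payload_bytes - 1) / payload_bytes;
}
```

Under these assumptions a 1280x720 YUYV frame is 1843200 bytes and
requests_per_frame(1843200, PAYLOAD_BYTES) gives 600 requests, so every
frame means several hundred separate requests to fill and queue.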

Regards,
Michael

--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
