Re: cxacru usb_bulk_msg() firmware upload 36x slower with OHCI vs. UHCI

From: Simon Arlott
Date: Wed Nov 18 2009 - 15:02:45 EST


On 18/11/09 15:51, Alan Stern wrote:
> On Wed, 18 Nov 2009, Simon Arlott wrote:
>
>> Can anyone explain this speed difference I'm seeing with usb_bulk_msg()?
>>
>> Increasing the size of the data doesn't improve the speed at all. This
>> makes firmware loading take significantly longer with OHCI.
>>
>> I've tested this with two UHCI controllers, two OHCI controllers, and
>> two different versions of the device. Even on the same system with the
>> same device the speed difference occurs between OHCI and UHCI.
>>
>> I've added a debug line to drivers/usb/atm/cxacru.c in cxacru_fw(),
>> which uploads the firmware in PAGE_SIZE chunks:
>> + printk(KERN_INFO "cxacru: sending fw %#x size %#x to %#x", fw, offb, offd);
>> ret = usb_bulk_msg(usb_dev, usb_sndbulkpipe(usb_dev, CXACRU_EP_CMD),
>>
>>
>> With OHCI and PAGE_SIZE chunks (4KB):
>> [2567188.504299] cxacru: sending fw 0x3 size 0x1000 to 0xe00
>> [2567188.760293] cxacru: sending fw 0x3 size 0x1000 to 0x1c00
>> [2567189.016258] cxacru: sending fw 0x3 size 0x1000 to 0x2a00
>> [2567189.272235] cxacru: sending fw 0x3 size 0x1000 to 0x3800
>> [2567189.528210] cxacru: sending fw 0x3 size 0x1000 to 0x4600
>> [2567189.784488] cxacru: sending fw 0x3 size 0x1000 to 0x5400
>> ...
>> [2567233.044134] cxacru: sending fw 0x3 size 0x2c0 to 0x98668
>
> ...
>
>> With OHCI and PAGE_SIZE*8 chunks (32KB):
>> [ 4731.826738] cxacru: sending fw 0x3 size 0x8000 to 0x7000
>> [ 4733.874628] cxacru: sending fw 0x3 size 0x8000 to 0xe000
>> [ 4735.922353] cxacru: sending fw 0x3 size 0x8000 to 0x15000
>> [ 4737.970153] cxacru: sending fw 0x3 size 0x8000 to 0x1c000
>> [ 4740.017937] cxacru: sending fw 0x3 size 0x8000 to 0x23000
>> [ 4742.065814] cxacru: sending fw 0x3 size 0x8000 to 0x2a000
>> ...
>> [ 4774.830569] cxacru: sending fw 0x3 size 0x62c0 to #98668
>
> Each OHCI transfer requires about 256 ms per 4-KB page. There's no
> obvious reason for this to take so long. I'd guess it was caused by a
> hardware problem except that you say the same thing happened with two
> different controllers. Were they both on the same computer?

Yes. I have a UHCI controller on a PCI card, and compared it against the
on-board OHCI. Same computer (SMP), same kernel, same device...

Different computer (UP), on-board UHCI = 7ms per 4KB
A third computer (SMP), on-board OHCI = 256ms per 4KB
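
As a rough cross-check of those two figures, the sketch below works out the
slowdown factor and the effective throughput each one implies. It uses only
the 7ms and 256ms per 4KB numbers quoted above; the function name and the
choice of KiB/s as the unit are illustrative assumptions, not anything taken
from the driver.

```python
# Per-4KB transfer times reported in this thread, in milliseconds.
UHCI_MS_PER_4KB = 7
OHCI_MS_PER_4KB = 256
CHUNK_BYTES = 4096


def throughput_kib_per_s(ms_per_chunk, chunk_bytes=CHUNK_BYTES):
    """Effective throughput in KiB/s when one chunk completes every ms_per_chunk ms."""
    return (chunk_bytes / 1024) / (ms_per_chunk / 1000)


# Ratio of the two per-chunk times.
slowdown = OHCI_MS_PER_4KB / UHCI_MS_PER_4KB

print(f"OHCI is about {slowdown:.1f}x slower than UHCI")
print(f"UHCI: {throughput_kib_per_s(UHCI_MS_PER_4KB):.0f} KiB/s")
print(f"OHCI: {throughput_kib_per_s(OHCI_MS_PER_4KB):.1f} KiB/s")
```

The ratio comes out between 36 and 37, in line with the 36x in the subject.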

The device otherwise works ok (it's an ADSL modem) with no issues relating
to latency or throughput.

> You may have to do a bisection search to find the answer.

With what? I can't really bisect "the UHCI code" and "the OHCI code"...

I have no known-good kernel to work with unless I start trying really old
kernels, but there's no reason why those should work either. I'm hoping
someone recognises the significance of the transfer speed.

The OHCI code appears to split the data up into 4096-byte chunks, but even the
odd-sized transfer of 25280 bytes at the end runs at the same speed:

[ 4774.830569] cxacru: sending fw 0x3 size 0x62c0 to #98668
[ 4776.410375] cxacru: sending fw 0x3 size 0x100 to #e0

4776.410375 - 4774.830569 = 1.579806 s
1579806 us / 25280 bytes * 4096 bytes = 255968.57 us (about 256 ms per 4KB)
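
The same arithmetic can be written as a short script. This is only a sketch
that re-derives the figure from the two timestamps in the log lines above;
the variable names are made up for illustration, and the timestamps are held
as integer microseconds to avoid floating-point rounding.

```python
# Timestamps of the last two log lines, converted to integer microseconds.
START_US = 4774_830569   # [ 4774.830569] start of the final 0x62c0-byte transfer
END_US = 4776_410375     # [ 4776.410375] next message, i.e. transfer finished

TRANSFER_BYTES = 0x62c0  # size of the final, odd-sized chunk
PAGE_BYTES = 4096        # size the per-page figure is normalised to

# Time taken by the final transfer.
elapsed_us = END_US - START_US

# Scale the elapsed time to what a single 4096-byte page would take.
us_per_page = elapsed_us * PAGE_BYTES / TRANSFER_BYTES

print(f"transfer size: {TRANSFER_BYTES} bytes")
print(f"elapsed:       {elapsed_us} us")
print(f"per 4KB page:  {us_per_page / 1000:.1f} ms")
```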

--
Simon Arlott