Re: [PATCH v2] usb: dwc3: Trigger a GCTL soft reset when switching modes in DRD

From: Felipe Balbi
Date: Fri Oct 23 2020 - 03:02:18 EST



Hi,

John Stultz <john.stultz@xxxxxxxxxx> writes:
> On Thu, Oct 22, 2020 at 12:55 AM Felipe Balbi <balbi@xxxxxxxxxx> wrote:
>> John Stultz <john.stultz@xxxxxxxxxx> writes:
>> > From: Yu Chen <chenyu56@xxxxxxxxxx>
>> >
>> > With the current dwc3 code on the HiKey960 we often see the
>> > COREIDLE flag get stuck off in __dwc3_gadget_start(), which
>> > seems to prevent the reset irq and causes the USB gadget to
>> > fail to initialize.
>> >
>> > We had seen occasional initialization failures with older
>> > kernels but with recent 5.x era kernels it seemed to be becoming
>> > much more common, so I dug back through some older trees and
>> > realized I dropped this quirk from Yu Chen during upstreaming
>> > as I couldn't provide a proper rationale for it and it didn't
>> > seem to be necessary. I now realize I was wrong.
>>
>> This keeps coming back every few years. It has never been necessary so
>> far. Why is it necessary now?
>
> Sorry, I'm not totally sure I've got all the context here. If you mean
> with regards to the HiKey960, it's because the HiKey960 had a somewhat

It's a general DWC3 thing. The databook claims that a soft reset is
necessary, but it turns out it isn't :-)

> complicated vendor patch stack that others and I had been carrying
> along and trying to upstream slowly over the last few years. Since
> that process of upstreaming required lots of rework, the patch set
> changed over time fixing a number of issues and in this case (by
> dropping the quirk) introducing others.
>
> The usb functionality on the board was never perfect. As I said in
> the patch, we saw initialization issues *very* rarely with older
> kernels - which I suspected was due to the oddball mux/hub driver that
> had to be deeply reworked - so the issue was easy to overlook, except
> the frequency of it had grown to be quite noticeable. So now that all
> but the dts bits are upstream, I've been trying to spend occasional
> free cycles figuring out what's wrong.
>
> That's when I figured out it was the quirk fix I dropped. But the
> good news is so far with it I've not hit any initialization issues
> (over a few hundred reboots).

That's good :-)

>> The only thing we need to do is verify
>> which registers are shadowed between host and peripheral roles and cache
>> only those registers.
>
> Sorry, could you explain this a bit more? Again, I don't have access
> to the hardware docs, so I'm just working with the source and any
> vendor patches I can find.

Right, initialize it in gadget mode, then take a register dump (I think
our regdump facility in dwc3's debugfs is enough). Then flip to host
mode and take the same register dump. Now diff them. You'll see that
some registers get overwritten. The reason for that is that physically
some host and peripheral registers map to the same block of memory in
the IP. In other words, the address decoder in the Register File decodes
some addresses to the same physical block of memory. This was done, I
believe, to save die area by reducing gate count.
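
The diff step above could be scripted roughly as follows. This is only a
sketch: it assumes each regdump line has the form "NAME = 0xVALUE", and the
helper names parse_regdump() and diff_regdumps() are made up for
illustration. Check the format against what your kernel actually prints.

```python
import re

# Assumed regdump line format: "NAME = 0x0000abcd".
# Lines that do not match this pattern are silently skipped.
LINE_RE = re.compile(r"^\s*(\S+)\s*=\s*(0x[0-9a-fA-F]+)\s*$")


def parse_regdump(text):
    """Turn the text of one register dump into a {name: value} dict."""
    regs = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            regs[match.group(1)] = int(match.group(2), 16)
    return regs


def diff_regdumps(before, after):
    """Return (name, old, new) for registers present in both dumps
    whose value changed, sorted by register name."""
    changed = []
    for name in sorted(before.keys() & after.keys()):
        if before[name] != after[name]:
            changed.append((name, before[name], after[name]))
    return changed


def format_diff(changed):
    """Render the changed registers one per line for easy reading."""
    return "\n".join(
        f"{name}: 0x{old:08x} -> 0x{new:08x}" for name, old, new in changed
    )
```

Save one dump in gadget mode and one after flipping to host mode, read both
files, and feed them through parse_regdump() and diff_regdumps(). A changed
value only marks a register as a candidate. Some registers change simply
because the controller is in a different mode, so the list still needs to be
checked by hand.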

>> A full soft reset will take a while and is likely to create other
>> issues.
>
> I'm also fine with going back to the quirk approach if you think that
> would be lower risk to other devices?

I think the soft reset can have unexpected side effects here.

--
balbi
