Re: [PATCH v2] usb: dwc3: Trigger a GCTL soft reset when switching modes in DRD

From: John Stultz
Date: Thu Oct 22 2020 - 15:46:34 EST


On Thu, Oct 22, 2020 at 12:55 AM Felipe Balbi <balbi@xxxxxxxxxx> wrote:
> John Stultz <john.stultz@xxxxxxxxxx> writes:
> > From: Yu Chen <chenyu56@xxxxxxxxxx>
> >
> > With the current dwc3 code on the HiKey960 we often see the
> > COREIDLE flag get stuck off in __dwc3_gadget_start(), which
> > seems to prevent the reset irq and causes the USB gadget to
> > fail to initialize.
> >
> > We had seen occasional initialization failures with older
> > kernels but with recent 5.x era kernels it seemed to be becoming
> > much more common, so I dug back through some older trees and
> > realized I dropped this quirk from Yu Chen during upstreaming
> > as I couldn't provide a proper rationale for it and it didn't
> > seem to be necessary. I now realize I was wrong.
>
> This keeps coming back every few years. It has never been necessary so
> far. Why is it necessary now?

Sorry, I'm not totally sure I've got all the context here. If you mean
with regard to the HiKey960, it's because the HiKey960 had a somewhat
complicated vendor patch stack that others and I had been carrying
along and trying to upstream slowly over the last few years. Since
that process of upstreaming required lots of rework, the patch set
changed over time, fixing a number of issues and, in this case (by
dropping the quirk), introducing others.

The USB functionality on the board was never perfect. As I said in
the patch, we saw initialization issues *very* rarely with older
kernels, which I suspected was due to the oddball mux/hub driver that
had to be deeply reworked, so the issue was easy to overlook. More
recently, though, the frequency had grown to be quite noticeable. So
now that all but the dts bits are upstream, I've been trying to spend
occasional free cycles figuring out what's wrong.

That's when I figured out it was the quirk fix I dropped. The good
news is that with it restored I've not hit any initialization issues
so far (over a few hundred reboots).

> The only thing we need to do is verify
> which registers are shadowed between host and peripheral roles and cache
> only those registers.

Sorry, could you explain this a bit more? Again, I don't have access
to the hardware docs, so I'm just working with the source and any
vendor patches I can find.

> A full soft reset will take a while and is likely to create other
> issues.

I'm also fine with going back to the quirk approach if you think that
would be lower risk to other devices?

thanks
-john