Re: dwc3: unusual handling of setup requests with wLength == 0

From: Thinh Nguyen
Date: Fri Aug 18 2023 - 20:08:13 EST


On Fri, Aug 18, 2023, Alan Stern wrote:
> On Fri, Aug 18, 2023 at 07:49:27PM +0000, Thinh Nguyen wrote:
> > On Thu, Aug 17, 2023, Alan Stern wrote:
> > > On Fri, Aug 18, 2023 at 03:10:48AM +0000, Thinh Nguyen wrote:
> > > > I was referring to the next request. It should not be processed until
> > > > the first request is completed. Depending on the type of request, if
> > > > there's a delayed_status, the dwc3 driver will not prepare for the
> > > > Status stage and Setup stage (after status completion) to proceed to the
> > > > _next_ ->setup callback.
> > > >
> > > > My understanding from the described problem is that somehow dwc3
> > > > processes the next request immediately without waiting for the raw
> > > > gadget preparing the data stage.
> > >
> > > Um. This is one of the design flaws I mentioned: a new SETUP packet
> > > arriving before the old control transfer is finished. The USB spec
> > > requires devices to accept the new SETUP packet and abort the old
> > > transfer. So in this case, processing the next request immediately is
> > > the right thing to do.
> >
> > No, as I've mentioned, from the gadget driver, it should not see the
> > next ->setup until the first control transfer completion, regardless
> > whether the transfer completed with error code due to abort or not.
> > Everything should happen sequentially from the gadget driver. This
> > should be handled in the dwc3 driver (though we missed a setup_pending
> > status check in the data phase that needs to be fixed now that I look at
> > it again).
>
> Actually I agree with you. When a new SETUP packet arrives before the
> old control transfer has finished, the UDC driver should cancel all
> pending requests and then invoke the ->setup() callback. (I don't think
> there is a standard error code for the cancelled requests; net2280 seems
> to use -EPROTO whereas dummy-hcd seems to use -EOVERFLOW.)

Those are very odd choice of error codes for cancelled request. Even
though the gadget side doesn't have very defined error codes, I try to
follow the equivalent doc from the host side
(driver-api/usb/error-codes.rst), which is -ECONNRESET.

Whenever I see -EPROTO, I associate that to a specific host error:
transaction error. For -EOVERFLOW, I think of babble errors.

>
> > Perhaps the core design should be so that this synchronization does not
> > depend on the udc driver implementation.
>
> Do you mean the UDC core? I don't see how it could get involved in
> managing control transfers.

Actually I don't see it either. :)

>
> > > One question is why Andrey is observing a new ->setup() callback
> > > happening so soon? The host is supposed to allow a fairly long time for
> > > standard control requests to complete. If the userspace component of
> > > the Raw Gadget takes too long to act, the transfer could time out and be
> > > cancelled on the host. But "too long" means several seconds -- is that
> > > really what's going on here?
> >
> > As I noted initially, it should not happen so I'm not sure what's really
> > the problem here. Granted that the setup_pending check for data phase is
> > missing in dwc3 driver, there should not be a problem unless the host
> > actually aborted a control transfer. Also, there should be no data phase
> > for wlength=0 even for IN direction if we go through the composite
> > layer. So, I doubt that's what happening here.
> >
> > Perhaps Andrey can clarify.
>
> My impression from his initial email was that something different was
> happening. It sounded like his ->setup() callback was invoked with
> wLength = 0, but before the Raw Gadget driver was ready to queue a
> response request, the UDC driver sent an automatic status, at which
> point the host sent another SETUP packet. So the gadget driver got two
> ->setup() callbacks in a row with no chance to do anything in between.

What else should the gadget driver do? There's no data stage for
wLength=0.

>
> > > > I was talking in context of 0-length transfer (albeit I forgot about the
> > > > special case of control OUT doesn't have 3-stage).
> > > >
> > > > If it's a vendor request 0-length transfer, without responding with
> > > > USB_GADGET_DELAYED_STATUS, the dwc3 will proceed with preparing the
> > > > status stage.
> > >
> > > This may be a holdover from the early days of the Gadget subsystem. My
> > > memory from back then isn't very good; I vaguely recall that the first
> > > UDC drivers would queue their automatic Status-stage requests if wLength
> > > was 0 and ->setup() returned 0 (which would explain why
> > > USB_GADGET_DELAYED_STATUS had to be invented). Unless I'm completely
> > > confused, that's not how UDC drivers are supposed to act now.
>
> I did a little checking. The USB_GADGET_DELAYED_STATUS mechanism was
> introduced for use by the composite framework -- not in order to make a
> UDC driver work properly.

Hm... perhaps we can update so that it's applicable outside of the
composite framework. Currently dwc3 treats it as such, which may not
work if the gadget driver does not know about USB_GADGET_DELAYED_STATUS.

>
> > For dwc3, it's been like this since the beginning that it automatically
> > queues the status upon host request. USB_GADGET_DELAYED_STATUS was
> > introduced to support scenarios where the device may need a longer time
> > to process the specific request (such as SET_CONFIGURATION).
>
> It's hard to be sure, but that may be the cause of Andrey's problem. He
> mentioned something about the Raw Gadget's ->setup() routine returning
> USB_GADGET_DELAYED_STATUS when the problem occurred, but I think he
> meant that it did this for the second callback, not the first one.

I think he was just experimenting with USB_GADGET_DELAYED_STATUS. The
problem happened without that. Regardless, we need clarifications.

>
> Still, I recommand that dwc3 not automatically queue a Status request
> when wLength is 0. Gadget drivers don't expect UDC drivers to handle
> this for them. For example, look at the composite_setup() function in
> composite.c. It does this:
>
> /* respond with data transfer before status phase? */
> if (value >= 0 && value != USB_GADGET_DELAYED_STATUS) {
> req->length = value;
> req->context = cdev;
> req->zero = value < w_length;
> value = composite_ep0_queue(cdev, req, GFP_ATOMIC);
>
> Clearly it doesn't want the UDC driver to queue a request for it, no
> matter what wLength or value is set to.
>

dwc3 parse the SETUP data and determine whether it's a 3-state or
2-stage control transfer. If wLength > 0, then it must be a 3-stage
control transfer. The dwc3 driver would not queue the status immediately
until the data stage is completed.

To enforce the gadget driver to manually queue the status would take
some effort to ensure all the UDC driver comply to it. We would also
need to update the composite framework.

BR,
Thinh