Re: Re: Re: Re: Re: Re: [syzbot] INFO: rcu detected stall in tx

From: Alan Stern
Date: Wed May 19 2021 - 13:35:48 EST


On Wed, May 19, 2021 at 04:14:20PM +0000, Guido Kiener wrote:
> > On Wed, May 19, 2021 at 10:48:29AM +0200, dave penkler wrote:
> > > On Sat, 8 May 2021 at 16:29, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > On Sat, May 08, 2021 at 10:14:41AM +0200, dave penkler wrote:
> > > > > When the host driver detects a protocol error while processing an
> > > > > URB it completes the URB with EPROTO status and marks the endpoint
> > > > > as halted.
> > > >
> > > > Not true. It does not mark the endpoint as halted, not unless it
> > > > receives a STALL handshake from the device. A STALL is not a
> > > > protocol error.
> > > >
> > > > > When the class driver resubmits the URB and the if the host driver
> > > > > finds the endpoint still marked as halted it should return EPIPE
> > > > > status on the resubmitted URB
> > > >
> > > > Irrelevant.
> > > Not at all. The point is that when an application is talking to an
> > > instrument over the usbtmc driver, the underlying host controller and
> > > its driver will detect and silence a babbling endpoint.
> >
> > No, they won't. That is, they will detect a babble error and return an error status, but
> > they won't silence the endpoint. What makes you think they will?
>
> Maybe there is a misunderstanding. I guess that Dave wanted to propose:
> "EPROTO is a link level issue and needs to be handled by the host driver.
> When the host driver detects a protocol error while processing an
> URB it SHOULD complete the URB with EPROTO status

The host controller drivers _do_ complete URBs with -EPROTO (or similar)
status when a link-level error occurs...

> and SHOULD mark the endpoint
> as halted."

but they don't mark the endpoint as halted. Even if they did, it
wouldn't fix anything because the kernel allows URBs to be submitted to
halted endpoints. In fact, it doesn't even keep track of which
endpoints are or are not halted.

> Is this a realistic fix for all host drivers?

No, it isn't.

An endpoint shouldn't be marked as halted unless it really is halted.
Otherwise a driver might be tempted to clear the Halt feature, and
some devices do not like to receive a Clear-Halt request for an endpoint
that isn't halted.

What we could do is what you suggested earlier: Note the fact that the
endpoint is in some sort of fault condition and disallow further
communication with the endpoint until the fault condition has been
cleared. (It isn't entirely obvious exactly what actions should clear
such a fault... I guess resetting or re-enabling the endpoint, or
resetting the entire device.)

Alan Stern