RE: [PATCH v3 5/5] r8152: Block future register access if register access fails

From: Hayes Wang
Date: Wed Oct 18 2023 - 07:41:43 EST


Doug Anderson <dianders@xxxxxxxxxxxx>
> Sent: Tuesday, October 17, 2023 10:17 PM
[...]
> > That is, the loop would be broken when the failure rate of the control transfer is high or low enough.
> > Otherwise, you would queue a usb reset again and again.
> > For example, if the failure rate of the control transfer is 10% ~ 60%,
> > I think there is a high probability that the loop keeps going.
> > Could that never happen?
>
> Actually, even with a failure rate of 10% I don't think you'll end up
> with a fully continuous loop, right? All you need is to get 3 failures
> in a row in rtl8152_get_version() to get out of the loop. So with a
> 10% failure rate you'd unbind/bind 1000 times (on average) and then
> (finally) give up. With a 50% failure rate I think you'd only
> unbind/bind 8 times on average, right? Of course, I guess 1000 loops
> is pretty close to infinite.
>
> In any case, we haven't actually seen hardware that fails like this.
> We've seen failure rates that are much much lower and we can imagine
> failure rates that are 100% if we've got really broken hardware. Do
> you think cases where failure rates are middle-of-the-road are likely?

That is my question, too.
I don't know whether anything would cause that situation, either.
This is beyond my knowledge.
I am waiting for an answer from the experts, too.

Many things may cause a control transfer to fail.
I don't have all of the real cases available to analyze.
Therefore, all I can do is assume different situations.
You could say my hypotheses are unreasonable.
However, I have to tell you what I am worried about.
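
For reference, the unbind/bind counts quoted above follow from a simple
model. Assuming each control transfer fails independently with probability
p, and that the loop only ends when rtl8152_get_version() sees 3 failures in
a row, a single probe gives up with probability p^3, so the expected number
of probes is 1/p^3. A minimal sketch of that arithmetic (expected_rebinds()
is a hypothetical helper for illustration, not part of the driver, and the
independence assumption is mine):

```c
#include <math.h>	/* INFINITY */

/*
 * Expected number of probe attempts (unbind/bind cycles) before one probe
 * sees `needed` consecutive control-transfer failures and gives up.
 * Assumption: each transfer fails independently with probability p,
 * where 0 <= p <= 1.
 */
static double expected_rebinds(double p, int needed)
{
	double give_up = 1.0;	/* probability that one probe gives up */
	int i;

	if (p <= 0.0)
		return INFINITY;	/* transfers never fail: never give up */

	for (i = 0; i < needed; i++)
		give_up *= p;	/* p^needed */

	return 1.0 / give_up;	/* mean of a geometric distribution */
}
```

Under this model, expected_rebinds(0.10, 3) is about 1000 and
expected_rebinds(0.50, 3) is 8, which matches the figures quoted above.
A real device may not fail independently from one transfer to the next,
so this is only a rough estimate.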

> I would also say that nothing we can do can perfectly handle faulty
> hardware. If we're imagining theoretical hardware, we could imagine
> theoretical hardware that de-enumerated itself and re-enumerated
> itself every half second because the firmware on the device crashed or
> some regulator kept dropping. This faulty hardware would also cause an
> infinite loop of de-enumeration and re-enumeration, right?
>
> Presumably if we get into either case, the user will realize that the
> hardware isn't working and will unplug it from the system. While the

Some of our devices are onboard. That is, they cannot be unplugged.
That is why I have to consider many situations.

> system is doing the loop of trying to enumerate the hardware, it will
> be taking up a bunch of extra CPU cycles but (I believe) it won't be
> fully locked up or anything. The machine will still function and be
> able to do non-Ethernet activities, right? I would say that the worst
> thing about this state would be that it would stress corner cases in
> the reset of the USB subsystem, possibly tickling bugs.
>
> So I guess I would summarize all the above as:
>
> If hardware is broken in just the right way then this patch could
> cause a nearly infinite unbinding/rebinding of the r8152 driver.
> However:
>
> 1. It doesn't seem terribly likely for hardware to be broken in just this way.
>
> 2. We haven't seen hardware broken in just this way.
>
> 3. Hardware broken in a slightly different way could cause infinite
> unbinding/rebinding even without this patch.
>
> 4. Infinite unbinding/rebinding of a USB adapter isn't great, but not
> the absolute worst thing.

It is fine if everyone agrees with these points.

Best Regards,
Hayes