Re: [PATCH v3 5/5] r8152: Block future register access if register access fails

From: Doug Anderson
Date: Thu Oct 19 2023 - 11:42:32 EST


Hi,

On Wed, Oct 18, 2023 at 4:41 AM Hayes Wang <hayeswang@xxxxxxxxxxx> wrote:
>
> > In any case, we haven't actually seen hardware that fails like this.
> > We've seen failure rates that are much much lower and we can imagine
> > failure rates that are 100% if we're got really broken hardware. Do
> > you think cases where failure rates are middle-of-the-road are likely?
>
> That is my question, too.
> I don't know if something would cause the situation, either.
> This is out of my knowledge.
> I am waiting for the professional answers, too.
>
> A lot of reasons may cause the fail of the control transfer.
> I don't have all of the real situation to analyze them.
> Therefore, what I could do is to assume different situations.
> You could say my hypotheses are unreasonable.
> However, I have to tell you what I worry.

Of course! ...and I appreciate your thoughts on the topic. The more
eyes on a patch the more problems that are caught. Unless someone
disagrees, I think we at least have ideas for how this could be
addressed if it comes up. Also unless someone disagrees, I think that
if this does come up in some situation it won't be a catastrophe.

Given how things look now, I'm going to plan to send a new version of
the patch later today. Though the commit message is long, I'll add a
little more to talk about this case and point to ideas for how it
could be solved if it comes up.


> > I would also say that nothing we can do can perfectly handle faulty
> > hardware. If we're imagining theoretical hardware, we could imagine
> > theoretical hardware that de-enumerated itself and re-enumerated
> > itself every half second because the firmware on the device crashed or
> > some regulator kept dropping. This faulty hardware would also cause an
> > infinite loop of de-enumeration and re-enumeration, right?
> >
> > Presumably if we get into either case, the user will realize that the
> > hardware isn't working and will unplug it from the system. While the
>
> Some of our devices are onboard. That is, they couldn't be unplugged.
> That is why I have to consider a lot of situations.

Good point! I think even with onboard devices we could already have
preexisting conditions that could cause an unbind/rebind loop. This
would be a new condition, of course.


-Doug