Re: [PATCH] USB:bugfix a controller halt error

From: Alan Stern
Date: Fri Jul 21 2023 - 10:57:32 EST


On Fri, Jul 21, 2023 at 06:00:15PM +0800, liulongfang wrote:
> On systems that use ECC memory, an ECC error in memory will
> cause the USB controller to halt. This causes the usb_control_msg()
> operation to fail.

How often does this happen in real life? (Besides, it seems to me that
if your system is getting a bunch of ECC memory errors then you've got
much worse problems than a simple USB failure!)

And why do you worry about ECC memory failures in particular? Can't
_any_ kind of failure cause the usb_control_msg() operation to fail?

> At this point, the returned buffer data is an abnormal value, and
> continuing to use it will lead to incorrect results.

The driver already contains code to check for abnormal values. The
check is not perfect, but it should prevent things from going too
badly wrong.

> Therefore, it is necessary to check the return value and exit on error.
>
> Signed-off-by: liulongfang <liulongfang@xxxxxxxxxx>

There is a flaw in your reasoning.

The operation carried out here is deliberately unsafe (for full-speed
devices). It is made before we know the actual maxpacket size for ep0,
and as a result it might return an error code even when it works okay.
This shouldn't happen, but a lot of USB hardware is unreliable.

Therefore we must not ignore the result merely because r < 0. If we do
that, the kernel might stop working with some devices.
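
To make the point concrete, here is a rough userspace sketch of the kind of check I mean: the decision is driven by whether the returned data looks like a device descriptor, not by the transfer status alone. The struct dev_desc_head type and the check_initial_descriptor() helper are hypothetical names invented for this illustration, not the code in hub.c, and the sketch assumes the buffer was cleared before the request so that stale data cannot pass the test.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define USB_DT_DEVICE 0x01	/* descriptor type code for a device descriptor */

/* Hypothetical stand-in for the first 8 bytes of a USB device descriptor. */
struct dev_desc_head {
	uint8_t  bLength;
	uint8_t  bDescriptorType;
	uint16_t bcdUSB;
	uint8_t  bDeviceClass;
	uint8_t  bDeviceSubClass;
	uint8_t  bDeviceProtocol;
	uint8_t  bMaxPacketSize0;
};

/*
 * Hypothetical helper: judge the initial GET_DESCRIPTOR attempt by the
 * contents of buf rather than by the transfer status r alone.
 * Assumes buf was zeroed before the request was issued.
 * Returns 0 if the data is plausible, otherwise a negative errno.
 */
static int check_initial_descriptor(const struct dev_desc_head *buf, int r)
{
	assert(buf != NULL);

	switch (buf->bMaxPacketSize0) {
	case 8: case 16: case 32: case 64: case 255:
		if (buf->bDescriptorType == USB_DT_DEVICE)
			return 0;	/* data looks valid: accept it even if r < 0 */
		/* fall through */
	default:
		/* implausible data: keep the transfer error if there was one */
		return r < 0 ? r : -EPROTO;
	}
}
```

With a check like this, a flaky device that reports an error but still delivers a sane descriptor keeps working, while a buffer that was never filled in is still rejected.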

Alan Stern

> ---
> drivers/usb/core/hub.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
> index a739403a9e45..6a43198be263 100644
> --- a/drivers/usb/core/hub.c
> +++ b/drivers/usb/core/hub.c
> @@ -4891,6 +4891,16 @@ hub_port_init(struct usb_hub *hub, struct usb_device *udev, int port1,
> USB_DT_DEVICE << 8, 0,
> buf, GET_DESCRIPTOR_BUFSIZE,
> initial_descriptor_timeout);
> + /* On systems that use ECC memory, ECC errors can
> + * cause the USB controller to halt.
> + * It causes this operation to fail. At this time,
> + * the buf data is an abnormal value and needs to be exited.
> + */
> + if (r < 0) {
> + kfree(buf);
> + goto fail;
> + }
> +
> switch (buf->bMaxPacketSize0) {
> case 8: case 16: case 32: case 64: case 255:
> if (buf->bDescriptorType ==
> --
> 2.24.0
>