Re: [PATCH] drivers:usb:disable usb hub&port async suspend to avoid block system PM

From: Greg Kroah-Hartman
Date: Tue Feb 20 2024 - 02:48:01 EST


On Mon, Feb 19, 2024 at 11:24:13PM -0800, buckzhangwh@xxxxxxxxx wrote:
> From: weihui zhang <buckzhangwh@xxxxxxxxx>
>
> many phones are crashed and unable to wake up by power key.
> We analyzed more than ten kernel-dumps,

Odd trailing whitespace :(

> the common was that system were all blocked by usb.
> the phone doesn't crash again,after we disable usb hub&port async suspend.

That's not ok to disable, why is this needed?


>
> here is one kernel-dump analysis case:
> task 446 & 4511 &365 state are Uninterrupt
> task 446 dpm_wait_for_superior then schedule out.
>
> PID: 446  TASK: ffffff81f3e9cb00 CPU: 4 COMMAND: "charge"
>
>  #0 [ffffffc015a0b9a0] __switch_to at ffffffc010088b54
>
>  #1 [ffffffc015a0b9f0] __schedule at ffffffc010f7ef2c
>
>  #2 [ffffffc015a0ba40] schedule at ffffffc010f7f3a4
>
>  #3 [ffffffc015a0baa0] async_synchronize_cookie_domain at
>
>  #4 [ffffffc015a0bb00] async_synchronize_full at ffffffc010131e0c
>
>  #5 [ffffffc015a0bb10] dpm_resume at ffffffc0107ec408
>
>  #6 [ffffffc015a0bb70] dpm_resume_end at ffffffc0107ed0e8
>
>  #7 [ffffffc015a0bbb0] suspend_devices_and_enter at ffffffc0101776d8
>
>  #8 [ffffffc015a0bbf0] enter_state at ffffffc010177e70
>
>  #9 [ffffffc015a0bc20] pm_suspend at ffffffc010177d5c
>
> PID: 4511 TASK: ffffff8153b0e740 CPU: 6 COMMAND: "kworker/u16:11"
>
>  #0 [ffffffc01bc9baa0] __switch_to at ffffffc01010293c
>
>  #1 [ffffffc01bc9bb10] __schedule at ffffffc0116b0008
>
>  #2 [ffffffc01bc9bb70] schedule at ffffffc0116b0794
>
>  #3 [ffffffc01bc9bbe0] schedule_timeout at ffffffc0116b6c2c
>
>  #4 [ffffffc01bc9bc40] wait_for_common at ffffffc0116b198c
>
>  #5 [ffffffc01bc9bca0] dpm_wait_for_superior at ffffffc010b999e8  
>
>  #6 [ffffffc01bc9bce0] device_resume at ffffffc010b9d804
> //x0(struct device)= ffffff81f32a3808
>
>  #7 [ffffffc01bc9bd10] async_resume at ffffffc010b9d6ec
>
>  #8 [ffffffc01bc9bd40] async_run_entry_fn at ffffffc0101e238c
>
>  #9 [ffffffc01bc9bdb0] process_one_work at ffffffc0101d00e0
>
> crash_arm64> struct device ffffff81f32a3808 -x
>
> struct device {
>
>   kobj = {
>
>     name = 0xffffff81f15c6d80 "usb1-port1",
>
>     entry = {
>
>       next = 0xffffff81f32a6018,
>
>       prev = 0xffffff81e4103838
>
>     },
>  type = 0xffffffc0124102b8 <usb_port_device_type>,
>
>   bus = 0x0,
>
>   driver = 0xffffffc0124102e8 <usb_port_driver>,
>
>   async_suspend = 0x1,
> ...........

All of the changelog text has extra lines? Well some of them do, why?

Please fix that up.

>
> PID: 365 TASK: ffffff81f3bf6900 CPU: 0 COMMAND: "kworker/u16:5"
>
>  #0 [ffffffc0158dbac0] __switch_to at ffffffc010088b54
>
>  #1 [ffffffc0158dbb10] __schedule at ffffffc010f7ef2c
>
>  #2 [ffffffc0158dbb60] schedule at ffffffc010f7f3a4
>
>  #3 [ffffffc0158dbbe0] schedule_timeout at ffffffc010f83fd8
>
>  #4 [ffffffc0158dbc40] wait_for_common at ffffffc010f80600
>
>  #5 [ffffffc0158dbc90] wait_for_completion at ffffffc010f8051c
>
>  #6 [ffffffc0158dbcb0] dpm_wait_for_superior at ffffffc0107eee00
>
>  #7 [ffffffc0158dbd00] device_resume at ffffffc0107ec6a8
> //x0(struct device) = 0xffffff81ee1b3030
>
>  #8 [ffffffc0158dbd40] async_resume at ffffffc0107ec588
>
>  #9 [ffffffc0158dbd60] async_run_entry_fn at ffffffc010131c60
>
>  crash_arm64> struct device ffffff81ee1b3030 -x
>
> struct device {
>
>   kobj = {
>
>     name = 0xffffff81f1fc1500 "1-0:1.0",
>
>   type = 0xffffffc011b62a38 <usb_if_device_type>,
> ............

I don't understand, why the kobject structure stuff here?


>
> Signed-off-by: weihui zhang <buckzhangwh@xxxxxxxxx>
> ---
> drivers/usb/core/hub.c | 2 +-
> drivers/usb/core/message.c | 2 +-
> drivers/usb/core/port.c | 2 +-
> 3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
> index e38a4124f..de74f70e5 100644
> --- a/drivers/usb/core/hub.c
> +++ b/drivers/usb/core/hub.c
> @@ -2602,7 +2602,7 @@ int usb_new_device(struct usb_device *udev)
> add_device_randomness(udev->manufacturer,
> strlen(udev->manufacturer));
>
> - device_enable_async_suspend(&udev->dev);
> + device_disable_async_suspend(&udev->dev);

Why? You just changed the way the whole system worked?

Are you sure that the device itself shouldn't be fixed instead of doing
this for ALL devices?

>
> /* check whether the hub or firmware marks this port as non-removable */
> set_usb_port_removable(udev);
> diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
> index 077dfe48d..944f01aa7 100644
> --- a/drivers/usb/core/message.c
> +++ b/drivers/usb/core/message.c
> @@ -2203,7 +2203,7 @@ int usb_set_configuration(struct usb_device *dev, int configuration)
> "adding %s (config #%d, interface %d)\n",
> dev_name(&intf->dev), configuration,
> intf->cur_altsetting->desc.bInterfaceNumber);
> - device_enable_async_suspend(&intf->dev);
> + device_disable_async_suspend(&intf->dev);
> ret = device_add(&intf->dev);
> if (ret != 0) {
> dev_err(&dev->dev, "device_add(%s) --> %d\n",
> diff --git a/drivers/usb/core/port.c b/drivers/usb/core/port.c
> index c628c1abc..97696c415 100644
> --- a/drivers/usb/core/port.c
> +++ b/drivers/usb/core/port.c
> @@ -760,7 +760,7 @@ int usb_hub_create_port_device(struct usb_hub *hub, int port1)
> pm_runtime_set_active(&port_dev->dev);
> pm_runtime_get_noresume(&port_dev->dev);
> pm_runtime_enable(&port_dev->dev);
> - device_enable_async_suspend(&port_dev->dev);
> + device_disable_async_suspend(&port_dev->dev);

Again, I don't think this is a wise change, how did you test it?

thanks,

greg k-h