Re: Shutdown hang on Cavium thunderX eMMC

From: Ulf Hansson
Date: Thu Feb 17 2022 - 09:42:59 EST


On Tue, 15 Feb 2022 at 10:52, Daniel Danzberger <daniel@xxxxxxxxxx> wrote:
>
> Hi,
>
> the below commit causes a shutdown hang on my OcteonTX platforms
> (aarch64) with Cavium ThunderX eMMC
>
> --
> commit 66c915d09b942fb3b2b0cb2f56562180901fba17
> Author: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> Date: Fri Dec 3 15:15:54 2021 +0100
>
> mmc: core: Disable card detect during shutdown
> --
>
> On shutdown, the __mmc_stop_host() call blocks by waiting for
> mmc_detect() to complete, but it never does.
> The second stack trace below shows it's been waiting forever for an
> mmc_send_status() request to complete.

Looks like the root of the problem is that the mmc_send_status()
request is hanging in the Cavium mmc host driver.

Is that instance of the mmc host driver functional at all? I mean, it
looks like the host driver is hanging already before the system is
being shut down, right?

Kind regards
Uffe

>
>
> [ 394.251271] INFO: task procd:2715 blocked for more than 153 seconds.
> [ 394.257635] Not tainted 5.10.96 #0
> [ 394.261552] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 394.269389] task:procd state:D stack: 0 pid: 2715 ppid:
> 1 flags:0x00000000
> [ 394.277749] dump_backtrace(regs = 0000000000000000 tsk =
> 000000003cc20742)
> [ 394.284625] Call trace:
> [ 394.287069] __switch_to+0x80/0xc0
> [ 394.290467] __schedule+0x1f8/0x530
> [ 394.293961] schedule+0x48/0xd0
> [ 394.297099] schedule_timeout+0x98/0xd0
> [ 394.300931] __wait_for_common+0xc4/0x1c4
> [ 394.304956] wait_for_completion+0x20/0x2c
> [ 394.309050] __flush_work.isra.0+0x184/0x31c
> [ 394.313329] __cancel_work_timer+0xfc/0x170
> [ 394.317510] cancel_delayed_work_sync+0x14/0x20
> [ 394.322038] __mmc_stop_host+0x3c/0x50
> [ 394.325799] mmc_host_classdev_shutdown+0x14/0x24
> [ 394.330500] device_shutdown+0x120/0x250
> [ 394.334430] __do_sys_reboot+0x1ec/0x290
> [ 394.338350] __arm64_sys_reboot+0x24/0x30
> [ 394.342356] do_el0_svc+0x74/0x120
> [ 394.345765] el0_svc+0x14/0x20
> [ 394.348817] el0_sync_handler+0xa4/0x140
> [ 394.352736] el0_sync+0x164/0x180
>
>
> [ 735.262749] INFO: task kworker/0:0:5 blocked for more than 614
> seconds.
> [ 735.269363] Not tainted 5.10.96 #0
> [ 735.273296] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 735.281121] task:kworker/0:0 state:D stack: 0 pid: 5 ppid:
> 2 flags:0x00000028
> [ 735.289490] Workqueue: events_freezable mmc_rescan
> [ 735.294288] Call trace:
> [ 735.296732] __switch_to+0x80/0xc0
> [ 735.300131] __schedule+0x1f8/0x530
> [ 735.303623] schedule+0x48/0xd0
> [ 735.306761] schedule_timeout+0x98/0xd0
> [ 735.310593] __wait_for_common+0xc4/0x1c4
> [ 735.314606] wait_for_completion+0x20/0x2c
> [ 735.318699] mmc_wait_for_req_done+0x2c/0x100
> [ 735.323065] mmc_wait_for_req+0xb0/0x100
> [ 735.326984] mmc_wait_for_cmd+0x54/0x7c
> [ 735.330818] mmc_send_status+0x5c/0x80
> [ 735.334573] mmc_alive+0x18/0x24
> [ 735.337798] _mmc_detect_card_removed+0x34/0x150
> [ 735.342412] mmc_detect+0x28/0x90
> [ 735.345732] mmc_rescan+0xd8/0x348
> [ 735.349132] process_one_work+0x1d4/0x374
> [ 735.353147] worker_thread+0x17c/0x4ec
> [ 735.356892] kthread+0x124/0x12c
> [ 735.360117] ret_from_fork+0x10/0x34
>
>
>
> I only could test this with 5.10.96 for now.
>
>
> --
> Regards
>
> Daniel Danzberger
> embeDD GmbH, Alter Postplatz 2, CH-6370 Stans