Re: armmmci rmmod causes hung tasks

From: Martin DEVERA
Date: Mon Nov 30 2020 - 12:21:40 EST


On 11/30/20 4:08 PM, Ulf Hansson wrote:
On Sun, 29 Nov 2020 at 19:20, Martin DEVERA <devik@xxxxxxxxxx> wrote:
Hello,

on STM32MP1 with almost vanilla 5.7.7 in single CPU mode. Pair of
modprobe armmmci ; rmmod armmmci

causes rmmod and kworker to hang. I should note that no MMC is detected
on the board (SDIO device on MMC bus is not responding).
On another board (where SDIO is responding) rmmod works.

It seems as another manifestation of https://lkml.org/lkml/2019/8/27/945

Thanks.

INFO: task kworker/0:1:12 blocked for more than 368 seconds.
Not tainted 5.7.7kdb-00003-g10397828596c-dirty #224
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/0:1 D 0 12 2 0x00000000
Workqueue: events_freezable mmc_rescan
(__schedule) from (schedule+0x5b/0x90)
(schedule) from (schedule_timeout+0x1b/0xa0)
(schedule_timeout) from (__wait_for_common+0x7d/0xdc)
(__wait_for_common) from (mmc_wait_for_req_done+0x1b/0x8c)
(mmc_wait_for_req_done) from (mmc_wait_for_cmd+0x4d/0x68)
(mmc_wait_for_cmd) from (mmc_io_rw_direct_host+0x87/0xc8)
(mmc_io_rw_direct_host) from (sdio_reset+0x3b/0x58)
(sdio_reset) from (mmc_rescan+0x15d/0x1d4)
(mmc_rescan) from (process_one_work+0xdd/0x168)
(process_one_work) from (worker_thread+0x17d/0x1ec)
(worker_thread) from (kthread+0x9b/0xa4)
(kthread) from (ret_from_fork+0x11/0x28)
It looks like the worker thread, which runs mmc_rescan() to try to
detect the SDIO card is hanging. Exactly why, I don't know.

Could be a misconfigured clock, pinctrl or a power domain being
suddenly gated...

I turned some logging on (see below), IIUC pl18x is starting CMD52 with arg
SDIO_CCCR_ABORT read and it got IRQ later along with response. Then sending
again SDIO_CCCR_ABORT write but no IRQ comes back.

[  135.810802] mmc0: mmc_rescan_try_freq: trying to init card at 400000 Hz
[  135.810832] mmc0: starting CMD52 arg 00000c00 flags 00000195
[  135.810862] mmci-pl18x 48004000.sdmmc: op 34 arg 00000c00 flags 00000195
[  135.811155] mmci-pl18x 48004000.sdmmc: irq0 (data+cmd) 00000040
[  135.811178] mmc0: req done (CMD52): 0: 00000000 00000000 00000000 00000000
[  135.811202] mmci-pl18x 48004000.sdmmc: irq0 (data+cmd) 00000000
[  135.816487] mmc0: starting CMD52 arg 80000c08 flags 00000195
[  135.816506] mmci-pl18x 48004000.sdmmc: op 34 arg 80000c08 flags 00000195
[  172.150614] random: crng init done
[  172.150642] random: 6 urandom warning(s) missed due to ratelimiting
[  173.290565] INFO: task kworker/0:0:5 blocked for more than 20 seconds.

Here is the same system, only with different (working) SDIO device on the same bus:

[  495.654596] mmc0: mmc_rescan_try_freq: trying to init card at 400000 Hz
[  495.654628] mmc0: starting CMD52 arg 00000c00 flags 00000195
[  495.654658] mmci-pl18x 48004000.sdmmc: op 34 arg 00000c00 flags 00000195
[  495.654996] mmci-pl18x 48004000.sdmmc: irq0 (data+cmd) 00000004
[  495.655017] mmc0: req done (CMD52): -110: 00000000 00000000 00000000 00000000
[  495.655042] mmci-pl18x 48004000.sdmmc: irq0 (data+cmd) 00000000
[  495.660201] mmc0: starting CMD52 arg 80000c08 flags 00000195
[  495.660222] mmci-pl18x 48004000.sdmmc: op 34 arg 80000c08 flags 00000195
[  495.660549] mmci-pl18x 48004000.sdmmc: irq0 (data+cmd) 00000004
[  495.660567] mmc0: req done (CMD52): -110: 00000000 00000000 00000000 00000000
[  495.660591] mmci-pl18x 48004000.sdmmc: irq0 (data+cmd) 00000000

Should it be expected, that invalid (probably non-responding) device on the SDIO bus
causes it to be locked up forever ?
Or is it bug in pl18x driver not handling the error correctly ?

best regards,
Martin