Bug in disk event polling
From: Alan Stern
Date: Fri Feb 10 2012 - 15:31:20 EST
Tejun:
Don't ask me why this hasn't shown up earlier... There's a big fat bug
in the implementation of disk event polling.
The polling is done using the system_nrt_wq work queue, which isn't
freezable. As a result, polling continues while the system is
preparing for suspend or hibernation.
Obviously, I/O to suspended devices doesn't work well. Somewhat less
obviously, error recovery for the failed I/O attempts can interfere
with normal system resume.
You can see this for yourself easily enough by suspending or
hibernating while a USB flash drive is plugged in. You don't even need
to go through the full suspend procedure; the first two stages are
enough (echo devices >/sys/power/pm_test). Check the system log
afterward; most likely you'll find the flash drive got errors and had
to be unregistered and re-enumerated.
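For anyone who wants to script the reproduction steps above, here is a
minimal sketch. It assumes root privileges, a kernel built with
CONFIG_PM_DEBUG (so that /sys/power/pm_test exists), and "mem" as the
suspend state. POWER_DIR and pm_test_devices are names made up for this
example, not anything the kernel provides.

```shell
#!/bin/sh
# Sketch: run only the first two suspend stages (freeze tasks, suspend
# devices) and then resume, without going through a full suspend.
# Assumptions: root, CONFIG_PM_DEBUG enabled, "mem" is a supported state.
# POWER_DIR and pm_test_devices are hypothetical names for this example.

POWER_DIR="${POWER_DIR:-/sys/power}"

pm_test_devices() {
    # Limit the next suspend attempt to the "devices" test level.
    echo devices > "$POWER_DIR/pm_test" || return 1
    # Start the truncated suspend cycle.
    echo mem > "$POWER_DIR/state" || return 1
    # Restore normal suspend behaviour for later suspends.
    echo none > "$POWER_DIR/pm_test"
}
```

Plug in a USB flash drive, call pm_test_devices, and then look at the
end of the system log (for example with "dmesg | tail -n 50") for I/O
errors and a re-enumeration of the drive.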
I have verified that changing all occurrences of system_nrt_wq in
block/genhd.c to system_freezable_wq fixes the bug. However, this may
not be the way you want to solve it; you may prefer to have a freezable
non-reentrant work queue.
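The substitution I tested can be reproduced with something like the
sketch below, run from the top of a kernel source tree.
switch_to_freezable_wq is a hypothetical helper name used only here, and
the default path assumes the tree layout mentioned above.

```shell
#!/bin/sh
# Sketch: replace every use of the non-freezable work queue with the
# freezable one in a single source file.
# switch_to_freezable_wq is a hypothetical helper name for this example.

switch_to_freezable_wq() {
    file="${1:-block/genhd.c}"
    # Write to a temporary file first so a failed sed leaves the
    # original untouched.
    sed 's/system_nrt_wq/system_freezable_wq/g' "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}
```

Running "grep -n system_nrt_wq block/genhd.c" beforehand shows which
lines will change. Note that this is only the quick test change; it
swaps a non-reentrant queue for one that is freezable but presumably
lacks that non-reentrancy guarantee.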
Alan Stern