Multi-partition block layer behaviour

From: Tiju Jacob
Date: Wed Oct 26 2011 - 01:16:36 EST


Hi All,

We are trying to run fsstress tests on an ext4 filesystem with
linux-3.0.4 on NAND flash with our proprietary driver. The test runs
successfully on a single partition, but fails when run on multiple
partitions with the bug "BUG: scheduling while atomic:
fsstress.fork_n/498/0x00000002".

Analysis:

1. When an I/O request is made to the filesystem, process 'A' acquires
a mutex FS lock and a mutex block driver lock.

2. Process 'B' tries to acquire the mutex FS lock, which is not
available, so it goes to sleep. With the new plugging mechanism,
schedule() is invoked before the process sleeps; it disables preemption,
so the context becomes atomic. Inside schedule(), the newly added
blk_flush_plug_list() is invoked, which unplugs the block driver.

3. During the unplug operation, the block driver tries to acquire its
mutex lock. This fails because the lock is held by process 'A', so the
driver has to sleep. The invocation of schedule() in step 2 has already
made the context atomic, hence the error "scheduling while atomic"
occurs.
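
The sequence in steps 1 to 3 can be modelled in user space as a minimal
sketch. This is not kernel code: every name below (model_mutex,
model_schedule, model_driver_unplug, run_scenario) is hypothetical, a
plain counter stands in for the preempt count, and the two processes are
played out sequentially instead of as real threads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_mutex {
    bool held;
    char owner;                    /* 'A', 'B', or 0 when free */
};

static int preempt_depth;          /* > 0 models an atomic context */
static int atomic_sleep_reports;   /* how many times the BUG would fire */

static struct model_mutex fs_lock;   /* models the FS mutex */
static struct model_mutex drv_lock;  /* models the block driver mutex */

/* Returns true if the lock was taken, false if the caller would sleep. */
static bool model_mutex_trylock(struct model_mutex *m, char who)
{
    if (m->held)
        return false;
    m->held = true;
    m->owner = who;
    return true;
}

/* Models the driver's unplug path, which needs the driver mutex. */
static void model_driver_unplug(char who)
{
    if (!model_mutex_trylock(&drv_lock, who)) {
        /* A real mutex would put the caller to sleep at this point. */
        if (preempt_depth > 0) {
            atomic_sleep_reports++;
            printf("model: scheduling while atomic (task %c)\n", who);
        }
        return;
    }
    drv_lock.held = false;         /* uncontended: flush and release */
    drv_lock.owner = 0;
}

/* Models schedule() as described in step 2: preemption is disabled
 * first, then the plugged requests are flushed to the driver. */
static void model_schedule(char who)
{
    preempt_depth++;
    model_driver_unplug(who);
    preempt_depth--;
}

/* Replays steps 1 to 3 and returns the number of modelled BUG reports. */
int run_scenario(void)
{
    bool ok;

    fs_lock = (struct model_mutex){ false, 0 };
    drv_lock = (struct model_mutex){ false, 0 };
    preempt_depth = 0;
    atomic_sleep_reports = 0;

    /* Step 1: process 'A' takes the FS lock and the driver lock. */
    ok = model_mutex_trylock(&fs_lock, 'A') &&
         model_mutex_trylock(&drv_lock, 'A');
    assert(ok);

    /* Step 2: process 'B' finds the FS lock taken and must sleep. */
    if (!model_mutex_trylock(&fs_lock, 'B'))
        model_schedule('B');       /* Step 3 happens inside the flush */

    assert(preempt_depth == 0);
    return atomic_sleep_reports;
}
```

Calling run_scenario() from a main() reports the condition once, for
process 'B'. If process 'A' is made to hold only the FS lock, the flush
finds the driver lock free and nothing is reported, which would match
the observation that the problem needs a second holder of the driver
lock.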


Please advise us on how to handle this situation.


Thanks in advance.
--TJ