Re: Block driver freezes when using CFQ

From: Tejun Heo
Date: Mon Oct 30 2006 - 01:23:39 EST


Hello, Ravi.

First of all, it usually attracts more people if you include full source code of something runnable.

Ravi Krishnamurthy wrote:
[--snip--]
Several times during the test run, the while() loop in the request
function comes out without dequeuing any request even though the
elevator queue is not empty. (Confirmed by printing the return value of
elv_queue_empty(), and the values of q->rq.count[] outside the loop).

Yeap, both cfq and anticipatory pause queue processing to improve performance. This is a bit counter-intuitive at first but the loooong seek time justifies such pauses if they can reduce seeks.

After one such occurrence, the request function is not called at all
and the device becomes unresponsive.
I added some code that lets me trigger the request function from userspace.
If I nudge the driver this way, I/Os continue for a short while and stop
again.

Since CFQ is the default I/O scheduler in current kernels, it has been
widely used and tested. So I suspect I am not doing something right in my
driver. Since the driver works well with the other schedulers, is there
something CFQ-specific that I should take care of?

After such pauses, cfq does the needed 'nudging' by itself. cfq has changed quite a bit so I might be mistaken but such 'nudging' ends up calling blk_start_queueing() which either runs request_fn directly or unplug the queue if plugged. So, does your driver's queue have proper unplug function? How is your queue initialized? (you can see why it's much better to post full working source.)

--
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/