Re: [OOPS] 2.6.21-rc6-git5 in cfq_dispatch_insert

From: Neil Brown
Date: Tue Apr 17 2007 - 01:11:33 EST


On Monday April 16, cebbert@xxxxxxxxxx wrote:
>
> cfq_dispatch_insert() was called with rq == 0. This one is getting really
> annoying... and md is involved again (RAID0 this time.)

Yeah... weird.
RAID0 is so light-weight and so different from RAID1 or RAID5 that I
feel fairly safe concluding that the problem isn't in or near md.
But that doesn't help you.

This really feels like a locking problem.

The problem occurs when ->next_rq is NULL, but ->sort_list.rb_node is
not NULL. That happens plenty of times in the code (particularly as
the first request is inserted) but always under ->queue_lock so it
should never be visible to cfq_dispatch_insert..

Except that drivers/scsi/ide-scsi.c:idescsi_eh_reset calls
elv_next_request which could ultimately call __cfq_dispatch_requests
without taking ->queue_lock (that I can see). But you probably aren't
using ide-scsi (does anyone?).

Given that interrupts are always disabled when queue_lock is taken, it
might be useful to add
WARN_ON(!irqs_disabled());
every time ->next_rq is set.
Something like the following.

It might show something useful.... if we are lucky.

NeilBrown


diff .prev/block/cfq-iosched.c ./block/cfq-iosched.c
--- .prev/block/cfq-iosched.c 2007-04-17 15:01:36.000000000 +1000
+++ ./block/cfq-iosched.c 2007-04-17 15:02:25.000000000 +1000
@@ -628,6 +628,7 @@ static void cfq_remove_request(struct re
{
struct cfq_queue *cfqq = RQ_CFQQ(rq);

+ BUG_ON(!irqs_disabled());
if (cfqq->next_rq == rq)
cfqq->next_rq = cfq_find_next_rq(cfqq->cfqd, cfqq, rq);

@@ -1810,6 +1811,7 @@ cfq_rq_enqueued(struct cfq_data *cfqd, s
/*
* check if this request is a better next-serve candidate)) {
*/
+ BUG_ON(!irqs_disabled());
cfqq->next_rq = cfq_choose_req(cfqd, cfqq->next_rq, rq);
BUG_ON(!cfqq->next_rq);


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/