Re: [PATCH] i2o_block Fix, possible CFQ elevator problem?

From: Jens Axboe
Date: Tue Apr 20 2004 - 06:40:11 EST


On Tue, Apr 20 2004, Jens Axboe wrote:
> On Tue, Apr 20 2004, Warren Togami wrote:
> > Jens Axboe wrote:
> > >>>
> > >>>Repeat the tests that made it crash. The last patch I sent should work
> > >>>for you, at least until the real issue is found.
> > >>>
> > >>
> > >>Tested your patch, it indeed does seem to keep the system stable. If I
> > >>am understanding it right, the patch disables merging in the case where
> > >>it would have caused a BUG condition? (Less efficiency.)
> > >
> > >
> >
> > Bad news... much later during the test the system locked up. During
> > this test we did not use "sync" but just let all four bonnie++'s run.
> >
> > http://togami.com/~warren/archive/2004/i2o_cfq_quad_bonnie3.txt
> > ----------- [cut here ] --------- [please bite here ] ---------
> > Kernel BUG at cfq_iosched:404
> > invalid operand: 0000 [1] SMP
>
> Sorry about that, that's actually expected when we know this bug
> exists. You need to move the cfq_remove_merge_hints(q, crq) before the
> BUG_ON(q->last_merge == rq) check, or (better) just remove it
> completely. There's no way that q->last_merge could be set to this
> request after cfq_remove_merge_hints() was called.

In short, apply this patch. I can see this happening for an aliased
request, but you should not be hitting that with bonnie (you are not
doing any form of raw or O_DIRECT I/O, are you?).

===== drivers/block/cfq-iosched.c 1.1 vs edited =====
--- 1.1/drivers/block/cfq-iosched.c Mon Apr 12 19:55:20 2004
+++ edited/drivers/block/cfq-iosched.c Tue Apr 20 13:37:33 2004
@@ -401,10 +401,9 @@
 dispatch:
 	rq = list_entry_rq(cfqd->dispatch->next);
 
-	BUG_ON(q->last_merge == rq);
 	crq = RQ_DATA(rq);
 	if (crq)
-		BUG_ON(ON_MHASH(crq));
+		cfq_remove_merge_hints(q, crq);
 
 	return rq;
 }
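
For readers who want to see the effect of the change outside the kernel, here is a minimal user-space sketch of the patched dispatch path. Everything in it is hypothetical: `model_request`, `model_queue`, `model_remove_merge_hints()` and `model_dispatch()` are stand-ins invented for illustration, not the kernel's structures or functions. It assumes that cfq_remove_merge_hints() takes the request off the merge hash and clears q->last_merge when it points at that request, which is what the discussion above implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are NOT the kernel structures. */
struct model_request {
    bool on_merge_hash;                 /* models ON_MHASH(crq) */
};

struct model_queue {
    struct model_request *last_merge;   /* models q->last_merge */
    struct model_request *next;         /* models the head of the dispatch list */
};

/*
 * Models the assumed behaviour of cfq_remove_merge_hints(): drop every
 * merge hint that still refers to the request.
 */
static void model_remove_merge_hints(struct model_queue *q,
                                     struct model_request *rq)
{
    rq->on_merge_hash = false;
    if (q->last_merge == rq)
        q->last_merge = NULL;
}

/*
 * Models the patched dispatch path: stale hints are cleared instead of
 * being treated as a fatal condition.
 */
static struct model_request *model_dispatch(struct model_queue *q)
{
    struct model_request *rq = q->next;

    if (rq) {
        model_remove_merge_hints(q, rq);
        /* Once the hints are removed, neither old check can trigger. */
        assert(q->last_merge != rq);
        assert(!rq->on_merge_hash);
    }
    q->next = NULL;
    return rq;
}
```

Dispatching a request that is still on the merge hash, or still referenced as the last merge candidate, now hands it back with both hints cleared. With the old ordering the same state would have tripped one of the BUG_ON() checks.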

--
Jens Axboe
