Re: kernel BUG at drivers/block/cciss.c:1260! (with recent linux-2.6 tree)

From: Jens Axboe
Date: Tue Jan 29 2008 - 14:07:56 EST


On Tue, Jan 29 2008, Miller, Mike (OS Dev) wrote:
> Jens wrote:
>
> > -----Original Message-----
> > From: Jens Axboe [mailto:jens.axboe@xxxxxxxxxx]
> > Sent: Tuesday, January 29, 2008 12:54 PM
> > To: Andrew Vasquez
> > Cc: Linux Kernel Mailing List; Miller, Mike (OS Dev);
> > k-ueda@xxxxxxxxxxxxx; j-nomura@xxxxxxxxxxxxx
> > Subject: Re: kernel BUG at drivers/block/cciss.c:1260! (with
> > recent linux-2.6 tree)
> >
> > On Tue, Jan 29 2008, Andrew Vasquez wrote:
> > > On Tue, 29 Jan 2008, Jens Axboe wrote:
> > >
> > > > On Tue, Jan 29 2008, Andrew Vasquez wrote:
> > > > > On Tue, 29 Jan 2008, Jens Axboe wrote:
> > > > >
> > > > > > > Here the final snippet that was logged:
> > > > > > >
> > > > > > > [ 12.724997] input: USB HID v1.01 Mouse [HP
> > Virtual Keyboard] on usb-0000:01:04.4-1
> > > > > > > [ 12.728971] usbcore: registered new interface
> > driver usbhid
> > > > > > > [ 12.732866] drivers/hid/usbhid/hid-core.c:
> > v2.6:USB HID core driver
> > > > > > > [ 12.741172] TCP cubic registered
> > > > > > > [ 12.744506] NET: Registered protocol family 1
> > > > > > > [ 12.744884] NET: Registered protocol family 17
> > > > > > > [ 12.749217] Freeing unused kernel memory: 228k freed
> > > > > > > [ 12.885823] cciss rq: dev cciss/c0d0: type=2, flags=104c8
> > > > > > > [ 12.888929]
> > > > > > > [ 12.888930] sector 6510615555426900570, nr/cnr 0/0
> > > > > > > [ 12.892895] bio ffff81042f130730, biotail
> > ffff81042f130730, buffer 0000000000000000, data
> > 0000000000000000, len 0
> > > > > > > [ 12.896895] cdb: 12 00 00 00 fe 00 00 00 00 00
> > 00 00 00 00 00 00
> > > > > >
> > > > > > Ah ok, I see the problem... cciss is overriding the
> > data_len for
> > > > > > BLOCK_PC requests, hence it does not complete them properly.
> > > > > > Hmm. Does this work?
> > > > > >
> > > > > > diff --git a/drivers/block/cciss.c
> > b/drivers/block/cciss.c index
> > > > > > ef50068..b6fa52e 100644
> > > > > > --- a/drivers/block/cciss.c
> > > > > > +++ b/drivers/block/cciss.c
> > > > > > @@ -2524,7 +2524,6 @@ after_error_processing:
> > > > > > resend_cciss_cmd(h, cmd);
> > > > > > return;
> > > > > > }
> > > > > > - cmd->rq->data_len = 0;
> > > > > > cmd->rq->completion_data = cmd;
> > > > > > blk_complete_request(cmd->rq); }
> > > > >
> > > > >
> > > > > Things look good so far -- with the patch above I can
> > finally boot
> > > > > the machine.
> > > >
> > > > Cool, sorry about that. Will get that applied asap. So after this
> > > > patch was applied, you didn't see any debug messages from
> > > > blk_dump_rq_flags() anymore, right?
> > >
> > > That's correct. I've yet to see any additional debug-messages from
> > > blk_dump_rq_flags().
> >
> > Great, thanks for confirming. It does look like a clear bug
> > in cciss, it just got exposed now that it uses proper end
> > request handling. We never need to clear ->data_len, since
> > for blk_fs_request() it will be cleared on init. So just
> > setting a residual count there for blk_fs_request() like
> > cciss does is fine.
>
> Just so I'm clear: just removing the one line is enough to resolve the
> problem?

Yep, that's it. The line should never have been there (it accomplishes
nothing for the old end io handling) and with the newer end io handling
it breaks when it uses the block completion functions instead of the
home grown in cciss.

--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/