Re: kernel BUG at ide-cd.c:1726 in 2.6.24-03863-g0ba6c33 &&-g8561b089

From: Kiyoshi Ueda
Date: Wed Jan 30 2008 - 20:28:09 EST


Hi Roland, Borislav, Bart,

I've added the linux-ide ML, since we may be able to get help from
other IDE experts. This thread started from:
http://lkml.org/lkml/2008/1/29/140

On Tue, 29 Jan 2008 18:23:56 -0500 (EST), Kiyoshi Ueda wrote:
> Hi Bart,
>
> On Tue, 29 Jan 2008 14:22:53 -0800, Roland Dreier wrote:
> > Hi, I saw the same BUG from ide-cd on one of my systems. I applied
> > the debugging patch to replace the BUG with blk_dump_rq_flags(), and I
> > got the output below (full boot log and .config attached to this
> > email).
> >
> > Please let me know if there's anything else that would help debug the
> > problem.
>
> Thank you for the information, Roland.
>
>
> > [ 4.072271] Uniform CD-ROM driver Revision: 3.20
> > [ 4.098236] ide-cd: rq still having bio: dev hda: type=2, flags=114c8
> > [ 4.100269]
> > [ 4.100269] sector 1949759, nr/cnr 0/0
> > [ 4.100269] bio ffff8102418cc600, biotail ffff8102418cc600, buffer 0000000000000000, d8
> > [ 4.100269] cdb: 12 00 00 00 fe 00 00 00 00 00 00 00 00 00 00 00
> > [ 4.101005] ide-cd: rq still having bio: dev hda: type=2, flags=114c8
> > [ 4.104269]
> > [ 4.104269] sector 1949759, nr/cnr 0/0
> > [ 4.104269] bio ffff8102418cc600, biotail ffff8102418cc600, buffer 0000000000000000, d2
> > [ 4.104269] cdb: 12 00 00 00 fe 00 00 00 00 00 00 00 00 00 00 00
> > [ 4.109203] ide-cd: rq still having bio: dev hda: type=2, flags=114c8
> > [ 4.112270]
> > [ 4.112270] sector 1949759, nr/cnr 0/0
> > [ 4.112270] bio ffff8102418cc600, biotail ffff8102418cc600, buffer 0000000000000000, d8
> > [ 4.112270] cdb: 12 01 00 00 fe 00 00 00 00 00 00 00 00 00 00 00
> > [ 4.112945] ide-cd: rq still having bio: dev hda: type=2, flags=114c8
> > [ 4.116270]
> > [ 4.116270] sector 1949759, nr/cnr 0/0
> > [ 4.116270] bio ffff8102418cc600, biotail ffff8102418cc600, buffer 0000000000000000, d2
> > [ 4.116270] cdb: 12 01 00 00 fe 00 00 00 00 00 00 00 00 00 00 00
>
> Bart,
> This means that the rq still has a bio even after DRQ_STAT is cleared.
> The original ide-cd code was calling only end_that_request_last() there.
> So I thought that the rq should have no bio when DRQ_STAT is cleared,
> otherwise the bio leaks.
>
> Was my understanding wrong and is that correct behavior in ide-cd?

I borrowed a box with the same nForce chipset and used the sg_inq
command to submit GPCMD_INQUIRY (the "cdb: 12" in the debug message).
Using the same debug patch, I confirmed that ide-cd runs through the
code path (DRQ_STAT == 0), but on my test box the requests never have
a bio there. So I can't reproduce the problem yet.
-----------------------------------------------------------------------
ide-cd: rq: dev hda: type=2, flags=114c8

sector 37958141, nr/cnr 0/0
bio 00000000, biotail f78e4980, buffer 00000000, data 00000000, len 0
cdb: 12 00 00 00 24 00 00 00 00 00 00 00 00 00 00 00
-----------------------------------------------------------------------
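For readers comparing the CDBs in the two debug dumps, here is a minimal
user-space sketch that builds and decodes a 6-byte INQUIRY CDB. The
opcode 0x12 is the one shown in the dumps; the field layout (EVPD in
byte 1 bit 0, page code in byte 2, allocation length in bytes 3-4) is
the standard INQUIRY layout. The struct and function names are made up
for this illustration and do not come from ide-cd or sg_inq.

```c
#include <stdint.h>
#include <string.h>

#define INQUIRY_OPCODE 0x12	/* same value as GPCMD_INQUIRY */

/* Hypothetical container for the decoded fields (illustration only). */
struct inquiry_cdb_fields {
	int evpd;		/* byte 1, bit 0: vital product data page requested */
	uint8_t page_code;	/* byte 2 */
	uint16_t alloc_len;	/* bytes 3-4, big-endian */
};

/* Build an INQUIRY CDB into a 16-byte, zero-padded buffer. */
static void build_inquiry_cdb(uint8_t cdb[16], int evpd,
			      uint8_t page_code, uint16_t alloc_len)
{
	memset(cdb, 0, 16);
	cdb[0] = INQUIRY_OPCODE;
	cdb[1] = evpd ? 0x01 : 0x00;
	cdb[2] = page_code;
	cdb[3] = (uint8_t)(alloc_len >> 8);
	cdb[4] = (uint8_t)(alloc_len & 0xff);
}

/* Decode an INQUIRY CDB; returns 0 on success, -1 if the opcode differs. */
static int decode_inquiry_cdb(const uint8_t *cdb,
			      struct inquiry_cdb_fields *out)
{
	if (cdb[0] != INQUIRY_OPCODE)
		return -1;
	out->evpd = cdb[1] & 0x01;
	out->page_code = cdb[2];
	out->alloc_len = (uint16_t)((cdb[3] << 8) | cdb[4]);
	return 0;
}
```

Decoded this way, the CDBs in Roland's log ask for 0xfe (254) bytes,
twice with EVPD clear and twice with EVPD set, while the one from my
test box asks for 0x24 (36) bytes. Whether that difference matters for
the leftover bio is not established here.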


The original code called only end_that_request_last() here,
and no problem occurred.
This may mean that the upper layer can handle the rq correctly,
whether or not the rq still has a bio.
If so, we should be able to unlink the bio by calling
end_that_request_chunk() with the remaining data size.
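To illustrate the idea, here is a toy user-space model of "complete
zero bytes first, then complete the remaining data size if a bio is
still attached". It is only a sketch: toy_bio, toy_rq, toy_end_request()
and toy_finish() are hypothetical stand-ins, not the kernel's struct
bio, struct request or __blk_end_request(), and the model assumes that
data_len equals the sum of the pending bio sizes. It only mimics the
return convention (0 when the request is done, 1 when buffers are
still pending).

```c
#include <stddef.h>

/* Hypothetical stand-in for a bio: just a pending byte count and a link. */
struct toy_bio {
	unsigned int size;	/* bytes still pending in this bio */
	struct toy_bio *next;
};

/* Hypothetical stand-in for a request. */
struct toy_rq {
	struct toy_bio *bio;	/* head of the pending bio chain */
	unsigned int data_len;	/* bytes the request still expects */
};

/*
 * Complete nr_bytes of the request, unlinking every bio that is fully
 * covered. Returns 0 if no bio is left, 1 if bios are still attached.
 */
static int toy_end_request(struct toy_rq *rq, unsigned int nr_bytes)
{
	while (rq->bio && nr_bytes >= rq->bio->size) {
		nr_bytes -= rq->bio->size;
		rq->data_len -= rq->bio->size;
		rq->bio->size = 0;
		rq->bio = rq->bio->next;
	}
	if (rq->bio && nr_bytes) {
		/* Partial completion of the head bio. */
		rq->bio->size -= nr_bytes;
		rq->data_len -= nr_bytes;
	}
	return rq->bio != NULL;
}

/*
 * Returns 0 if the zero-byte completion was enough, 1 if completing
 * the remaining data size was needed, -1 if a bio is still attached
 * even after that.
 */
static int toy_finish(struct toy_rq *rq)
{
	if (!toy_end_request(rq, 0))
		return 0;
	if (!toy_end_request(rq, rq->data_len))
		return 1;
	return -1;
}
```

In this model a request whose single bio still holds 254 bytes is not
finished by a zero-byte completion, but is finished once data_len (254)
is completed. A request whose data_len is smaller than its bio would
still have the bio attached afterwards.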



Roland,
Could you try the patch below and give me all boot messages again?

This patch prints debug messages for requests that still have a bio,
then tries to unlink all bios from the rq before the rq is completed.
So your system may be able to continue working correctly
after printing the debug messages.
I'd like to see the debug messages and to know whether your system
still hits the problem or not.

--- a/drivers/ide/ide-cd.c 2008-01-30 18:24:51.000000000 -0500
+++ b/drivers/ide/ide-cd.c 2008-01-30 18:24:33.000000000 -0500
@@ -1722,8 +1722,18 @@ static ide_startstop_t cdrom_newpc_intr(
 	 */
 	if ((stat & DRQ_STAT) == 0) {
 		spin_lock_irqsave(&ide_lock, flags);
-		if (__blk_end_request(rq, 0, 0))
-			BUG();
+		if (__blk_end_request(rq, 0, 0)) {
+			blk_dump_rq_flags(rq, "ide-cd: rq still having bio");
+			printk("backup: data_len=%u bi_size=%u\n",
+			       rq->data_len, rq->bio->bi_size);
+
+			if (__blk_end_request(rq, 0, rq->data_len)) {
+				blk_dump_rq_flags(rq, "ide-cd: BAD rq");
+				printk("backup: data_len=%u bi_size=%u\n",
+				       rq->data_len, rq->bio->bi_size);
+				BUG();
+			}
+		}
 		HWGROUP(drive)->rq = NULL;
 		spin_unlock_irqrestore(&ide_lock, flags);

Thanks,
Kiyoshi Ueda