Re: kernel BUG at ide-cd.c:1726 in 2.6.24-03863-g0ba6c33 &&-g8561b089

From: Kiyoshi Ueda
Date: Fri Feb 01 2008 - 14:39:22 EST


Hi Boris,

On Fri, 1 Feb 2008 19:29:09 +0100, Borislav Petkov wrote:
> > > > end_that_request_last() is not called when __blk_end_request()
> > > > returns 1. Then, the issuer isn't woken up.
> > > > So I think the BUG() or error messages should be there.
> > >
> > > you mean, end_that_request_last() isn't called when __end_that_request_first()
> > > returns an error and this is the case only for fs and pc requests.
> > > Otherwise it _is_ called, thus simulating somewhat the previous behavior.
> > > However, we never BUG()'ged on residual byte counts before and
> > > this driver has been in the kernel tree for ages, so what puzzles
> > > me now is how is BUG()'ing here better than before and shouldn't we
> > > simply issue a warning instead of killing the interrupt handler...
> >
> > Jens' patch passes the residual byte count to __blk_end_request(),
> > so __end_that_request_first() should never return 1 and we should never
> > BUG() on the residual byte count, unless an inconsistency occurs, such as
> > the remaining bios being larger than the residual byte count.
>
> yep.
>
> > So if __blk_end_request() returns 1 even with Jens' patch,
> > it means that the block layer or the driver really has a bug.
> > And then, the request and the bios could leak or the issuer
> > would wait forever because end_that_request_last() isn't called.
> >
> > The previous behavior might ignore such inconsistency and leak only
> > the bios because it was calling end_that_request_last() anyway.
> > I would like to BUG() in such cases personally, but I don't object
> > strongly if you prefer not to BUG().
>
> BUG() is definitely what we should do here to catch this case of sizeof(bios) >
> rq->data_len. Putting a brown paper bag over the issue will never get it fixed
> if it really leaks bios. Thanks for clarifying that.
>
> By the way, shouldn't we be doing a little branch prediction here:
>
> diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
> index 74c6087..bee05a3 100644
> --- a/drivers/ide/ide-cd.c
> +++ b/drivers/ide/ide-cd.c
> @@ -1722,7 +1722,7 @@ static ide_startstop_t cdrom_newpc_intr(ide_drive_t *drive)
>  	 */
>  	if ((stat & DRQ_STAT) == 0) {
>  		spin_lock_irqsave(&ide_lock, flags);
> -		if (__blk_end_request(rq, 0, 0))
> +		if (unlikely(__blk_end_request(rq, 0, rq->data_len)))
>  			BUG();
>  		HWGROUP(drive)->rq = NULL;
>  		spin_unlock_irqrestore(&ide_lock, flags);

I think it's reasonable.
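
To make the return-value convention discussed above concrete, here is a
minimal userspace sketch. It is only a model under stated assumptions, not
the block layer's code: struct model_request, model_blk_end_request() and
model_finish_request() are hypothetical names, and unlikely() is defined
here through the GCC/Clang builtin so the example builds outside the kernel.

```c
/* Userspace stand-in for the kernel's unlikely() branch hint
 * (assumes a GCC- or Clang-compatible compiler). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical, heavily simplified request: it only tracks how many
 * bytes are still attached to its bios. */
struct model_request {
	unsigned int bios_bytes;	/* bytes left in the request's bios */
	int completed;			/* set once the final step has run */
};

/*
 * Model of the convention discussed in this thread: complete nr_bytes
 * of the request. Returns 1 if bios remain (the final step is skipped,
 * so the issuer would not be woken up), 0 if the request is finished.
 */
int model_blk_end_request(struct model_request *rq, unsigned int nr_bytes)
{
	if (nr_bytes < rq->bios_bytes) {
		rq->bios_bytes -= nr_bytes;
		return 1;
	}
	rq->bios_bytes = 0;
	rq->completed = 1;	/* stands in for end_that_request_last() */
	return 0;
}

/*
 * Driver-side check in the style of the patch above. Returns -1 where
 * the driver would BUG(), 0 on normal completion. The hint only tells
 * the compiler that the error branch is the rare one.
 */
int model_finish_request(struct model_request *rq, unsigned int residual)
{
	if (unlikely(model_blk_end_request(rq, residual)))
		return -1;
	return 0;
}
```

In this model, passing the full residual count completes the request, while
passing 0 with bios still attached returns 1, which is the inconsistent case
the BUG() is meant to catch.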

Thanks,
Kiyoshi Ueda