Re: [patch] bug fix in dio handling write error

From: Badari Pulavarty
Date: Tue Jan 17 2006 - 18:25:48 EST


On Tue, 2006-01-17 at 11:41 -0800, Chen, Kenneth W wrote:
> There is a bug in direct-io when propagating a write error up to the
> higher I/O layer. When performing an async O_DIRECT write to a
> block device, if a device error occurs (such as a media error or the
> disk being pulled), the error code is only propagated from the device
> driver to the DIO layer. The error code stops at finished_one_bio().
> The async write, however, is supposed to have a corresponding AIO
> event with the appropriate return code (in this case -EIO). An
> application that waits on the async write event will hang forever,
> since that AIO event is never delivered (unless the application uses
> the timeout option of the io_getevents call; regardless, an AIO
> event is lost).
>
> The problem is that the call to aio_complete() is conditioned upon
> READ, but it should handle WRITE errors as well.
>
>
> Signed-off-by: Ken Chen <kenneth.w.chen@xxxxxxxxx>
>
>
> --- linux-2.6.15/fs/direct-io.c.orig 2006-01-17 11:54:17.010422462 -0800
> +++ linux-2.6.15/fs/direct-io.c 2006-01-17 12:08:00.444982688 -0800
> @@ -253,8 +253,7 @@ static void finished_one_bio(struct dio
>  			dio_complete(dio, offset, transferred);
>  
>  			/* Complete AIO later if falling back to buffered i/o */
> -			if (dio->result == dio->size ||
> -				((dio->rw == READ) && dio->result)) {
> +			if (dio->result == dio->size || dio->result) {
>  				aio_complete(dio->iocb, transferred, 0);
>  				kfree(dio);
>  				return;
>
>

I vaguely remember adding the explicit "dio->rw == READ" check for a
reason (which escapes me right now). Suparna, do you remember? Let me
think and get back to you.

Thanks,
Badari
