[patch] bug fix in dio handling write error

From: Chen, Kenneth W
Date: Tue Jan 17 2006 - 14:40:34 EST


There is a bug in direct-io on propagating write error up to the
higher I/O layer. When performing an async ODIRECT write to a
block device, if a device error occurred (like media error or disk
is pulled), the error code is only propagated from device driver
to the DIO layer. The error code stops at finished_one_bio(). The
aysnc write, however, is supposedly have a corresponding AIO event
with appropriate return code (in this case -EIO). Application
which waits on the async write event, will hang forever since such
AIO event is lost forever (if such app did not use the timeout
option in io_getevents call. Regardless, an AIO event is lost).

The problem is that calls to aio_complete() is conditioned upon
READ, but it should've handle WRITE error as well.


Signed-off-by: Ken Chen <kenneth.w.chen@xxxxxxxxx>


--- linux-2.6.15/fs/direct-io.c.orig 2006-01-17 11:54:17.010422462 -0800
+++ linux-2.6.15/fs/direct-io.c 2006-01-17 12:08:00.444982688 -0800
@@ -253,8 +253,7 @@ static void finished_one_bio(struct dio
dio_complete(dio, offset, transferred);

/* Complete AIO later if falling back to buffered i/o */
- if (dio->result == dio->size ||
- ((dio->rw == READ) && dio->result)) {
+ if (dio->result == dio->size || dio->result) {
aio_complete(dio->iocb, transferred, 0);
kfree(dio);
return;



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/