[patch, rfc] do_generic_file_read: clear page errors when issuing a fresh read of the page

From: Jeff Moyer
Date: Fri Jan 23 2009 - 14:28:00 EST


Hi,

In testing older kernels, I've seen I/O errors set PG_error for the page
cache page (for a failure of the last path in a mulitpath setup). Then,
when the device is back online, the page error is never cleared, so all
reads from the previously bad disk location fail. You can actually get
around this by simply mmap()ing the file (the device node, in my case)
and accessing the page that way, as filemap_fault will clear the page
error bit.

So, hopefully this patch makes sense. As implied above, I haven't
verified empirically that this is the case on newer kernels (though,
from code inspection, it looks to still be a problem). I'll try to
reproduce the problem, but just in case I can't, I wanted to get this
rfc out there.

Cheers,
Jeff

diff --git a/mm/filemap.c b/mm/filemap.c
index 0e77f4a..3adafc8 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1139,6 +1139,12 @@ page_not_up_to_date_locked:
}

readpage:
+ /*
+ * If there was an I/O error flagged previously, clear it
+ * here. PG_error will be set again if the readpage fails.
+ */
+ ClearPageError(page);
+
/* Start the actual read. The read will unlock the page. */
error = mapping->a_ops->readpage(filp, page);

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/