Re: FYI: RAID5 unusably unstable through 2.6.14

From: Mark Lord
Date: Thu Jan 19 2006 - 10:52:32 EST

Next message: Carsten Otto: "Re: Kernel BUG at include/linux/gfp.h:80"
Previous message: Jeff Mahoney: "Re: 2.6.16-rc1 + reiser* from -rc1-mm1 : BUG with reiserfs"
In reply to: Neil Brown: "Re: FYI: RAID5 unusably unstable through 2.6.14"
Next in thread: Neil Brown: "Re: FYI: RAID5 unusably unstable through 2.6.14"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Neil Brown wrote:

> Very recent 2.6 kernels do exactly this. They don't drop a drive on a
> read error, only on a write error. On a read error they generate the
> data from elsewhere and schedule a write, then a re-read.

Well done, then. Further to this:

Pardon me for not looking at the specifics of the code here,
but experience shows that rewriting just the single sector
is often not enough to repair an error. The drive often just
continues to fail when only the bad sector is rewritten by itself.

Dumb drives, or what, I don't know, but they seem to respond
better when the entire physical track is rewritten.

Since we rarely know what a physical track is these days,
this often boils down to simply rewriting a 64KB chunk
centered on the failed sector. So far, this strategy has
always worked for me.
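
For illustration only, here is a minimal sketch of how such a window
could be computed. It is not taken from the md driver; the helper name
rewrite_window(), the struct, and the 512-byte sector size are all
assumptions made for the example. It picks a 64KB range roughly
centered on the failed sector and clamps it to the ends of the device.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE    512u                          /* assumed sector size in bytes */
#define WINDOW_BYTES   (64u * 1024u)                 /* 64KB window, as described above */
#define WINDOW_SECTORS (WINDOW_BYTES / SECTOR_SIZE)  /* 128 sectors */

/* Hypothetical result type: the range of sectors to rewrite. */
struct rewrite_range {
    uint64_t start;  /* first sector to rewrite */
    uint64_t count;  /* number of sectors to rewrite */
};

/*
 * Hypothetical helper: choose a 64KB window roughly centered on
 * bad_sector, clamped so it stays within [0, dev_sectors).
 */
struct rewrite_range rewrite_window(uint64_t bad_sector, uint64_t dev_sectors)
{
    struct rewrite_range r;
    const uint64_t half = WINDOW_SECTORS / 2;

    assert(bad_sector < dev_sectors);

    /* Device no larger than the window: rewrite all of it. */
    if (dev_sectors <= WINDOW_SECTORS) {
        r.start = 0;
        r.count = dev_sectors;
        return r;
    }

    /* Center on the bad sector, without running below sector 0. */
    r.start = bad_sector > half ? bad_sector - half : 0;

    /* Slide the window back if it would run past the end of the device. */
    if (r.start + WINDOW_SECTORS > dev_sectors)
        r.start = dev_sectors - WINDOW_SECTORS;

    r.count = WINDOW_SECTORS;
    return r;
}
```

The caller would then regenerate the data for that whole range and
write it back in one pass, rather than rewriting only the one sector.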

Cheers