Re: bug ? raid5 inconsistency !

From: Jakob Østergaard (jakob@ostenfeld.dtu.dk)
Date: Mon Jul 24 2000 - 15:04:16 EST


On Mon, 24 Jul 2000, Willy Tarreau wrote:

> Hi all,
>
> this night, my NFS server partially stopped working. One of its soft raid5
> arrays gave many error messages when trying to access it and all involved
> processes stayed in D state. So I rebooted the server and the raid array
> refused to be set up because I have 3 different time stamps on my 3 disks !!!
>
> I don't know how this is possible, and I don't know what to do. I've tried
> to manually change a timestamp, but the raid superblock is after 2G and my
> dd can't seek to it (and I can't remember where I found one which uses
> llseek). So before destroying it totally, I'd like to know if there's a
> good method to recover from sort of these problems, because if not, I wonder
> if finally raid5 is such solid :-/
>
> for info, I'm using a 2.2.17pre11 kernel with ingo's raid-2.2.16, reiserfs
> 3.5.23 and other stuff so I admit this can come from my kernel, but I still
> don't understand how !

First you need to find out whether a drive has indeed failed, or if you were
seeing spurious controller/driver problems.

Try reading from each drive, eg. cat /dev/... > /dev/null. If you can
read all drives completely, they're probably fine. If one has failed, mark
down which one. If two or more drives did in fact fail, well then RAID-5 can't
save you.

If no drives failed: run mkraid -f - it will claim that it's about to
destroy all data, but don't worry, it will only re-write the superblocks and
rebuidl the parity. But this requires that your /etc/raidtab is 100%
consistent with your actual array configration.

If one drive failed: change the raidtab configuration so that the failed drive
is configured as a failed-disk instead of a raid-disk. Then run mkraid -f as
before.

Good luck ! :)

-- 
................................................................
: jakob@ostenfeld.dtu.dk  : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Jul 31 2000 - 21:00:17 EST