Re: Raid not shutting down when disks are lost?

From: Dan Williams
Date: Sat Nov 21 2009 - 14:22:03 EST

Next message: Pierre Ossman: "Re: Raid not shutting down when disks are lost?"
Previous message: indexer: "Re: Regression in efi.c 2.6.32-rc7"
In reply to: Pierre Ossman: "Re: Raid not shutting down when disks are lost?"
Next in thread: Pierre Ossman: "Re: Raid not shutting down when disks are lost?"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Sat, Nov 21, 2009 at 9:03 AM, Pierre Ossman <pierre-list@xxxxxxxxx> wrote:
> Neil?
>
> On Thu, 8 Oct 2009 16:39:52 +0200
> Pierre Ossman <pierre-list@xxxxxxxxx> wrote:
>
>> Today one RAID6 array I manage decided to lose four out of eight disks.
>> Oddly enough, the array did not shut down but instead I got
>> intermittent read and write errors from the filesystem.

This is expected.

The array can't shut down while a filesystem is mounted on it. Reads
may still be serviced from the surviving disks, but all writes should
be aborted with an error.

>>
>> It's been some time since I had a failure of this magnitude, but I seem
>> to recall that once the array lost too many disks, it would shut down
>> and refuse to write a single byte. The nice effect of this was that if
>> it was a temporary error, you could just reboot and the array would
>> start nicely (albeit in degraded mode).
>>
>> Has something changed? Is this perhaps an effect of using RAID6 (I used
>> to run RAID5 arrays)? Or was I simply lucky the previous instances I've
>> had?

It should not come back up nicely in this scenario. You need mdadm's
"--force" option to attempt to reassemble a failed array.
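
To make "failed" concrete, here is a toy sketch in Python (not md's
actual logic) of why losing four of eight RAID6 members leaves the array
failed rather than degraded. It only encodes that RAID5 tolerates one
lost member and RAID6 two; the function name and the state labels are
made up for illustration.

```python
# Toy model only: how many lost members each RAID level can tolerate.
REDUNDANCY = {"raid5": 1, "raid6": 2}


def array_state(level, total_disks, lost_disks):
    """Classify an array as 'clean', 'degraded', or 'failed' (hypothetical labels)."""
    if level not in REDUNDANCY:
        raise ValueError(f"unsupported level: {level}")
    if not 0 <= lost_disks <= total_disks:
        raise ValueError("lost_disks must be between 0 and total_disks")
    if lost_disks == 0:
        return "clean"
    if lost_disks <= REDUNDANCY[level]:
        # Every stripe can still be reconstructed from the survivors.
        return "degraded"
    # More members are gone than the parity can cover.
    return "failed"


print(array_state("raid6", 8, 2))  # degraded
print(array_state("raid6", 8, 4))  # failed
```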

>>
>> Related, it would be nice if you could control how it handles lost
>> disks. E.g. I'd like it to go read-only when it goes into fully
>> degraded mode. In case the last disk lost was only a temporary glitch,
>> the array could be made to recover without a lengthy resync.
>>

When you say "fully-degraded", do you mean "failed"? In general, the
write-intent bitmap mechanism provides fast resync after temporary disk
outages.
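
As a rough illustration of why a bitmap makes the resync fast, here is a
simplified Python sketch of the idea, not md's on-disk format or code.
It assumes the bitmap simply remembers which fixed-size chunks were
written while a member was missing, so only those chunks need copying
when the member returns. The class name, method names, and chunk size
are all hypothetical.

```python
class WriteIntentBitmap:
    """Toy model: track which chunks were written while a member was absent."""

    def __init__(self, chunk_size):
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        self.chunk_size = chunk_size
        self.dirty = set()  # indices of chunks that may be out of sync

    def record_write(self, offset, length):
        """Mark every chunk touched by a write of `length` bytes at `offset`."""
        if offset < 0 or length <= 0:
            raise ValueError("offset must be >= 0 and length must be > 0")
        first = offset // self.chunk_size
        last = (offset + length - 1) // self.chunk_size
        self.dirty.update(range(first, last + 1))

    def chunks_to_resync(self):
        """Chunks a returning member must have rewritten, in ascending order."""
        return sorted(self.dirty)

    def clear(self):
        """Forget all dirty chunks once the returning member is back in sync."""
        self.dirty.clear()


bitmap = WriteIntentBitmap(chunk_size=100)
bitmap.record_write(offset=0, length=10)     # touches chunk 0
bitmap.record_write(offset=250, length=100)  # touches chunks 2 and 3
print(bitmap.chunks_to_resync())  # [0, 2, 3]
```

Without such a record, every chunk on the returning member would have to
be treated as suspect, which is the lengthy full resync.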

--
Dan
