Re: Are linux-fs's drive-fault-tolerant by concept?

From: Dr. David Alan Gilbert (gilbertd@treblig.org)
Date: Sat Apr 19 2003 - 13:41:20 EST


Hi,
  Besides the problem that most drive manufacturers now seem to use
cheese as the data storage surface, I think there are some other
problems:

  1) I don't trust drive firmware.
        2) I don't think all drives are set to remap sectors by default.
        3) I don't believe that all drivers recover neatly from a drive error.
        4) It is OK saying return the drive and get a new one - but many of
           us can't do this in a commercial environment where the contents of
                 the drive are confidential - leading to stacks of dead drives
                 (often many inside their now short warranty periods).
                 
To be fair I'm not sure if it is only the drive firmware I don't trust -
it could be the controllers and the IDE drivers as well - I don't know.

While RAID works well for drives that just go pop and die, for drives
with dodgy firmware we just sit there and watch the filesystems decay.
I don't think the kernel can do much about that - but it is a sad state.

I'd find two things useful in this respect:
  1) A tool to check the consistency of a RAID - presuming I shut my
        RAID down safely I should actually be able to use the redundant
        information to test it; this should reveal corruption early.
        (Perhaps the kernel could check a few sectors a second in the
        background)

        2) A disc exerciser - something that I can use to see if this drive,
        connected to this controller, on this motherboard on this kernel
        actually works and keeps its data safe before I put it into live
        service.

Dave (After a few weeks of fighting pissy IDE hard drives)

 ---------------- Have a happy GNU millennium! ----------------------
/ Dr. David Alan Gilbert | Running GNU/Linux on Alpha,68K| Happy \
\ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
 \ _________________________|_____ http://www.treblig.org |_______/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Wed Apr 23 2003 - 22:00:26 EST