Crazy modified HSM idea (was Re: Is ReiserFS really a journaling...)

From: Michael Gerdts (gerdts@cae.wisc.edu)
Date: Thu Mar 30 2000 - 11:36:16 EST

In reply to: Andrea Arcangeli: "Re: Is ReiserFS really a journaling file system, or is it really just a synchronous-metadata file system like BSD FFS?"

On Thu, Mar 30, 2000 at 05:53:24PM +0200, Andrea Arcangeli wrote:
> You can as well write a journaling data+metadata filesystem that does zero
> copy if you accept file fragmentation on file changes. But if you need
> zero copy at that cost then you'd better use logging filesystem that would
> have less inode block allocation overhead (immediate function ;) and that
> has the same fragmentation trouble (read-I/O potentially slower due
> seeks).
>
> Andrea

In most circumstances that I have run into, there is a time when the file
system (and the rest of the machine) is not terribly busy. Perhaps a
background defragmentation mechanism would be worthwhile in a case like
this.
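
A minimal sketch of what "only defragment while the machine is quiet" could look like. Everything here is an assumption for illustration: the names is_idle and run_background_pass, the load-per-CPU test, and the 0.25 threshold are hypothetical, and a real file system would do this in the kernel against its own I/O statistics rather than in Python.

```python
def is_idle(load_1min, ncpus, threshold=0.25):
    """Treat the machine as idle when the 1-minute load per CPU is below threshold.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    return load_1min / max(ncpus, 1) < threshold


def run_background_pass(work_items, do_work, load_probe, ncpus, threshold=0.25):
    """Process queued items one at a time, re-checking the load before each.

    Stops as soon as the machine looks busy and returns the items that
    were not processed, so the caller can retry them later.
    """
    remaining = list(work_items)
    while remaining and is_idle(load_probe(), ncpus, threshold):
        do_work(remaining.pop(0))
    return remaining
```

On a Unix system the probe could be something like lambda: os.getloadavg()[0], but load average is only a rough stand-in for "the disks are not busy".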

This kinda relates to another crazy idea that I had. Essentially it is a
modified HSM using all disk, rather than disk and tape.

One of my Solaris NFS servers has about 120 gig of space in a RAID 1+0.
Since this represents about 4000 people's home directories, and there are
at most about 200 people using the space at any time, it doesn't make much
sense to have all of that data on the fastest drives. Furthermore, much of
the data is old assignments, images, etc. that never get written, but get
read once in a while.

Currently all of the home directory space that I serve is in a RAID 1+0. I
have stacks of old 9-gig drives that work fine for storing data, but are
just too slow for standard use. BUT, if the file system could communicate
with the block device to figure out which part of the block device is fast
and which part is slow, it could migrate the heavily used data to the
fastest part and let the dead stuff sit on the old drives. Ideally it
would have separate notions for the speed of the drives for reads and
writes... I wouldn't want an occasional grep through a mailbox to cause
blocks to get moved from a RAID5 to a RAID10. Also, it would be nice to
provide ioctls to ensure that a given operation does not cause migration
of the file. For instance, you probably don't want backups to cause data
to be shifted around.
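
To make the placement policy above a bit more concrete, here is a rough sketch, again only under assumptions. The Extent class, the tier names, and every threshold are hypothetical; the no_migrate argument stands in for whatever ioctl or flag a real implementation might offer. It keeps separate read and write counters so that a few reads do not trigger promotion, and it lets a backup-style access skip the counters entirely.

```python
from dataclasses import dataclass

FAST, SLOW = "fast", "slow"  # e.g. the RAID 1+0 set vs. the old 9-gig drives


@dataclass
class Extent:
    """A chunk of file data plus the access statistics used for placement."""
    tier: str = SLOW
    read_heat: float = 0.0
    write_heat: float = 0.0


def record_access(ext, is_write, no_migrate=False):
    """Count one access, unless the caller asked not to influence migration."""
    if no_migrate:  # e.g. a backup run or a one-off scan
        return
    if is_write:
        ext.write_heat += 1.0
    else:
        ext.read_heat += 1.0


def age(ext, decay=0.5):
    """Periodically decay the counters so old activity fades away."""
    ext.read_heat *= decay
    ext.write_heat *= decay


def choose_tier(ext, read_promote=8.0, write_promote=2.0, demote=1.0):
    """Return the tier the extent should live on (does not move anything).

    Writes promote sooner than reads; an extent is demoted only when both
    counters fall below `demote`, otherwise it stays where it is.
    """
    if ext.write_heat >= write_promote or ext.read_heat >= read_promote:
        return FAST
    if ext.read_heat < demote and ext.write_heat < demote:
        return SLOW
    return ext.tier
```

The gap between the promote and demote thresholds is there so that data near the boundary does not bounce back and forth between the fast and slow drives.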

In other words, with this scheme you would constantly (well, hopefully not
constantly) be using spare CPU cycles and unused disk cycles to optimize
the file system in a rather complex way.

Has something like this been discussed before?

Mike



This archive was generated by hypermail 2b29 : Fri Mar 31 2000 - 21:00:27 EST