Re: Advanced storage management ( suggestion )

From: Mike Fedyk
Date: Sat Mar 06 2004 - 22:37:38 EST


Ramy M. Hassan wrote:
1- Very fast block allocation ( Using balanced trees for tracking free
blocks comes into my mind now, but I still it is early to decide the
design ).

You most likely want to use extents in addition to whatever else you use (ie, trees/etc.).

2- Support for multi-disk/multi-host storage pool.

You're mixing layers here. MD and DM already work in this area.

3- Meta data storage and block storage can be isolated for better
performance.

There is support for journal on a different device in the generic JBD code that ext3 uses, and reiserfs (possibly others also). That may be a good place to work from.

Are there any examples of this in other OSes? If you put the meta-data on a separate drive, it would be an inherently seeky load. How does this compare to putting raid below the mixed data and meta-data block device?

4- Meta data and block replication options.

Coda and Intermezzo do this in a filesystem independent way already. This can add flexibility.

5- Transactional options for journaling filesystems or transactional
databases.

Isn't journaling inherently transaction based already?

6- Supports clustering through lock managers where multiple hosts can
read/write to same storage devices concurrently ( suitable for SANs )

This is going to be a very heavy layer, and few people will use it if it isn't very light (or can be configured that way)

7- Transparent recovery from corruption or hardware failure.

Journaling in ext3 is block based, and the rest are virtual (descriptions of the actions are in the journal, not the entire block of meta-data -- when you're not running in data journaling mode).

How do you plan on integrating your proposed layer with these two very different approaches to filesystem journaling?

8- Direct access from userland ( for DBMS, LDAP, and other userland
applications ).

You have a separate userspace, and kernel implementation right?

9- Plugins support ( like those of reiserfs 4).

This can be good or bad. Make sure it doesn't bloat your layer.

Mike
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/