Re: Is any file system on Linux appropriate for very large directories?

Linus Torvalds (torvalds@cs.helsinki.fi)
Mon, 15 Jul 1996 15:05:18 +0300 (EET DST)

Messages sorted by: [ date ][ thread ][ subject ][ author ]
Next message: george harold byrkit: "Re: Sound Blaster II - SCSI"
Previous message: Michiel Boland: "Re: setsid() semantics changed..."

On Mon, 15 Jul 1996, Adam McKee wrote:
>
> Obviously, one needs to weigh the additional performance against the added
> complexity. It seems to me that a hashing directory lookup should provide
> a nice performance increase in exchange for relatively little added
> complexity.

The problem with hashed directories (or any other kind of "smart"
directories) is not the lookup phase: that's pretty simple, and is almost
always speeded up by the hashing (or the hashing is done badly).

The _real_ complexity comes in when you start adding/deleting entries.
You may get faster lookups, but you're almost sure to get slower
modifications, and what's worse - keeping track of the extra state is
making the whole thing more complex. That's ok if you're a database, but
I'm not certain it makes sense for the OS.

Note that Linux _does_ do hashing and cached lookups for directories,
but not "on-disk" - only for cached directory entries in memory. That
avoids the extra state on disk, and gives most of the performace, but it
obviously works only for cached names. And the name cache isn't very
large, so it does not work for the case where you have a large directory
and/or non-cache-friendly access ptterns.

That's why I think it generally makes sense to do this on a user level:
the application can do a much better analysis of the situation because
the person doing the application hopefully has a reasonable idea about
what the access patterns will be. In contrast, the kernel cannot make any
assumptions (other than very general ones) about access patterns, so the
kernel is better off doing the simple thing than doing something complex
that ends up not being a win for many things.

Linus

Next message: george harold byrkit: "Re: Sound Blaster II - SCSI"
Previous message: Michiel Boland: "Re: setsid() semantics changed..."