Re: Is any file system on Linux appropriate for very large directories?

Matthias Urlichs (
Thu, 25 Jul 1996 17:00:12 +0100

In, article <>,
"Michael J. Micek" <> writes:
> =20
> Copious information about skiplists is to be found at
There's a big problem with skip lists -- they distribute all their data=
over the place. B*-trees have much better locality.

Using a skip list, you have to follow the chain of entries on each leve=
l to
find out whether you need to continue, or whether you can proceed to th=
next level. Each follow-the-link, including the last (where you read da=
you can't really use), is one disk read because the entries are rather
unlikely to be colocated.

With a B* tree, you need to do exactly one read operation per level, an=
the higher levels are likely to be in cache anyway.

My conclusion, therefore, is that a skip list is a rather stupid idea w=
you try to apply it to the directory of an on-disk file system.

And, while I agree with Linus that you should probably use a multilevel
scheme, sometimes things do go wrong. A cron job that doesn't clean up
after itself will leave 30000 temporary files behind when the system is=
a few weeks. The mailer stores its temporary files in a flat directory.=
news system stores all articles in one directory (that one should chang=
e in
the near future, but it's that way now). Anyway, mistakes happen. It's =
expensive in CPU time to clean up a directory that's grown to beyond 50=
k or
so, "ls" is _slow_ even if it's not sorting, and these beasts don't eve=
shrink their size back once you have cleaned them...

All these problems could be avoided with things like B*-tree directorie=
Creating an alternate version of ext2, which does everything just like =
"real" ext2 file system but which uses B*-trees instead of linear lists=
its directory storage (or even, in addition to -- for new directories, =
you tell it to with a mount option), shouldn't be _too_ difficult to cr=
-- I'd be very interested how much the performance difference on our N=
server would be...

Write home; they miss you.
Matthias Urlichs \ noris network GmbH / Xlink-POP N=FCrnberg=
Schleiermacherstra=DFe 12 \ Linux+Internet / EMail: urlichs@nor=
90491 N=FCrnberg (Germany) \ Consulting+Programming+Networking+etc=
PGP: 1024/4F578875 1B 89 E2 1C 43 EA 80 44 15 D2 29 CF C6 C7 E0 D=
Click <A HREF=3D"">here</A>. =