Re: Is any file system on Linux appropriate for very large directories?

Kevin M Bealer (kmb203@psu.edu)
Thu, 18 Jul 1996 01:40:23 -0400 (EDT)


On Sat, 13 Jul 1996, Matthias Urlichs wrote:

> 
> > that uses some kind of hashing for name lookup! A quick review of the
> > file systems currently available on Linux suggests that the only one
> > that uses hashing is the Amiga file system. I don't mean to be
> 
> A few others use B trees. The problem is that B trees are rather difficult
> to write correctly (and doubly so in a multitasking environment); this is
> why HPFS is read-only and HFS is only somewhat-write. At the moment.
> 
> -- 
> The reason the way of the transgressor is hard is because it's so
> crowded.
> -- Kin Hubbard
> -- 
> Matthias Urlichs \ noris network GmbH / Xlink-POP Nürnberg
> Schleiermacherstraße 12 \ Linux+Internet / EMail: urlichs@noris.de
> 90491 Nürnberg (Germany) \ Consulting+Programming+Networking+etc'ing
> PGP: 1024/4F578875 1B 89 E2 1C 43 EA 80 44 15 D2 29 CF C6 C7 E0 DE
> Click <A HREF="http://info.noris.de/~smurf/finger">here</A>. 42
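
For reference, the hashed name lookup being discussed amounts to keeping
directory entries in hash-chained buckets, so a lookup walks one short
chain instead of scanning the whole directory. A rough user-space sketch
(the bucket count, hash function, and structure names here are made up
for illustration, not any real filesystem's on-disk format):

/* Minimal sketch of hash-based directory name lookup. */
#include <string.h>

#define NBUCKETS 64            /* arbitrary for this sketch */

struct dentry {
    char *name;
    unsigned long inode;
    struct dentry *next;       /* chain of entries whose names hash alike */
};

struct directory {
    struct dentry *bucket[NBUCKETS];
};

/* Simple string hash; a real implementation would pick something better. */
static unsigned int hash_name(const char *name)
{
    unsigned int h = 0;
    while (*name)
        h = h * 31 + (unsigned char)*name++;
    return h % NBUCKETS;
}

/* Lookup touches only one (hopefully short) chain instead of
 * scanning every entry in the directory. */
struct dentry *dir_lookup(struct directory *dir, const char *name)
{
    struct dentry *d;

    for (d = dir->bucket[hash_name(name)]; d != NULL; d = d->next)
        if (strcmp(d->name, name) == 0)
            return d;
    return NULL;
}

An insert just hashes the new name and links it into the right bucket,
so lookup cost stays roughly flat no matter how large the directory
grows, until the chains themselves get long.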

I've always seen a sorted list as a binary tree with less overhead
(space-wise). I would keep the directory as a sorted list and tack newly
added files onto the end, unsorted. If you keep a count of the sorted
names and of the total names, you can binary-search the sorted part and
then linear-search the rest... You could re-sort intelligently, for
example whenever an entry is added and any of the following holds
(a rough C sketch of the idea follows the list):
A. more than (a)% of the entries are unsorted, or
B. the last sort was (b) days ago, or
C. the last sort was (c * n) accesses ago, where n == number of files, or
D. the number of hits on unsorted files is > d
   (good for a Netscape cache, where most files are likely never looked
   at again after they are written).
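
Something like this is what I have in mind; it is only a user-space
sketch with made-up names and thresholds (struct fakedir, a 10% trigger
for heuristic A, a 32-hit trigger for heuristic D), not how a real
filesystem would lay its entries out on disk:

/* Sorted prefix + unsorted tail. Assumes the struct starts zeroed. */
#include <stdlib.h>
#include <string.h>

struct fakedir {
    char   **name;           /* entries 0..nsorted-1 are sorted, rest are not */
    size_t   nsorted;        /* count of sorted names */
    size_t   ntotal;         /* count of all names */
    size_t   cap;
    size_t   unsorted_hits;  /* lookups that had to fall through to the tail */
};

static int cmp(const void *a, const void *b)
{
    return strcmp(*(char *const *)a, *(char *const *)b);
}

/* Binary-search the sorted prefix, then linear-search the unsorted tail. */
static long dir_find(struct fakedir *d, const char *name)
{
    size_t lo = 0, hi = d->nsorted, i;

    while (lo < hi) {                        /* binary search on [0, nsorted) */
        size_t mid = lo + (hi - lo) / 2;
        int c = strcmp(name, d->name[mid]);
        if (c == 0)
            return (long)mid;
        if (c < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    for (i = d->nsorted; i < d->ntotal; i++) {   /* linear scan of the tail */
        if (strcmp(name, d->name[i]) == 0) {
            d->unsorted_hits++;
            return (long)i;
        }
    }
    return -1;
}

/* Heuristic A: more than ~10% of entries unsorted; heuristic D: too many
 * lookups have landed in the tail. Both thresholds are invented here. */
static void dir_maybe_resort(struct fakedir *d)
{
    if ((d->ntotal && (d->ntotal - d->nsorted) * 10 > d->ntotal) ||
        d->unsorted_hits > 32) {
        qsort(d->name, d->ntotal, sizeof(char *), cmp);
        d->nsorted = d->ntotal;
        d->unsorted_hits = 0;
    }
}

static void dir_add(struct fakedir *d, const char *name)
{
    if (d->ntotal == d->cap) {
        d->cap = d->cap ? d->cap * 2 : 8;
        d->name = realloc(d->name, d->cap * sizeof(char *));
    }
    d->name[d->ntotal++] = strdup(name);     /* appended unsorted at the end */
    dir_maybe_resort(d);
}

The point is that a lookup stays O(log n) plus a short linear tail, and
the expensive re-sort only happens when one of the triggers fires.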

__kmb203@psu.edu_________________________Debian__1.1___Linux__2.0.5___
Probable-Possible, my black hen,
She lays eggs in the Relative When.
She doesn't lay eggs in the Positive Now
Because she's unable to postulate how.
-- Frederick Winsor