Re: problems with large directories - not only ext2

Robert G. Brown (rgb@phy.duke.edu)
Thu, 12 Aug 1999 09:53:03 -0400 (EDT)


On Mon, 9 Aug 1999, Hermann Schichl wrote:

> I'd guess that this is for two reasons:
> a) the -l flag forces ls to stat every single file
> b) not having turned off sorting forces ls to sort the contents of
> the directories in alphabetical order, which might cause creation
> of intermediate temporary files or swapping if size is very big
> (a -R does not improve this situation).
> This is not a situation of Linux (IMHO) but one of UNIX. Most Unices
> show this behaviour. Probably if you try ls -fR, times should go linearly
> with directory size.
>
> But maybe I'm wrong since I did not do too much of testing.

No, I think you're right. You also see this sort of slowing down when
writing e.g. an application that that opens, reads (for processing) and
closes every file in a directory; when there are over "many" (1000's or
so) files in the directory, there is a very nasty, somewhat nonlinear
slowdown. I've experienced it under several Unices and have had to
rewrite the way I did certain things to avoid the penalty.

It is important not to underestimate the burden of overhead associated
with file stat-ing and opening, both in the OS and on e.g. the disk
itself (where seeks are much more expensive than sequential reads and
where stating a few thousand files means MINIMALLY a few thousand seeks
at 5-10 msec/each -- definitely macroscopic time). Note that this can
easily take longer than the actual reading of the file(s). This is one
of several reasons that the tendency of modern Unices (and linux
distributions) to pack "everything" into a single directory (e.g.
/usr/info) worries me. With a few hundred files, performance is ok, but
if that ever scales up to a few thousand (as it might if everything is
truly fully documented and dumped in that one directory) then
performance is going to start to lag a bit.

rgb

Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/