Re: Heuristic readahead for filesystems

From: jw schultz (jw@pegasys.ws)
Date: Thu Sep 12 2002 - 08:35:47 EST


On Thu, Sep 12, 2002 at 07:41:09AM -0400, Richard B. Johnson wrote:
> On Wed, 11 Sep 2002, jw schultz wrote:
>
> > On Wed, Sep 11, 2002 at 03:21:37PM -0400, Richard B. Johnson wrote:
> > > On Wed, 11 Sep 2002, Oliver Neukum wrote:
> > > > No, it won't. But it would solve the issue of reading ahead.
> > > > Stating needs a kernel implementation of 'stat ahead'
> > > > -
> > >
> > > I think this is discussed in the future. Write-ahead is the
> > > next problem solved. ?;)
> >
> > Gating back to the original issue which was "readahead" of
> > stat() info...
> >
> > The userland open of a directory could trigger an advance
> > reading of the directory data and of the inode structs of
> > all it's immediate members. Almost all instances of a
> > usermode open on a directory will be doing fstats. Even a
> > command line ls often has options (colour, -F, etc) turned on
> > by default that require fstat on all the entries.
> > The question would be how far ahead of the user app would
> > the kernel be.
> >
> > I could possibly see having a fcntl() for directories to
> > pre-read just the first block of each file to accelerate
> > file-managers that use magic and perhaps forestall readahead
> > pulling in more than magic will use.
>
> Then you are tuning a file-system for a single program
> like `ls`. Most real-world I/O to file-systems are not done
> by `ls` or even `make`. The extra read-ahead overhead is

Most real-world filesystem I/O doesn't open(2) directories.
Most filesystem I/O is stat, open and unlink of files with
full paths, no open(2) of the directory. Notice that i
refer to the system call not internal functions that path
lookup invoke. The list of things that open(2) directories
is very short and almost all of them stat the the majority
of the directory's contents.

> just that, 'overhead'. Since the cost of disk I/O is expensive,
> you certainly do not want to read any more than is absolutely
> necessary. There had been a lot so studies about this in the
> 70's when disks were very, very, slow. The disk-to-RAM speed
> ratio hasn't changed much even though both are much faster.
> Therefore, the conclusions of these studies, made by persons
> from DEC and IBM, should not have changed. From what I recall,
> all studies showed that read-ahead always reduced performance,
> but keeping what was already read in RAM always increased
> performance.

I'm sure there will be others that can show you the numbers.
Things have changed since those studies. Disks are still
slooooooooow. However the OS doesn't have to nursemaid the
transfer. In most cases we queue requests and receive an
interrupt when the data is in memory. _IF_ the disk
isn't otherwise kept busy readahead reduces latency.
Most of the associated blocks have near proximity so there
is a good chance to do the readahead in a minumum number of
requests if they are fed to the queues in a batch.

> > The question would be how far ahead of the user app would
> > the kernel be.
I repeat this because i think it is a central point. Much
of the I/O associated with directory scanning is in tight
loops that would mirror the kernel's behavior. I have
doubts that it would produce a performance boost because it
might just be a synchronized duplication of effort.

I am not advocating doing this. I was just exploring the
idea and bringing the thread back to the original question.

-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw@pegasys.ws

Remember Cernan and Schmitt - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Sep 15 2002 - 22:00:29 EST