Re: structure dentry help...

Jamie Lokier (lkd@tantalophile.demon.co.uk)
Tue, 2 Nov 1999 15:46:59 +0100


Alexander Viro wrote:
> Whenever you think that you might need the inode soon (e.g. in
> ext2_readdir()) call ext2_want_inode(). Since we are talking about async
> requests we are not going to be blocked - they will be simply inserted
> into queue.

I don't think it can be done from readdir().

> Why do you want to bother with explicit submitting the list? Hint
> the fs that you will need that and let it decide what to do. In the most
> primitive variant readdir() may be considered as a hint. Or you can add a
> fcntl() for directories (I'm not sure that it's worth bothering - it's
> quite possible that we can always do it and win).

By far the greatest overhead is reading the inodes at all!

My program reads only about 5% (my guesstimate) of the inodes. The rest
are elided by the "leaf optimisation" (also in GNU find) or, in my
program, a heuristic-guided variant which finds equivalent solutions
with fewer inode reads.
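
For anyone who hasn't met it: the leaf optimisation uses the directory
link count.  A directory has 2 + N links, N being the number of
subdirectories, so once N subdirectories have turned up among the
entries, none of the remaining names can be a directory and their
inodes never need to be read.  A rough sketch of the idea -- not the
actual code from my program, and it only helps on filesystems like
ext2 where the link count behaves this way:

#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

static void walk(const char *path)
{
    struct stat st;
    struct dirent *de;
    DIR *dir;
    long subdirs_left;

    if (lstat(path, &st) != 0 || !S_ISDIR(st.st_mode))
        return;

    subdirs_left = (long)st.st_nlink - 2;   /* "." and ".." accounted for */

    dir = opendir(path);
    if (!dir)
        return;

    while ((de = readdir(dir)) != NULL) {
        char child[4096];
        struct stat cst;

        if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
            continue;

        if (subdirs_left <= 0)
            continue;   /* all subdirectories found: no inode read needed */

        snprintf(child, sizeof(child), "%s/%s", path, de->d_name);
        if (lstat(child, &cst) == 0 && S_ISDIR(cst.st_mode)) {
            subdirs_left--;
            walk(child);
        }
    }
    closedir(dir);
}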

So it would be really bad to have readdir() do this all the time, and
an fcntl() to turn it on would be useless for anything except `ls -F'
or `ls --colour'.

If the d_type option in ext2 were widely deployed, _that_ could be used
as a hint in this case. At the moment it is not, and until a new
getdents/getdirentries call is added there is no motivation to do so.
(It's not backward compatible with really old kernels.)
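
To illustrate: if d_type did make it out to user space, a tree walk
could ask the filesystem instead of the inode.  A rough sketch,
assuming a kernel/libc combination that actually passes d_type
through, with a fallback to lstat() when the fs reports DT_UNKNOWN:

#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>

static int entry_is_dir(const char *parent, const struct dirent *de)
{
    char path[4096];
    struct stat st;

    if (de->d_type != DT_UNKNOWN)
        return de->d_type == DT_DIR;  /* answered without touching the inode */

    /* No hint from the fs: read the inode after all. */
    snprintf(path, sizeof(path), "%s/%s", parent, de->d_name);
    return lstat(path, &st) == 0 && S_ISDIR(st.st_mode);
}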

[ btw, did you see my patch for adding d_type behaviour to the readdir()
callback for all filesystems? ]

> > I tried to simulate readahead/overlap in user space using threads, each
> > reading a sequential range of inodes pulled from a common list using
> > atomic operations. But it was a net loss in all cases but one, and in
> > that case the gain wasn't significant.
>
> Ouch. So you are creating a lot of threads (with all associated
> context switches, etc.) just to feed a list of async requests to device?
> Small wonder that you are getting a performance hit.

Even with raw clone(CLONE_VM) and just two threads the loss outweighs
the gain. The only case where I got a net gain was with five threads,
and then it was only 1% better -- well within measurement deviation.

I was quite surprised. There is not much context to switch when a call
blocks: the registers are already saved on system call entry. It's
possible that the newer or planned finer-grained fs locks will make the
difference here. I have no plans to repeat the test for a while.
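
For the curious, the experiment was roughly along these lines -- a
simplified sketch, not the code I actually measured: a few
clone(CLONE_VM) workers pull sequential batches of pathnames from a
shared list with an atomic counter and lstat() them, hoping the
requests overlap at the device.

#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/wait.h>

#define BATCH      16
#define STACK_SIZE (64 * 1024)

static char **paths;            /* shared list of names to stat */
static int npaths;
static volatile int next_idx;   /* next unclaimed batch, bumped atomically */

static int worker(void *arg)
{
    (void)arg;
    for (;;) {
        int start = __sync_fetch_and_add(&next_idx, BATCH);
        int end, i;
        struct stat st;

        if (start >= npaths)
            return 0;
        end = start + BATCH < npaths ? start + BATCH : npaths;
        for (i = start; i < end; i++)
            lstat(paths[i], &st);        /* the actual inode read */
    }
}

static void prefetch_inodes(char **list, int n, int nthreads)
{
    int t;

    paths = list;
    npaths = n;
    next_idx = 0;
    for (t = 0; t < nthreads; t++) {
        char *stack = malloc(STACK_SIZE);
        clone(worker, stack + STACK_SIZE, CLONE_VM | SIGCHLD, NULL);
    }
    while (wait(NULL) > 0)
        ;          /* reap the workers; stacks leaked for brevity */
}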

-- Jamie
