Re: I/O request ordering

Ray Van Tassle-CRV004 (
Fri, 9 Aug 1996 9:10:41 -0500

> From: on Fri, Aug 9, 1996 3:08 AM
> Subject: z-Re: I/O request ordering
> Old-Subject: Re: I/O request ordering
> Precedence: bulk
> Hi,
> [Long, but there's a lot to talk about...]
----- much of which is omitted here ----

> > Yes, I can't stress enough that this must be THROUGHLY tested and
> > benchmarked.
> Heheheh. You ain't kidding here.
> > My thoughts here are to do all these:
> ---- details deleted....
> I disagree. Maybe I'm just a pessimist, but I really think you need
> to implement it, throw it at a thousand odd beta testers for
> evaluation, put it into the current development kernel, play with it
> for a dozen releases while you get lots and lots of negative feedback,
I don't need anonymous beta testers to tell me I'm a f**ked up--I have two
teenagers! They tell me that all the time!
> tweak it all the time, then throw it out. :)
And get banished from kernel hacking?
"Flail away, and BEAT the software into submission!" Not if I can help it!
I think I'll do some preliminary testing before tossing it over the wall to
the waiting hordes.

> People HAVE tried to improve IO request scheduling in the past. The
> conclusions seem to be that yes, you can improve things at the device
> request layer IF you know exactly what the physical characteristics of
I really started this issue while scrutinizing ll_rw_blk.c and buffer.c, to
find and fix the deadlock with loop device. The code in add_request has a
"funny smell", and the comment said "elevator" but it didn't look like one.
Profiling showed add_request to be much higher than seemed reasonable. And
the current code seems to be overly complicated. And I think elevator would
be better than sawtooth.
Linus's concern was if this would hurt performance of clustered
reads--especially loading an executable. I don't *think* it would, and your
comments seem to support that. But I also don't *know* what happens when an
executable gets loaded. This clearly calls for coding it up and testing it,
to see what effect (if any) a different algorithm has.
Just on the face, my new inner loop has 2 (sometimes 3) tests, and the
existing code has 9.
And sometimes (in fact--_many_ times), simple things, like ordering or
scanning a list differently, or unrolling a loop, can give sunstantial

> the disk --- cylinder format, seek timings, command setup latencies
> and rotation speed --- are. Otherwise, you are much better off trying
Yes, it's easy to loose perspective, and spend a dollar to save a dime (or
"spend a pound to save a shilling" .......... or something)

> to optimise the parts of the system which are generating the requests
> in the first place, by introducing techniques such as the
> log-structured filesystem (but that's another can of worms), adaptive
> readahead and by optimising the requesting of buffer writes in the
> first place.
As you said, buffer writes are largely a non-issue. Most of it is done by
bdflush, and timeliness isn't very important.
Adaptive readahead sounds interesting. And HARD.

> Cheers,
> Stephen.
> --
> Stephen Tweedie <>
> Department of Computer Science, Edinburgh University, Scotland.
As soon as get the loop deadlock fixes off to Linus, and release the loop
driver optimizations, and figure out (or get somebody else to figure out)
how to get Linux to boot from a loop device, and install the AMD 586-133,
and play some QUAKE.......oh, and convince friend wife that I don't either
love the computer better than her..... After all this, I'll delve into
ordering I/O requests better.

> 1] Chris Ruemmler and John Wilkes, "UNIX Disk Access Pattern". HP
> Laboratories Technical Report #HPL-92-152, December 1992. Published
> in Proceedings of the Usenix Winter 1993 Technical Conference.
> 2] Margo Seltzer, Peter Chen and John Ousterhout, "Disk Scheduling
> Revisited". Computer Science Division, UCB.
I'll look these up.