Re: elevator messages in 2.3.50

From: Andrea Arcangeli (andrea@suse.de)
Date: Thu Mar 09 2000 - 12:05:56 EST


On Thu, 9 Mar 2000, Jamie Lokier wrote:

>Andrea Arcangeli wrote:
>> [..] Any kind of guess will lose and will harm
>> the fast path.
>
>There should be a question: does the guess result in better overall I/O
>performance?

Not if you don't know that another near request will arrive as soon as
the current one completes.

>[..] Waiting for a new request in the holdoff time only wins
>if you get one and the elevator chooses it in preference to the request
>that was waiting. [..]

Right (or more carefully: _possible_). For example, that's obviously
wrong if the long seek lands on another disk (hardware RAID, for example),
and it may be wrong for some other hardware too, but as I said I was just
assuming we would learn the holdoff time through interaction with the
low-level block device.
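
To make that trade-off concrete, here is a toy cost model in Python. It is
only a sketch under assumptions made for illustration: the function names,
the choice to always charge the full holdoff, and the millisecond figures
are hypothetical and come from no measurement or kernel code.

```python
def holdoff_expected_gain(p_near, holdoff_ms, seek_saved_ms):
    """Expected time saved (ms) per long seek by waiting holdoff_ms first.

    Assumption: the full holdoff is always paid, and a near request that
    shows up in time saves seek_saved_ms of head movement.
    """
    if not 0.0 <= p_near <= 1.0:
        raise ValueError("p_near must be a probability")
    return p_near * seek_saved_ms - holdoff_ms


def break_even_probability(holdoff_ms, seek_saved_ms):
    """Smallest p_near at which waiting stops being a net loss."""
    if seek_saved_ms <= 0:
        raise ValueError("seek_saved_ms must be positive")
    return holdoff_ms / seek_saved_ms
```

With a 2 ms holdoff and a 10 ms seek saved, the guess only pays off in this
model if a near request shows up more than 20% of the time, and without a
hint from the fs nothing tells the elevator what that probability is.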

> [..] And that may only happen in the same cases that
>trigger readahead/readaround anyway.

As I just said, readahead (read(2)) and readaround (pageins) by design
don't need such logic at all. Instead, they cancel out the benefit of that
heuristic.

The heuristic makes sense only where we can give a hint and where we can't
do readahead/readaround, and that's the case of the indirect blocks. And
even the indirect blocks case is an issue only during startup and only in
pathological cases. Once the indirect blocks are in cache it's not an
issue anymore, so it doesn't even look like a real issue to me if you
have enough memory to keep the very hot information in cache.
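
A small Python sketch of why this is mostly a startup cost. It assumes an
ext2-like layout (12 direct pointers in the inode, then single, double and
triple indirect blocks) with 256 pointers per indirect block, i.e. 1 KiB
blocks. The function names are made up for the example and are not kernel
functions.

```python
DIRECT = 12  # direct pointers in the inode (ext2-like layout, an assumption)


def indirect_path(logical_block, ptrs=256):
    """Return keys for the indirect blocks needed to map logical_block.

    Each key must be read, in order, before the next one is known:
    that is the "read, wait, read next" pattern.
    """
    n = logical_block - DIRECT
    if n < 0:
        return []  # reachable straight from the inode
    span = ptrs
    for depth in (1, 2, 3):  # single, double, triple indirection
        if n < span:
            return [(depth, level, n // ptrs ** (depth - level))
                    for level in range(depth)]
        n -= span
        span *= ptrs
    raise ValueError("block is beyond triple indirection")


def sync_reads(logical_blocks, ptrs=256):
    """Count indirect-block reads for a run of lookups sharing one cache."""
    cache = set()
    reads = 0
    for block in logical_blocks:
        for key in indirect_path(block, ptrs):
            if key not in cache:  # cold: one synchronous read
                cache.add(key)
                reads += 1
    return reads
```

In this model, mapping all 256 blocks behind the single indirect block
costs one synchronous read, however many times they are looked up
afterwards.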

>> I remember you said the larger the seek is, the more probability another
>> request will happen in the middle. Ok. But the whole point is that you
>> _can't_ know if there will be another request any time soon unless the fs
>> tells you about that. The FS is the only entity that can know that
>> _sometime_.
>
>Well _my_ point (;-) is that especially for page-ins, _nobody_ knows if
>there will be another request soon. And that situation has another

The pagein has to discover where the binary data block is. To do that you
may have to follow indirect blocks. So really the heuristic can fit inside
pageins too.

For the binary _data_, the readaround takes care of that, and the indirect
blocks are going to be read only once anyway, as said above (and really
binaries are usually too small to need to fault on many levels of
indirection, so the heuristic is not going to help pageins at all).

>> It's related IMHO because that's nearly the only place where we can't do
>> readahead and where we do "near" sync requests at regular intervals
>> (read, wait, read next, wait, read next).
>
>No it's not related because that can be solved by fixing the fs :-)

That's my whole point. If you fix the fs the heuristic is not necessary
anymore.

>At the moment, that doesn't happen and background I/O will swamp even an
>attempt at interactive priority I/O due to the device forever seeking.
            ^^^^^^^^^^^^^^^^^^^^^^^^

You are definitely talking about 2.2.x. 2.3.x recognizes pageins and puts
them at the top of the queue despite the background I/O. Try it and give
me testcases (instead of words :) if that's not true. Interactiveness is
just fine in 2.3.49 (on IDE at least) as far as I can tell.
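
For illustration only, the "top of the queue" idea can be sketched in a few
lines of Python. This is a toy model and not the 2.3.x elevator code: the
(kind, sector) tuples, the "pagein"/"bg" tags and the enqueue() helper are
all hypothetical.

```python
def enqueue(queue, request):
    """Insert request into queue (a list) and return the queue.

    Pageins go ahead of every background request but keep FIFO order
    among themselves; everything else is appended at the tail.
    """
    kind, _sector = request
    if kind == "pagein":
        pos = 0
        # skip over pageins that are already waiting at the head
        while pos < len(queue) and queue[pos][0] == "pagein":
            pos += 1
        queue.insert(pos, request)
    else:
        queue.append(request)
    return queue
```

Queuing two background writes and then two pageins leaves the pageins at
the head, in arrival order, so they do not wait behind the background I/O.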

The reason you are not getting the bad seek behaviour you are worried
about in 2.3.49 is readahead/readaround, plus the cache, which lets us
fault in the indirect blocks only once in a while.

Again: if you have real-life seek problems with 2.3.49 let me know and I'll
be glad to work on fixing them, but I currently don't buy the holdoff time
between large seeks as a solution for the seeks, since it would harm the
common case too much, and for the indirect blocks case (with fs
interaction) it doesn't seem worthwhile. I can certainly change my mind in
the future if real life proves me wrong, but that's my current opinion
with the picture I have in mind.

Thanks.

Andrea
