Re: (reiserfs) Re: An elevator algorithm

From: Xuan Baldauf (xuan--reiserfs@baldauf.org)
Date: Sat Sep 16 2000 - 11:31:46 EST

Next message: Michael Eisler: "Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee"
Previous message: Trever Adams: "Re: Very aggressive swapping after 2 hours rest"
In reply to: Ragnar Kjørstad: "Re: (reiserfs) An elevator algorithm"
Next in thread: Hans Reiser: "Re: (reiserfs) Re: An elevator algorithm"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

"Ragnar Kjørstad" wrote:

> On Sat, Sep 16, 2000 at 01:17:53PM +0200, Xuan Baldauf wrote:
> > I'm not a kernel hacker (and therefore I'm not familiar with the kernel
> > terminology), and maybe this idea is already old, but here is an
> > algorithm for an elevator which tries to guarantee smoothness and no
> > stalling:
> >
> > Every rw-request gets an expiry timeout (e.g. in jiffies) where it's
> > completion must have started. Every request is member of two sorted
> > lists which support fast add|remove and iterating to the previous|next
> > member (linked list, binary tree, etc.):
> > The request list sorted by expiry and the request list sorted by block
> > number. When a rw-access is requested, the request gets its timeout and
> > is inserted in those two lists. The elevator has a current request on
> > which it is working. When the elevator is finished, it removes the
> > current request from the two lists and gets the "current time" (in
> > jiffies). If the head of the request list sorted by expiry has a time
> > equal to or smaller than the current time, the elevator continues with
> > that request. Else it continues with the next or previous request in the
> > list sorted by block number. (It can decide which direction, wether to
> > continue with the old direction or wether to always start with a
> > definite direction)
> >
> > This way, you have good elevator characteristics while being somewhat
> > able to guarantee maximum request duration. If the timeout expired, the
> > requested block is served immediately. Only when the system is
> > overloaded, so that the difference between the current time and the
> > oldest expiry timout exceeds a given maximum, the elevator fails. In
> > this case, the system should be throttled (inserting new requests should
> > block), I think. Users could determine the expiry-timeouts so that
> > important applications get shorter timeouts while not-so-important
> > applications which can wait can request a longer timeout.
> >
> > This algorithm is, of course, only per low-level-device.
> >
> > What do you think?
>
> If the load is to high to serve requests within the time-limit, the
> elevator-code will stop working, and everything will slow down.

If the load is temporarily too high (say some jiffies or so), this elevator
will just restart to try to guarantee serve time. From its restart on, it will
work like an usual elevator with no restart. If the load is permanently too
high, you might be right. Maybe we should enlarge the time-frame betweens no
checks for outstanding requests occur, expressed in number of blocks served or
in jiffies. The elevator will only work as an elevator for this time-frame,
then it is likely to restart on overload. The length of the time-frame defines
the trade-off between interactivity and throughput. The longer the time frame,
the more throughput, the less, the higher the interactivity. This time-frame
could be defaulted to 1/4th second (or say 32 blocks served) or so.

>
>
> You should not serve a request imidiately when it's too old (because the
> requests supposed to be served first according to the elevator is likely
> to become too old soon, and then you only add more seeking), but only
> stop inserting new requests before it.
>
> If I understand the current code correctly, it works like this:
>
> Current queue:
>
> 02:04:05:06:09:15 # sector to be written to
> 05:03:02:04:00:01 # request-nr
>
> In this example the "timeout" is 5 requests, so a new request can never
> be placed before a existing request with request-nr < new-request-nr-5;
>
> One request is served (from the head of the queue) and a request to
> sector 3 is added:
>
> 04:05:06:09:03:15 # sector to be written to
> 03:02:04:00:06:01 # request-nr
>
> One request is served (from the head of the queue) and a request to
> sector 2 is added:
>
> 05:06:09:03:15:02 # sector to be written to
> 02:04:00:06:01:07 # request-nr
>
> One request is served (from the head of the queue) and a request to
> sector 16 is added:
>
> 06:09:03:15:02:16 # sector to be written to
> 04:00:06:01:07:08 # request-nr
>
> So we've ended up with a very silly queue....

I do not understand you terminology. There is not one queue, there are two
queues. For example, I do not understand why you inserted request to sector 3
between 9 and 15. Certainly there might be a misconception (on both sides).

>
>
> Now, the description of the algorithm said that there was a number
> within each request that was declined by one whenever a new request
> passed it in the queue.

Which number? Do you mean the timeout? Do you mean decline = decrease?

> This will never be used after it becomes
> negative, so it would be the same to decline the number of all the
> requests by 1, right?

You do not want to decrease timeouts of all outstanding request unless you
travel all the request for every timer tick.

> And comparing this changing number to 0 is the
> same as comparing request-numbers, only more work, right? So I assume I
> didn't understand the algorithm correctly :)

I assume that too. Maybe my comments make you understand it now. :o)

I want to explain how my algorithm would work:
One request is described as (<requested sector>,<timeout>), the default expiry
timeout is 10 time units from the insert time on.

The current requests are (here sorted by sector):
(2,15),(4,13),(5,12),(6,14),(9,10),(15,11)
The current time is: 7
The stop-insert-time is 20. (When there is an outstanding request whose
timeout is smaller than current-time minus stop-insert-timeout, no insert
should occur, inserting should block)
Request (2,15) will be served now, because it is the "next" sector (we assume
we started from sector 0)

Sector 3 is added.

The current requests are (here sorted by sector):
(3,17),(4,13),(5,12),(6,14),(9,10),(15,11),
The current time is: 9 (serving needed two time units)
Request (3,17) will be served next because it is next to the request of sector
2 once served and no timeout expired.

Sector 2 is added.
Sector 16 is added. (Overload)

The current requests are (here sorted by sector):
(2,19),(4,13),(5,12),(6,14),(9,10),(15,11),(16,19)
The current time is: 11 (serving needed two time units)
Request (9,10) will be served next because its timeout has expired.
We start the elevator-time-frame, which is 10 time units, so it ends when
time=21. For this time, we do not check for additional outstanding requests.

Sector 7 is added.

The current requests are (here sorted by sector):
(2,19),(4,13),(5,12),(6,14),(7,21),(15,11),(16,19)
The current time is: 14 (serving needed three time units because of longer
seek)
Request (15,11) will be served next not because its timeout has expired, but
because its the next sector in the next-sector direction. The direction (next
sector or previous sector?) has not been defined for this algorithm yet.

Sector 9 is added.

The current requests are (here sorted by sector):
(2,19),(4,13),(5,12),(6,14),(7,21),(9,24),(16,19)
The current time is: 17 (serving needed three time units because of longer
seek)
Request (16,11) will be served next because its the next sector in the
next-sector direction.

Sector 8 is added.

The current requests are (here sorted by sector):
(2,19),(4,13),(5,12),(6,14),(7,21),(8,27),(9,24)
The current time is: 19 (serving needed two time units)
There are no outstanding sectors in the choosen direction. Because I currently
do not know how to choose direction, I simply revert direction. In this
special case, I prematurely "clear" the elevator-time-frame.
Request (5,12) will be served next because its the oldes request hanging
around and there is no elevator-time-frame set. We (re) start the
elevator-time-frame, because it is not currently set. It ends then time=29.

To make it short, next served requests will be (5,12),(4,13),(2,19), after
that, time=26. Because of direction inversion, the elevator-time-frame is
cleared again, (6,14) will be served next because it's the oldes request, the
elevator-time-frame is set again (now to 36), (6,14), (7,21), (8,27), (9,24)
will be served, time will be 37. At this point the elevator-time-frame has
expired and the elevator continues serving at the request with the oldest
timeout (as long as the oldest one already has expired) and ignores its
current position.

Unfortunately, I could not really demonstrate the elevator-time-frame and the
stop-insert-time, but I hope you get what I mean. :o)

>
>
> Now, lets do the same test with my suggested multiple queue approach:
>
> Current queue (full):
> 02:04:05:06:09:15 # sector to be written to
> 05:03:02:04:00:01 # request-nr
>
> In this example the "timeout" is 5 requests, so only 6 requests can be
> inserted into each queue.
>
> One request is served (from the head of the queue) and a request to
> sector 3 is added:
>
> Current queue (full)
> 04:05:06:09:15 # sector to be written to
> 03:02:04:00:01 # request-nr
> Second queue:
> 03 # sector to be written to
> 06 # request-nr
>
> One request is served (from the head of the queue) and a request to
> sector 2 is added:
>
> Current queue (full)
> 05:06:09:15 # sector to be written to
> 02:04:00:01 # request-nr
> Second queue:
> 02:03 # sector to be written to
> 07:06 # request-nr
>
> One request is served (from the head of the queue) and a request to
> sector 16 is added:
>
> Current queue (full)
> 06:09:15 # sector to be written to
> 04:00:01 # request-nr
> Second queue:
> 02:03:16 # sector to be written to
> 07:06:08 # request-nr
>
> looks much better, doesn't it?

It does look good. One active queue and one future active queue IIRC. It just
does not take into account that after serving sector 2, the new sector 3 (in
the second queue) could easily be served before sector 4. But because this is
minor, I could live with that. Your algorithm guarantees serve time for the
requests in the current queue.

>
>
> But then again, maybe I just didn't understand how the current code
> works...

I do not know also. I just read about elevators and problems and tried to
visualize the problem as a 2D diagram (x=time, y=sector) where the time moves
on and sectors
which were not served should not be forgetten within a given time.

> I'm going to shut up now..

Don't. :o)

>
>
> --
> Ragnar Kjørstad

Xuân. :o)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

Next message: Michael Eisler: "Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee"
Previous message: Trever Adams: "Re: Very aggressive swapping after 2 hours rest"
In reply to: Ragnar Kjørstad: "Re: (reiserfs) An elevator algorithm"
Next in thread: Hans Reiser: "Re: (reiserfs) Re: An elevator algorithm"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

This archive was generated by hypermail 2b29 : Sat Sep 23 2000 - 21:00:13 EST