Re: [PATCH] io stalls

From: Nick Piggin (
Date: Mon Jun 09 2003 - 22:22:37 EST

Robert White wrote:

>[]On Behalf Of Nick Piggin
>>Chris Mason wrote:
>>>The major difference from Nick's patch is that once the queue is marked
>>>full, I don't clear the full flag until the wait queue is empty. This
>>>means new io can't steal available requests until every existing waiter
>>>has been granted a request.
>>Yes, this is probably a good idea.
>Err... wouldn't this subvert the spirit, if not the warrant, of real time
>scheduling and time-critical applications?

No, my patch (plus Chris' modification) change request allocation
from an overloaded queue from semi random (timing dependant mixture
of LIFO and FIFO), to FIFO.

As Chris has shown, can cause a task to be starved for 2.7s (and
theoretically infinite) when it should be woken in < 200ms under
similar situations with the FIFO scheme.

>After all we *do* want to all-but-starve lower priority tasks of IO in the
>presence of higher priority tasks. A select few applications absolutely
>need to be pampered (think ProTools audio mixing suite on the Mac etc.) and
>any solution that doesn't take this into account will have to be re-done by
>the people who want to bring these kinds of tasks to Linux.
>I am not most familiar with this body of code, but wouldn't the people
>trying to do audio sampling and gaming get really frosted if they had to
>wait for a list of lower priority IO events to completely drain before they
>could get back to work? It would certainly produce really bad encoding of
>live data streams (etc).

Actually, there is no priority other than time (ie. FIFO), and
seek distance in the IO subsystem. I guess this is why your
arguments fall down ;)

>>From a purely queue-theory stand point, I'm not even sure why this queue can
>become "full". Shouldn't the bounding case come about primarily by lack of
>resources (can't allocate a queue entry or a data block) out where the users
>can see and cope with the problem before all the expensive blocking and

In practice, the problems of having a memory size limited queue
outweigh the benefits.

>Still from a pure-theory standpoint, it would be "better" to make the wait
>queues priority queues and leave their sizes unbounded.
>In practice it is expensive to maintain a fully "proper" priority queue for
>a queue of non-trivial size. Then again, IO isn't cheap over the domain of
>time anyway.

If IO priorities were implemented, you still have the problem of
starvation. It would be better to simply have a per process limit
on request allocation, and implement the priority scheduling in
the io scheduler.

I think you would find that most processes do just fine with
just a couple of requests each, though.

>The solution proposed, by limiting the queue size sort-of turns the
>scheduler's wakeup behavior into that priority queue sorting mechanism.
>That in turn would (it seems to me) lead to some degenerate behaviors just
>outside the zone of midline stability. In short several very-high-priority
>tasks could completely starve out the system if they can consistently submit
>enough request to fill the queue.
>[That is: consider a bunch of tasks sleeping in the scheduler because they
>are waiting for the queue to empty. When they are all woken up, they will
>actually be scheduled in priority order. So higher priority tasks get first
>crack at the "empty" queue. If there are "enough" such tasks (which are IO
>bound on this device) they will keep getting serviced, and then keep going
>back to sleep on the full queue. (And god help you if they are runaways
>8-). The high priority tasks constantly butt in line (because the scheduler
>is now the keeper of the IO queue) and the lower priority tasks could wait

No, they will be woken up one at a time as requests
become freed, and in FIFO order. It might be possible
for a higher (CPU) priority task to be woken up
before the previous has a chance to run, but this
scheme is no worse than before (the solution here is
per process request limits, but this is 2.4).

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at
Please read the FAQ at

This archive was generated by hypermail 2b29 : Sun Jun 15 2003 - 22:00:22 EST