Re: Asynch I/O broken in 2.2.15

From: Jeff V. Merkey (jmerkey@timpanogas.com)
Date: Thu Apr 13 2000 - 10:51:24 EST


Steve Dodd wrote:
>
> On Wed, Apr 12, 2000 at 05:50:47PM -0600, Jeff V. Merkey wrote:
>
> > 1. Linux 2.2.15 and 2.3.99 do not support true Asynch IO.
> [..]
>
> That's not my interpretation, certainly:
>
> - You ('one') set up a bunch of buffer_head structures
> - You submit them to ll_rw_block. For each bh, it passes it to the block
> dev driver or queues it, depending on the state of the device.
> - You do the run_task_queue(tq_disk) thing to start I/O on any bhs that
> were queued by ll_rw_block (this presumably allows requests on the queue
> to be coalesced, reordered, etc.)
> - The driver calls bh->b_end_io whenever it completes I/O on a bh.
> - You do something else, or schedule(), etc., while you wait..
>
> The b_end_io function can make use of b_wait to wake up a process when the
> I/O completes. That seems pretty async to me. I'll admit that the
> run_task_queue(tq_disk) could perhaps be hidden behind a macro with a clearer
> name.
>

Yes, but on SMP systems you have to hold the kernel lock over the
entire I/O operation, including the callback, or 2.2.15 crashes. That
means you have to batch up a bunch of buffer heads for each instance in
which you hold the kernel lock, and then hold the lock until the I/O
completes. This blocks all callers above you -- it is not very parallel
or asynchronous.
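To make that concrete, here is roughly what a caller is forced into
today. This is only a sketch against the 2.2-era interfaces --
my_end_io() and my_submit() are names I made up, not kernel functions:

#include <linux/fs.h>
#include <linux/locks.h>
#include <linux/tqueue.h>
#include <linux/smp_lock.h>

/* completion callback: runs at the end of each bh's I/O */
static void my_end_io(struct buffer_head *bh, int uptodate)
{
    mark_buffer_uptodate(bh, uptodate);
    unlock_buffer(bh);              /* clears BH_Lock, wakes bh->b_wait */
}

/* submit nr blocks (nr <= 32 here) and wait for all of them */
static void my_submit(kdev_t dev, int *blocks, int nr, int size)
{
    struct buffer_head *bhs[32];
    int i;

    lock_kernel();                  /* has to cover the WHOLE operation,
                                       callback included, or 2.2.15
                                       falls over on SMP */
    for (i = 0; i < nr; i++) {
        bhs[i] = getblk(dev, blocks[i], size);
        bhs[i]->b_end_io = my_end_io;
    }
    ll_rw_block(READ, nr, bhs);     /* queue the requests            */
    run_task_queue(&tq_disk);       /* kick the queue                */
    for (i = 0; i < nr; i++)
        wait_on_buffer(bhs[i]);     /* block until they all complete */
    unlock_kernel();                /* everyone above us was blocked */

    for (i = 0; i < nr; i++)
        brelse(bhs[i]);
}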

> > The way it's implemented in Linux requires that the bottom half ISR be
> > "polled" within wait_on_buffer() or some other function to initiate the
> > I/O and block until the I/O returns -- how is this asynchronous I/O?
>
> I don't think wait_on_buffer polls anything; it starts any pending I/O
> and then sleeps; the b_end_io callback wakes it up when the I/O is finished.
> The only thing I'm not clear on is the purpose of the do / while loop --
> under what circumstances would it get woken up without the I/O having
> completed?
>
> --

It calls schedule() like a demon from hell in a "polling loop" while it
sucks up processor cycles calling run_task_queue(&tq_disk) repeatedly.
Go look at the code -- buffer.c line 133.
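For reference, the loop in question looks roughly like this -- this is
paraphrased from memory from 2.2 fs/buffer.c, so check the real source
for the exact details:

void __wait_on_buffer(struct buffer_head *bh)
{
    struct task_struct *tsk = current;
    struct wait_queue wait = { tsk, NULL };

    bh->b_count++;
    add_wait_queue(&bh->b_wait, &wait);
    do {
        run_task_queue(&tq_disk);   /* re-kicked on EVERY pass       */
        tsk->state = TASK_UNINTERRUPTIBLE;
        if (!buffer_locked(bh))
            break;
        schedule();                 /* sleep; the b_end_io path wakes
                                       bh->b_wait when I/O finishes  */
    } while (buffer_locked(bh));
    tsk->state = TASK_RUNNING;
    remove_wait_queue(&bh->b_wait, &wait);
    bh->b_count--;
}

Every trip around that loop calls run_task_queue(&tq_disk) again,
whether or not there is anything left to start.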


The way it works right now, if you A) send it too many buffer heads at
one time (like 4000) or B) fail to hold lock_kernel() over the entire
I/O, including the callback, the Linux swapper crashes with corrupted
memory.

IMHO this is busted. You should be able to inject buffer heads without
A) holding a sleep lock over the entire I/O or B) forcing serialization
in upper-layer FS's by gating everyone through the kernel lock. And C)
the disk queues for servicing requests should "kick" the buffer head
into the driver the first time someone puts a disk request on the
queue, then let the I/O completion interrupts feed the entries off the
list (i.e. each I/O-complete interrupt picks up the next pending
request). When the queue is empty, the driver kicks it back to the I/O
layer until someone puts a request on it again. We should also allow
folks to continue to queue on the disk head while the driver has it,
until the driver empties it. This would produce an event-driven
asynchronous driver interface that's fast and does not need polling
methods to feed it I/O. What's there now supports async only under the
kernel lock, which limits parallelism.
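Roughly, in code -- every name here is invented for illustration, none
of this exists in the kernel, and locking is omitted:

#include <linux/fs.h>

struct disk_queue {
    struct buffer_head *head, *tail; /* pending requests             */
    int owned_by_driver;             /* does the driver own the queue? */
};

extern void driver_start_io(struct buffer_head *bh); /* hw-specific  */

static struct buffer_head *dq_pop(struct disk_queue *q)
{
    struct buffer_head *bh = q->head;

    if (bh && !(q->head = bh->b_next))
        q->tail = NULL;
    return bh;
}

/* Upper layer: queue a bh and return.  Only the first request on an
 * empty queue "kicks" the driver; nobody sleeps or polls here.      */
void submit_bh_async(struct disk_queue *q, struct buffer_head *bh)
{
    bh->b_next = NULL;
    if (q->tail)
        q->tail->b_next = bh;
    else
        q->head = bh;
    q->tail = bh;

    if (!q->owned_by_driver) {
        q->owned_by_driver = 1;
        driver_start_io(dq_pop(q));  /* first request kicks the bh down */
    }
}

/* Driver: each I/O-complete interrupt feeds the next pending request
 * off the list; when the list is empty the driver kicks the queue
 * back to the I/O layer until someone queues on it again.           */
void driver_io_complete(struct disk_queue *q, struct buffer_head *bh)
{
    bh->b_end_io(bh, 1);             /* event: tell the submitter    */
    bh = dq_pop(q);
    if (bh)
        driver_start_io(bh);         /* completion feeds the queue   */
    else
        q->owned_by_driver = 0;      /* hand the queue back upstairs */
}

Upper layers keep queueing on the disk head the whole time the driver
owns it, and nothing above the driver ever spins calling
run_task_queue() to feed it I/O.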

Jeff

