Re: Asynch I/O broken in 2.2.15

From: Jeff V. Merkey (jmerkey@timpanogas.com)
Date: Thu Apr 13 2000 - 10:58:57 EST


"Stephen C. Tweedie" wrote:
>
> Hi,
>
> On Wed, Apr 12, 2000 at 05:50:47PM -0600, Jeff V. Merkey wrote:
>
> > 1. Linux 2.2.15 and 2.3.99 do not support true Asynch IO.
>
> Yes they do.
>
> > Asynch IO
> > allows a caller to post a very large number of requests, is **NOT**
> > polled, and should be interrupt driven.
>
> There is a limit on the total number of IO requests outstanding in
> the Linux kernel at once, but other than that all of the above works
> as required.
>
> > The first request should "kick"
> > the queue head into the driver, then the driver should feed the next
> > buffer on the IO completion interrupt until the queue head is empty.
>
> That is exactly what happens, with the exception that kick-starting
> an empty queue requires that the caller first submits the IOs, and
> then runs the disk task queue (as otherwise there is no way for the
> kernel to know if you are immediately going to submit another IO
> which could be merged with the first one you submitted).
>
> > The way it's implemented in Linux requires that the bottom half ISR be
> > "polled" within wait_on_buffer() or some other function to initiate the
> > I/O and block until the IO returns -- how is this asynchronous I/O?
>
> No it does not: the bh->b_end_io() function is called asynchronously
> once IO completes. You do not have to wait_on_buffer if you'd prefer
> to get a callback.
>
> > 2. While the semantics appear to be Asynch, the interface is actually
> > synchronous in behavior -- you have to poll the drives after posting IO
> > to start the disks.
>
> No you don't.
>
> > The dependency on the lock_kernel() call on 2.2.15
> > guarantees that most callers will block when attempting multiple calls
> > to ll_rw_blk(), so there went the parallelism. I have noticed that if I
> > hold the kernel lock over the I/O (which is what is happening in
> > filemap.c and buffer.c) then the swapper does not crash the system.
>
> The global kernel lock is automatically dropped by the scheduler if
> the process holding it goes to sleep. If you hold the lock and do
> blocking IO, you do not lose parallelism.
>
> > 3. Feeding more than a handful of requests at one time to THE LINUX
> > BUFFER CACHE (forget NWFS code - I tried this with just Linux) will
> > cause the swapper process to Oops and die. I implemented some very
> > simple code that just calls the buffer cache and tries to do asynch I/O
> > and **KAPOW!!!**, the system crashes. The code is attached.
>
> No wonder, because your code is buggy. It does a getblk() to find
> an existing block in the buffer cache, and scribbles on the b_end_io
> field without doing any locking to see if the buffer is already in
> use by somebody else.
>
> The swapper does true async IO without oopsing. The page cache uses
> true async IO for readahead without oopsing.

I have another segment of code that uses memory I allocated (and that no
one is using or sharing) and it still Oopses. If there is a limit on
outstanding requests, then what is it, and why does this code section
Oops?

Also, your statement about the pages being locked is incorrect: these
are block addresses on NetWare partitions, and no one else should be
using or even touching them (and no one is). How can what you said be
true?
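
For reference, the pattern being argued about (claim the buffer before
installing a completion callback, queue the requests, then kick the
queue once) can be sketched as a user-space toy in C. This is only an
illustration under stated assumptions: it is not kernel code, it does
not use the real struct buffer_head or ll_rw_block(), and every name
starting with toy_ is hypothetical. Completion is simulated by a plain
function call rather than an interrupt.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a buffer head; NOT the kernel's struct buffer_head. */
struct toy_buf {
    int locked;                             /* 1 while a request owns it */
    int uptodate;                           /* set by the completion path */
    void (*end_io)(struct toy_buf *, int);  /* completion callback */
    struct toy_buf *next;                   /* link in the pending queue */
};

static struct toy_buf *toy_head, *toy_tail; /* pending request queue */
static int toy_completed;                   /* callbacks run so far */

/* Claim the buffer; fails if another request already owns it. */
static int toy_try_lock(struct toy_buf *b)
{
    if (b->locked)
        return 0;
    b->locked = 1;
    return 1;
}

/* Completion callback: record the result and release the buffer. */
static void toy_end_io(struct toy_buf *b, int uptodate)
{
    b->uptodate = uptodate;
    b->locked = 0;
    toy_completed++;
}

/* Queue a request without starting it. Returns 0, or -1 if busy. */
static int toy_submit(struct toy_buf *b)
{
    if (!toy_try_lock(b))
        return -1;              /* in use: do not touch end_io */
    b->end_io = toy_end_io;     /* safe, we own the buffer now */
    b->next = NULL;
    if (toy_tail)
        toy_tail->next = b;
    else
        toy_head = b;
    toy_tail = b;
    return 0;
}

/* "Kick" the queue: complete each pending request via its callback. */
static void toy_run_queue(void)
{
    while (toy_head) {
        struct toy_buf *b = toy_head;
        toy_head = b->next;
        if (!toy_head)
            toy_tail = NULL;
        b->end_io(b, 1);
    }
}

/* Submit four requests, kick once, return the number completed. */
static int toy_demo(void)
{
    struct toy_buf bufs[4] = {{0}};
    int i;

    toy_head = toy_tail = NULL;
    toy_completed = 0;
    for (i = 0; i < 4; i++)
        assert(toy_submit(&bufs[i]) == 0);
    assert(toy_submit(&bufs[0]) == -1); /* already owned: refused */
    assert(toy_completed == 0);         /* nothing runs before the kick */
    toy_run_queue();
    for (i = 0; i < 4; i++)
        assert(!bufs[i].locked && bufs[i].uptodate);
    return toy_completed;
}
```

Calling toy_demo() from a main() exercises the whole path. Nothing
completes until the queue is kicked, and a second submit on a buffer
that is still owned is refused instead of overwriting its callback.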

Jeff

>
> --Stephen





This archive was generated by hypermail 2b29 : Sat Apr 15 2000 - 21:00:21 EST