Indirect IO in Linux, some figures

Douglas Gilbert (dgilbert@interlog.com)
Sat, 03 Apr 1999 09:59:20 -0500


After reading the assertion that indirect IO could not handle
100MBytes/sec
I decided to do some testing. Direct IO (as used in the bttv driver)
bypasses kernel buffers normally used for DMA and instead DMAs directly
into the user space. Indirect IO involves the extra step of using
copy_to_user() [or copy_from_user()] to move data from the kernel DMA
buffers to the user space.

The test performed was to read 200MB ("M" == 1024**2) from the disk
buffer
of a fast wide SCSI disk. Two times are given, the first is the elapsed
time to do the DMA and transfer into the user space, the second is the
elapsed time to do the DMA. Different buffer sizes were used. Times are
in seconds ["elapsed time" from 'time sg_rbuf -q /dev/sga'].

Buffer_size to_user_space to_kernel
-------------------------------------
488KB 10.08 7.34
256KB 10.43 7.79
128KB 11.31 8.81
64KB 13.49 10.89
32KB 17.96 14.43
16KB 24.55 22.60
8KB 40.44 38.96
4KB 74.66 71.71

The computer was an AMD K6-2 300MHz, super 7 motherboard (Via) with
64MB of PCI-100 RAM. An Advansys 940UW SCSI adapter was used with an
IBM DCHS04U ultra wide disk.
The test program is called "sg_rbuf" and can be found on
http://www.netwinder.org/~dougg in the utilities page. It uses the
sg device driver to send the READ BUFFER SCSI command. The 488KB
buffer size in the 1st row of the above table is the maximum buffer
size provided by the disk in this test. Either the new sg driver
found in all "ac" versions since 2.2.2-ac6 or the original sg driver
can be used. If the original sg driver is used then the maximum buffer
size is 32KB unless SG_BIG_BUFF is increased (max 128K).

Ultra wide SCSI has a theoretical maximum throughput of 40MB/sec. The
best raw figure above (xfer into the user space) is about 20MB/sec.
If the final copy_to_user() is ignored for that buffer size then this
improves to 27MB/sec. Looking down the table that transfer into the
user space seems to cost about 2.5 seconds in each test. This implies
the throughput of copy_to_user() is about 80MB/sec on my system.

The last row times seem to be swamped by the overhead of the 51,200
READ BUFFER SCSI commands involved. So when the best time is taken into
account this suggests a SCSI command overhead of about 1.2 milliseconds
(seems a bit low to me). If this time is subtracted from the 420
SCSI commands involved in the 1st row the best throughput figure
(xfer to kernel only) improves to about 29MB/sec.

Perhaps someone out there would like to do the same test on faster
hardware, especially a U2W SCSI adapter that has a theoretical
maximum throughput of 80MB/sec.

So I guess the assertion that you need direct IO for 100MB/sec
throughput is supported by these figures.

Doug Gilbert

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/