Direct I/O

From: Alexander S A Kjeldaas (Alexander.Kjeldaas@fast.no)
Date: Thu Mar 09 2000 - 13:56:46 EST


I'm looking for opinions about how direct I/O can be implemented on
Linux. I am interested in a design that does not touch the
page-tables of the process, and where the data is DMAed directly to
user-space visible pages.

My goal is to have an interface that lets me DMA directly from disk
into the CPU-cache using a combination of direct I/O on a static
buffer (held in the CPU cahce), and bus-snooping catching the new data
as it flies past the CPU during the DMA (if the CPU is able to get to
the data on the bus during snooping).

Searching for previous discussions on this, I found that some thought
that DMAing into user space is a nono because of the inability of the
kernel to choose "DMA-friendly" pages on behalf of the user process. I
have given this some thought and came up with an alternative using
mmap.

The idea might be ugly, but it might be better than nothing :-)

o An mmap hint (MAP_BLOCK?) that requests that the mmap should block
until the region requested has been read into memory.

o An mmap flag (MAP_NOSYNC?) that sais that we want to write to the
returned memory area, but we don't want those changes to be written
back to file (this means that the returned memory area works lika a
normal buffer used when read()ing).

o An mmap hint (MAP_DIRECT? or O_DIRECT taken from the struct file*)
that sais that we do not want copy-on-write semantics on the
pages. Thus we do not want the pages given to us by mmap to be
registered with the page cache.

o An mmap optimization that works like this: If MAP_FIXED is given and
there already is a mapping at the address, reuse the pages from that
particular mapping when calling readpage() instead of allocating a new
page. This means that we don't need to do any page-flipping and
invalidate tlb entries during the mmap.

The point of the above is to use an initial mmap MAP_DIRECT to get the
kernel to allocate "DMA-friendly" pages. After the kernel has done
that, use MAP_FIXED|MAP_DIRECT|MAP_BLOCK on the "DMA-friendly" pages
to simulate a direct I/O read() that bypasses the page cache and gets
the result directly to user space. In order to do some processing of
the data, MAP_NOSYNC might be required, but MAP_NOSYNC should be
essentially free if MAP_DIRECT is used since the pages are not owned
by the page cache anyway.

Could something like the above be an acceptable alternative to direct
I/O to user-space-provided pages?

astor

-- 
Alexander Kjeldaas                Mail:  astor@fast.no
Systems Engineer                  Web:   http://www.fast.no/       
Fast Search & Transfer ASA        Phone: +47 21 61 02 00
P.O. Box 4452                     Fax:   +47 21 61 02 01
NO-7418 Trondheim, NORWAY         Mob:   +47 922 19 380

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Wed Mar 15 2000 - 21:00:16 EST