Re: Buffer cache hints

Don Fisher
Sat, 07 Sep 1996 07:28:14 -0700

Is there a way to control the number of bytes mmap() reads on each page
fault? If you are doing something with large data sets (e.g. images) it helps
to perform large xfers. I have been afraid of using mmap when I know my
program will perform random access over 1-4 MBytes of data. I would like to
map the file, read the first byte and fault in the rest with a single read.

Also, if you want to read a file, modify its contents and have the results
go to a second file, is there a way to mmap() 2 files to the same
address space, one file for reads and the other for writes?
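
A minimal sketch of one way to approach the two-file case, assuming a
POSIX system. A given address range is backed by one mapping at a time,
so this sketch uses two separate mappings instead: the input file mapped
read-only and the output file mapped MAP_SHARED, so that stores reach the
second file. The names transform_file() and invert_byte() are
hypothetical and exist only for illustration.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical example transform: invert every bit of a byte. */
unsigned char invert_byte(unsigned char b)
{
    return (unsigned char)~b;
}

/*
 * Read src_path through one mapping and write the transformed bytes to
 * dst_path through a second, shared mapping.
 * Returns 0 on success, -1 on error (including an empty input file).
 */
int transform_file(const char *src_path, const char *dst_path,
                   unsigned char (*transform)(unsigned char))
{
    int rc = -1;
    int in_fd = open(src_path, O_RDONLY);
    if (in_fd < 0)
        return -1;
    int out_fd = open(dst_path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (out_fd < 0) {
        close(in_fd);
        return -1;
    }

    struct stat st;
    /* The output file must already have its final size before mapping. */
    if (fstat(in_fd, &st) == 0 && st.st_size > 0 &&
        ftruncate(out_fd, st.st_size) == 0) {
        size_t len = (size_t)st.st_size;
        const unsigned char *src =
            mmap(NULL, len, PROT_READ, MAP_PRIVATE, in_fd, 0);
        unsigned char *dst =
            mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, out_fd, 0);

        if (src != MAP_FAILED && dst != MAP_FAILED) {
            for (size_t i = 0; i < len; i++)
                dst[i] = transform(src[i]);
            /* Flush the shared mapping to the output file. */
            rc = msync(dst, len, MS_SYNC);
        }
        if (src != MAP_FAILED)
            munmap((void *)src, len);
        if (dst != MAP_FAILED)
            munmap(dst, len);
    }
    close(in_fd);
    close(out_fd);
    return rc;
}
```

Calling transform_file("in.dat", "out.dat", invert_byte) would leave
in.dat untouched and write the inverted bytes to out.dat; both file
names are placeholders.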


Richard Gooch wrote:
> Linus Torvalds writes:
> >
> > On Sat, 7 Sep 1996, Richard Gooch wrote:
> > >
> > > I do use mmap() sometimes, but I still have to swap the bytes (I
> > > just use mmap() and bcopy() instead of read() to read from the
> > > file). It would be so nice if the data was in host-natural form, but
> > > alas, no.
> > > I have one file which is 30 MBytes, and my disc rattles like crazy for
> > > a few minutes before all the data has been read and swap-copied into
> > > VM. If it wasn't for the unnecessary paging, this would take 15 to 20
> > > seconds with my machine with 64 MBytes of RAM.
> >
> > You'd still be better off with mmap + byte swap in place, than with read
> > + byte swap. Rationale:
> >
> > With "read(large-area)" + "massage(large-area)", you end up swapping things
> > out _twice_. When you do the read, the data in the beginning of the read
> > buffer gets swapped out when the kernel has to copy the data to the end of
> > the read buffer, and then when you do the byte-order stuff it has to be
> > swapped in again (and the end of the read buffer gets swapped out).
> >
> > If you do a mmap(MAP_PRIVATE, PROT_READ|PROT_WRITE), the kernel won't actually
> > read the data until you need it, so it will be read just once, and then
> > directly massaged without hitting swap in between. The kernel will start
> > swapping out the (massaged) pages by the time you've reached the end, but
> > you'd still have "won" one swap-out.
>
> This is possible under i386_Linux, but my code has to work on other
> platforms too, where I don't just byte-swap, but also resize (e.g. if
> sizeof (float) == 8). It could make things very messy.
>
> > Also, if you _know_ that you'll then use the data in some specific sequence,
> > you can try to minimize this swapping stage by doing the byte swap in
> > reverse: that way when you have byte-swapped all the data you're likely to
> > have the start of the data buffer in memory (because that's the part you
> > touched the latest). NOTE: this only makes sense if you know that the swap
> > is a problem, because generally it's slower going backwards than forward if
> > there are no swap effects.
>
> It's possible to do this (a fair bit of stuffing around, though),
> but this scheme might reduce performance with other operating systems
> where the MM is different.
>
> > You can do the same thing with read (read in small chunks and do the data
> > massage in small chunks), but it's generally easier with mmap. And it's a lot
> > more likely that you'll see a mmap cache hint in the future than a buffer
> > cache hint..
>
> In fact, I already do it in chunks (4 MBytes at the moment): read a
> bit, massage a bit, and then on to the next chunk. This improved things
> considerably, but there is still room to go.
>
> Well, since my code "knows" that Linux supports mmap(), I could cope
> with an mmap-only hint, since my "reads" are really bcopy() from the
> mmaped region when the file is mmaped.
>
> I do wonder: how much harder is it to have such a hint for the buffer
> cache?
> What is the hope of getting a mmap hint in the near future?
>
> Regards,
> Richard....
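
The mmap + byte swap in place approach quoted above can be sketched
roughly as follows. This is only an illustration under stated
assumptions: a POSIX system, a file made of 32-bit words, and the
hypothetical names swap32() and map_and_swap(). The backwards flag
corresponds to the suggestion of swapping in reverse so that the start
of the buffer is the most recently touched part.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Reverse the byte order of one 32-bit word. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Map path privately and byte-swap its 32-bit words in place.
 * MAP_PRIVATE makes the stores copy-on-write, so the file on disk is
 * not modified. If backwards is non-zero, words are processed from the
 * end of the mapping towards the start.
 * Returns the mapping (release it with munmap(ptr, *len_out)),
 * or NULL on error or for an empty file.
 */
uint32_t *map_and_swap(const char *path, size_t *len_out, int backwards)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    size_t len = (size_t)st.st_size;

    uint32_t *data = mmap(NULL, len, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (data == MAP_FAILED)
        return NULL;

    /* Any trailing bytes that do not fill a whole word are left as-is. */
    size_t n = len / sizeof(uint32_t);
    if (backwards) {
        for (size_t i = n; i-- > 0; )
            data[i] = swap32(data[i]);
    } else {
        for (size_t i = 0; i < n; i++)
            data[i] = swap32(data[i]);
    }

    *len_out = len;
    return data;
}
```

Each page is faulted in from the file the first time the loop touches
it, so no separate read() into a second buffer is needed. Whether the
backwards order helps depends on the memory pressure and on the
operating system, as the discussion above notes.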

*    Don Fisher 
*    Science Applications International Corporation		(520)570-7699       
*    5151 E. Broadway,  Suite
*    Tucson, AZ