Re: mmap/write vs read/write supprise

Adam Mckee (mckee@SEDSystems.ca)
Fri, 11 Jul 1997 10:21:09 -0600 (CST)


Hi there.

I believe he was referring to statistical compression methods, where you
are figuring probabilities of each symbol and using this information to
decide how costly it should be to encode each symbol. Given non-uniform
probabilities, you should get compression of some kind. Huffman encoding
is the simplest example of this idea. gzip and compress use LZ-77
derived techniques, which are not (at their heart) statistical at all.
(LZ-77 type compressors often use a Huffman encoder or similar though).

If access patterns are non-random (which they are), you should be able
use a statistical technique to make a reasonable guess as to which
pages are most likely to be accessed next. An order-n (n > 0) model
would be most effective, but the memory needed grows exponentially with n.

-- Adam

The same kind of statistical technique can be used to dynamically adapt
to changing access patterns.

On Fri, 11 Jul 1997, Rogier Wolff wrote:

> Eric.Schenk@dna.lth.se wrote:
> >
> >
> > mcculley@iag.net <mcculley@iag.net> writes:
> > >If we had madvise, the responsibility for determining the access
> > >pattern could be put on the application. We could still use a better
> > >default access pattern for mmap'd pages, though. Is the VM code
> > >sophisticated enough yet to take advantage of an madvise syscall?
> >
> > Another approach to this would be to use a predictive algorithm
> > to determine which pages to prefetch. There was a neat article
> > on doing this in JACM recently. The idea is to use the core of
> > a good compression algorithm as the predictor. Apparently
> > this can be proved to be within a constant of the optimal
> > prefect strategy (i.e. the one that knows all future
> > fetches in advance). I don't know off hand what the real
> > performance of this would be like for kernel VM, but
> > someone ambitious with a lot of time on their hands
> > may want to play :)
>
> Compression algorithm (compress, gzip) work by assigning "short
> bitstreams" to "strings that we've recently seen".
>
> I don't see how this can be adapted to predicting access patterns.
>
> ROger.
>
>
> >
> > --
> > Eric Schenk www: http://www.dna.lth.se/~erics
> > Dept. of Comp. Sci., Lund University email: Eric.Schenk@dna.lth.se
> > Box 118, S-221 00 LUND, Sweden fax: +46-46 13 10 21 ph: +46-46 222 96 38
> >
> >
>
>