Re: Ideas for reducing memory copying and zeroing times

Robert L Krawitz (rlk@tiac.net)
Tue, 16 Apr 1996 09:39:13 -0400


Date: Tue, 16 Apr 96 01:35 BST
From: Jamie Lokier <jamie@rebellion.co.uk>

Well, copy-on-write of zero-mapped pages obviously happens a great deal.
So it's worth writing the fastest page-zeroing code that anyone can
think up. (I haven't timed it, but it seems to me that even the
`memset' in <asm-i386/strings-i486.h> might go faster on a Pentium if it
is unrolled a little and uses paired writes, simply because many of the
zeroes may well get written to the internal cache during the loop, and
get written to secondary cache, etc., later while other code is happily
doing other things in the internal cache).
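
A rough sketch of the kind of unrolling meant here, in portable C rather
than the inline assembly the kernel header uses. The name
zero_page_unrolled and the unroll factor of 8 are assumptions made for
illustration, and a compiler is free to rearrange the stores, so treat it
as a starting point for timing experiments only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096  /* assumption: 4 KB pages, as on the i386 */

/* Hypothetical helper: zero one page using 32-bit stores, with the loop
 * unrolled 8 times so each iteration clears one 32-byte block. */
static void zero_page_unrolled(uint32_t *page)
{
    uint32_t *p = page;
    uint32_t *end = page + PAGE_SIZE / sizeof(uint32_t);

    assert(page != NULL);
    while (p < end) {
        /* Adjacent stores, written in pairs. */
        p[0] = 0; p[1] = 0;
        p[2] = 0; p[3] = 0;
        p[4] = 0; p[5] = 0;
        p[6] = 0; p[7] = 0;
        p += 8;
    }
}
```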

The best way that I've found (which is twice as fast as anything else)
is to use the FPU to zero out pages. I get 60 MB/sec throughput that
way (about 15000 4K pages/sec) vs. 30 MB/sec any other way. The problem
with memory writes from the Pentium is that they only go 32 bits at a
time unless you're flushing a cached line. Since the Pentium cache is
not write-allocate, a write to an uncached location goes straight
through to memory.

Actually, I don't think a write-allocate cache is very effective for
memory copy and block zero. The reason is that write-allocate pulls
the old contents of the line in from main memory first, which is a
waste when copying or clearing memory, since the line is about to be
overwritten anyway. Better to have a Pentium-type cache and use a
64-bit store instruction (fistpq or fstd or the like).
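
To make the 64-bit store idea concrete, here is a minimal C sketch that
clears a page by storing floating-point zeroes, 8 bytes at a time. It
assumes an 8-byte IEEE 754 double (where +0.0 is all zero bits), and the
name zero_page_fp is made up for the example. Whether a given compiler
actually turns the loop into FPU stores is not guaranteed; the real thing
would most likely be written in assembly.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096  /* assumption: 4 KB pages, as on the i386 */

/* Hypothetical helper: zero one page with 64-bit floating-point stores.
 * Relies on +0.0 being represented as all zero bits (IEEE 754). */
static void zero_page_fp(double *page)
{
    size_t i;

    assert(page != NULL);
    assert(sizeof(double) == 8);  /* assumption stated above */
    for (i = 0; i < PAGE_SIZE / sizeof(double); i++)
        page[i] = 0.0;            /* one 8-byte store per iteration */
}
```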

Apart from that though, how about having the idle task (or a
low-priority kernel thread) fill a pool of pre-zeroed pages? When a
process needs a zero page, if there are any in the pool it can have one
immediately by remapping a page -- no copy on write required. Of
course, under constant load the pool would be empty so you still need
the fast zeroing code. At least at the start of a burst of activity
there would be a much reduced zeroing time (such as when a program
starts up and fills its data area). And with SMP even if all but one
CPU is loaded, there might be a spare one with enough idle time to keep
the pool going for the others.
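
The pool idea might look something like the following user-space sketch.
None of these names (zero_pool, pool_add_from_idle, get_zeroed_page,
POOL_MAX) come from the kernel; they are invented for illustration, and
there is no locking, which an SMP kernel would certainly need.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096  /* assumption: 4 KB pages, as on the i386 */
#define POOL_MAX  16    /* assumption: arbitrary pool capacity */

static void *zero_pool[POOL_MAX];  /* stack of already-zeroed pages */
static size_t zero_pool_count;

/* Called from the idle loop with a free page. Zeroes it and adds it to
 * the pool. Returns 1 on success, 0 if the pool is already full. */
static int pool_add_from_idle(void *free_page)
{
    assert(free_page != NULL);
    if (zero_pool_count == POOL_MAX)
        return 0;
    memset(free_page, 0, PAGE_SIZE);
    zero_pool[zero_pool_count++] = free_page;
    return 1;
}

/* Called when a process needs a zero page. Hands out a pooled page if
 * there is one; otherwise zeroes the caller's free page on the spot.
 * The caller can tell which happened by comparing the result with
 * free_page. */
static void *get_zeroed_page(void *free_page)
{
    assert(free_page != NULL);
    if (zero_pool_count > 0)
        return zero_pool[--zero_pool_count];  /* fast path: no zeroing */
    memset(free_page, 0, PAGE_SIZE);          /* slow path: pool empty */
    return free_page;
}
```

Under constant load only the slow path runs, so this does not replace
the fast zeroing code; it just hides the cost at the start of a burst.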

Interesting idea, that.

-- 
Robert Krawitz <rlk@tiac.net>           http://www.tiac.net/users/rlk/

Member of the League for Programming Freedom -- mail lpf@uunet.uu.net
Tall Clubs International -- tci-request@aptinc.com or 1-800-521-2512