Re: [RFC] x86-64: Use SSE for copy_page and clear_page

From: Benjamin LaHaise
Date: Tue May 31 2005 - 09:00:34 EST


On Tue, May 31, 2005 at 11:23:58AM +0200, Andi Kleen wrote:
> fork is only a corner case. The main case is a process allocating
> memory using brk/mmap and then using it.

At least for kernel compiles, using non-temporal stores is a slight
win (a 2-5s improvement on 4m30s). Granted, there seems to be a
lot of variation in kernel compile times.

A bit more experimentation shows that non-temporal stores plus a
prefetch of the resulting data is still better than the existing
routines and only slightly slower than the pure non-temporal version.
That said, it seems to result in kernel compiles that are on the high
side of the variations I normally see (4m40s, 4m38s) compared to the
~4m30s for an unpatched kernel and ~4m25s-4m30s for the non-temporal
store version.

-ben
--
"Time is what keeps everything from happening all at once." -- John Wheeler
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/