Re: [RFC] [PATCH] Add memcpy32 function

From: Andreas Kleen
Date: Wed Dec 28 2005 - 13:31:32 EST


On Wed, 28.12.2005 at 19:11, Bryan O'Sullivan <bos@xxxxxxxxxxxxx> wrote:

> On Wed, 2005-12-28 at 18:50 +0100, Andreas Kleen wrote:
>
> > Ok thanks. And do you have numbers that show that the assembly
> > function with rep ; movsl actually improves performance over C?
>
> I'll see if I can ferret some numbers out. If not, I'll generate them,
> but it will take me a day or so. I'm pretty sure it makes a difference
> of tens to hundreds of nanoseconds, which in our case is very
> significant (we measure some of our user-level performance in
> increments of 10ns, very repeatably).

If you test the C version, build it with this in the Makefile:

CFLAGS_memcpy32.o := -funroll-loops
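
For reference, a plain C word-copy loop of the kind that flag would apply
to might look like the sketch below. The name memcpy32_c and its signature
are assumptions made for illustration; the actual function in the patch is
not shown in this message. -funroll-loops asks gcc to unroll loops like
this one.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical C variant: copy `count` 32-bit words from src to dst,
 * one word per iteration. Name and signature are illustrative only.
 */
static void memcpy32_c(uint32_t *dst, const uint32_t *src, size_t count)
{
	while (count--)
		*dst++ = *src++;
}
```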

BTW, on x86-64 with CONFIG_UNORDERED_IO, writel can actually expand to a
non-temporal write, which might break it.

-Andi

