Re: [PATCH RFC] [X86] performance improvement for memcpy_64.S by avoid memory miss predication.

From: Ingo Molnar
Date: Tue Oct 20 2009 - 02:33:41 EST



* Ling Ma <linguranus@xxxxxxxxx> wrote:

> Hi Ingo
> Thanks for your suggestion. I used 'perf stat --repeat 10
> /develop/trunk/memcpy/static' to measure before/after patch.
>
> The test program I wrote:
> for (i = 64; i < 4096 * 4; i++)
>         do_memcpy(src, dst, i);
>
> when src offset is 0xbe000, dst is 0xad008, the measured result:
>
> Before patch:
> Performance counter stats for '/develop/trunk/memcpy/static' (10 runs):
> <not counted> task-clock-msecs
> <not counted> context-switches
> <not counted> CPU-migrations
> <not counted> page-faults
> <not counted> cycles
> <not counted> instructions
> <not counted> cache-references
> <not counted> cache-misses
> 37.408743997 seconds time elapsed ( +- 0.222% )

Hm, what kind of CPU did you run this on? Why are those events not
being counted? Is it perhaps some older, Pentium-4-like CPU?
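
For anyone who wants to rerun the quoted measurement on a different CPU, the
loop above could be wrapped roughly as follows. This is only a sketch under
assumptions: the thread does not show do_memcpy() or how the buffers were set
up, so the wrapper body, the helper names page_align() and copy_sweep(), the
fill pattern and the allocation scheme are all hypothetical. Plain memcpy()
stands in for whatever routine was actually under test, and only the low 12
bits of the reported addresses (0x000 for src, 0x008 for dst) are reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MIN_LEN 64
#define MAX_LEN (4096 * 4)
#define PAGE    4096

/* Hypothetical stand-in: the original do_memcpy() is not shown in the thread.
 * The argument order follows the quoted call, do_memcpy(src, dst, len). */
static void do_memcpy(const char *src, char *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* Round a pointer up to the next page boundary. */
static char *page_align(char *p)
{
	uintptr_t a = ((uintptr_t)p + PAGE - 1) & ~(uintptr_t)(PAGE - 1);
	return (char *)a;
}

/* Run the quoted sweep once.
 * Returns 0 on success, -1 on allocation failure, 1 on a data mismatch. */
static int copy_sweep(void)
{
	char *src_raw = malloc(MAX_LEN + PAGE);
	char *dst_raw = malloc(MAX_LEN + PAGE + 8);
	char *src, *dst;
	size_t i;
	int ret;

	if (!src_raw || !dst_raw) {
		free(src_raw);
		free(dst_raw);
		return -1;
	}

	/* Assumption: only the in-page offsets of the report are mimicked. */
	src = page_align(src_raw);      /* low 12 bits: 0x000 */
	dst = page_align(dst_raw) + 8;  /* low 12 bits: 0x008 */

	memset(src, 0x5a, MAX_LEN);

	for (i = MIN_LEN; i < MAX_LEN; i++)
		do_memcpy(src, dst, i);

	/* The last iteration copied MAX_LEN - 1 bytes; check them. */
	ret = memcmp(src, dst, MAX_LEN - 1) != 0;

	free(src_raw);
	free(dst_raw);
	return ret;
}
```

Calling copy_sweep() from main() and running the resulting binary under
'perf stat --repeat 10', as in the quoted command, should show whether the
hardware events get counted on a given machine.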

Ingo