Re: [PATCH 02/14] perf/bench: Default to all routines in 'perf bench mem'

From: Linus Torvalds
Date: Mon Oct 19 2015 - 11:21:45 EST


On Mon, Oct 19, 2015 at 1:04 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> triton:~> perf bench mem all
> # Running mem/memcpy benchmark...
> Routine default (Default memcpy() provided by glibc)
> 4.957170 GB/Sec (with prefault)
> Routine x86-64-unrolled (unrolled memcpy() in arch/x86/lib/memcpy_64.S)
> 4.379204 GB/Sec (with prefault)
> Routine x86-64-movsq (movsq-based memcpy() in arch/x86/lib/memcpy_64.S)
> 4.264465 GB/Sec (with prefault)
> Routine x86-64-movsb (movsb-based memcpy() in arch/x86/lib/memcpy_64.S)
> 6.554111 GB/Sec (with prefault)

Is this Skylake? And why are the numbers so low? Even on my laptop
(Haswell), I get ~21GB/s (when setting cpufreq to performance).

It's interesting that 'movsb' for you is so much better. It's been
promising before, and it *should* be able to do better than manual
copying, but it's not been that noticeable on the machines I've
tested. But I haven't used Skylake or Broadwell yet.

cpufreq might be making a difference too. Maybe it's just ramping up
the CPU? Or is that really repeatable?

Linus