Re: [PATCH] better align percpu counter (Was Re: [tip:sched/core] sched: cpuacct: Use bigger percpu counter batch values for stats counters)

From: Ingo Molnar
Date: Fri Aug 21 2009 - 07:29:57 EST



* KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:

> On Thu, 20 Aug 2009 12:04:03 +0200
> Ingo Molnar <mingo@xxxxxxx> wrote:
> > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> > > with your program
> > > before patch.
> > > cpuacct off : 414000-416000 ctsw per sec.
> > > cpuacct on : 401000-404000 ctsw per sec.
> > >
> > > after patch
> > > cpuacct on : 412000-413000 ctsw per sec.
> > >
> > > Maybe I should check cache-miss late ;)
> >
> > Btw., in latest upstream you can do that via:
> >
> > cd tools/perf/
> > make -j install
> >
> > perf stat --repeat 5 -- taskset -c 1 ./context_switch
> >
>
> tried. (on 8cpu/2socket host). It seems cache-miss decreases. But
> IPC ..?

All the numbers have gone down - about the same amount of cycles but
fewer instructions executed, and fewer cache-misses. That's good.

The Instructions Per Cycle metric got worse because the instruction
count dropped while cycles stayed about constant. Another thing: you
have triggered counter over-commit (the 'scaled from' messages) -
this means that more counters are in use than the hardware has space
for, so we round-robin schedule them and the reported counts are
scaled estimates.
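
To make the two effects above concrete, here is a rough shell sketch
with made-up numbers. IPC is instructions divided by cycles, so fewer
instructions over the same cycles gives a lower IPC. For an
over-committed counter, the usual estimate is the raw count multiplied
by time-enabled over time-running. The function names and the sample
values are only illustrative and are not taken from the runs in this
thread.

```shell
#!/bin/sh
# Illustrative helpers only; names and numbers are assumptions.

# ipc INSTRUCTIONS CYCLES
# Prints instructions per cycle with two decimals.
ipc() {
    awk -v instr="$1" -v cycles="$2" 'BEGIN {
        if (cycles == 0) { print "n/a"; exit }
        printf "%.2f\n", instr / cycles
    }'
}

# scale_count RAW ENABLED_NS RUNNING_NS
# Estimates the full count for a counter that was only scheduled on
# the PMU for RUNNING_NS out of ENABLED_NS nanoseconds.
scale_count() {
    awk -v raw="$1" -v en="$2" -v run="$3" 'BEGIN {
        if (run == 0) { print "not counted"; exit }
        printf "%.0f\n", raw * en / run
    }'
}

ipc 1000 1000                                 # prints 1.00
ipc 800 1000                                  # same cycles, fewer instructions: prints 0.80
scale_count 1000000 5000000000 2500000000     # counter ran half the time: prints 2000000
```

A counter that was on the hardware 100% of the time has equal enabled
and running times, so its scaled value is just the raw count.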

If you want to get to the bottom of that, try something like this to
get the most precise result:

perf stat --repeat 5 -a -e \
cycles,instructions,L1-dcache-load-misses,L1-dcache-store-misses \
-- ./ctxt_sw.sh

( this is almost the same as the command line you used, but without
the 'cache-misses' counter. Your CPU should be able to
simultaneously activate all these counters and they should count
100% of the events. )
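
One way to check whether a run was over-committed is to save the perf
stat output and look for the 'scaled from' messages. The sketch below
assumes perf stat writes its summary to stderr and that scaled events
are marked with the text 'scaled from'; newer perf versions may print
the percentage differently, so treat the pattern as an assumption. The
helper name and the output file name are hypothetical.

```shell
#!/bin/sh
# count_scaled_events FILE
# Prints how many lines in FILE carry a 'scaled from' annotation,
# i.e. how many events were multiplexed rather than counted 100%.
count_scaled_events() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when that number is 0, so '|| true' keeps this safe under 'set -e'.
    grep -c 'scaled from' "$1" || true
}

# Example usage (not run here):
#   perf stat --repeat 5 -a -e cycles,instructions -- ./ctxt_sw.sh 2> perf.out
#   count_scaled_events perf.out
```

A result of 0 suggests all requested counters fit on the hardware at
the same time.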

Ingo