Re: [PATCH RFC] x86: avoid atomic operation in test_and_set_bit_lock if possible

From: Ingo Molnar
Date: Thu Mar 24 2011 - 16:00:47 EST



* Jack Steiner <steiner@xxxxxxx> wrote:

> >
> > This cacheline bouncing was actually observed and measured
> > on SGI UV systems, but I'm not certain we're permitted to publish
> > that data. I'm copying the two SGI guys who had reported that
> > issue (and the special case fix, which Nikanth simply generalized)
> > to us, for them to decide.
>
> We frequently run into the cacheline bouncing issues. I don't have
> the data handy that you refer to, but feel free to publish it.

One good way to see cache bounces is to run a misses/accesses ratio profile:

perf top -e cache-misses -e cache-references --count-filter 10

Note the two events: this runs a 'weighted' profile that shows the (LLC)
cache-misses of each function relative to the cache-references it makes;
in essence, a misses/references ratio.

The --count-filter option filters out rare entries, so that rarely sampled
functions that happen to produce a large ratio do not clutter the output.
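
To make the idea of a ratio plus a count filter concrete, here is a minimal
sketch of the arithmetic. It is not perf's implementation: the sample counts
are made up, the names (samples, weighted_profile, count_filter) are
hypothetical, and filtering on the total sample count is an assumption about
how a filter of this kind could work.

```python
# Hypothetical per-function sample counts: (cache_misses, cache_references).
# These numbers are invented for illustration; they are not perf output.
samples = {
    "do_timer": (130, 100),
    "current_kernel_time": (60, 100),
    "rare_helper": (3, 1),  # large ratio, but far too few samples to trust
}


def weighted_profile(samples, count_filter):
    """Return (function, misses/references) pairs, highest ratio first."""
    rows = []
    for func, (misses, refs) in samples.items():
        # Assumed filter rule: drop entries with fewer total samples than
        # count_filter, and anything with no references to divide by.
        if misses + refs < count_filter or refs == 0:
            continue
        rows.append((func, misses / refs))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


profile = weighted_profile(samples, count_filter=10)
for func, ratio in profile:
    print(f"{ratio:6.1f}  {func}")
```

Without the filter, rare_helper would sort to the top with a ratio of 3.0
from only four samples, which is exactly the kind of clutter the option is
meant to suppress.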

For example during a scheduler-intense workload you'll get something like:

PerfTop: 32652 irqs/sec kernel:71.2% exact: 0.0% [cache-misses/cache-references], (all, 16 CPUs)
-------------------------------------------------------------------------------------------------------

weight  samples   pcnt function                     DSO
______  _______  _____ ____________________________ ____________________

   1.9      606   3.2% irqtime_account_process_tick [kernel.kallsyms]
   1.6      854   4.4% update_vsyscall              [kernel.kallsyms]
   1.5      446   2.3% atomic_cmpxchg               [kernel.kallsyms]
   1.5      758   3.9% tick_do_update_jiffies64     [kernel.kallsyms]
   1.4      149   0.8% arch_local_irq_save          [kernel.kallsyms]
   1.3     1524   7.9% do_timer                     [kernel.kallsyms]
   1.2      215   1.1% clear_page_c                 [kernel.kallsyms]
   1.2      128   0.7% dso__find_symbol             /home/mingo/bin/perf
   1.0      281   1.5% calc_global_load             [kernel.kallsyms]
   0.9      560   2.9% profile_tick                 [kernel.kallsyms]
   0.7      246   1.3% _raw_spin_lock               [kernel.kallsyms]
   0.6     2523  13.1% current_kernel_time          [kernel.kallsyms]

This output is very different from a profile measured with plain cycles (or
even cache-misses) alone, and is very good at identifying 'bouncy' cache-miss
sources.

Another good 'view' is L1 store misses against L1 stores:

PerfTop: 29530 irqs/sec kernel:99.5% exact: 0.0% [L1-dcache-store-misses/L1-dcache-stores], (all, 16 CPUs)
-------------------------------------------------------------------------------------------------------

weight  samples   pcnt function                 DSO
______  _______  _____ ________________________ __________________________________

1271.3     3814   3.2% apic_timer_interrupt     [kernel.kallsyms]
 844.0      844   0.7% read_tsc                 [kernel.kallsyms]
 615.0      615   0.5% timekeeping_get_ns       [kernel.kallsyms]
 520.0      520   0.4% intel_pmu_disable_all    [kernel.kallsyms]
 390.0      390   0.3% tick_dev_program_event   [kernel.kallsyms]
 308.3     1850   1.5% update_vsyscall          [kernel.kallsyms]
 251.7      755   0.6% hrtimer_interrupt        [kernel.kallsyms]
 246.0      246   0.2% find_busiest_group       [kernel.kallsyms]
 222.7      668   0.6% native_apic_mem_write    [kernel.kallsyms]
 149.0      298   0.2% apic_write               [kernel.kallsyms]
 137.0      274   0.2% irq_enter                [kernel.kallsyms]
 105.0      105   0.1% arch_local_irq_save      [kernel.kallsyms]
 101.0      101   0.1% tick_program_event       [kernel.kallsyms]
  95.5      191   0.2% ack_APIC_irq             [kernel.kallsyms]

You might want to experiment with the events to see which ones express
things best on the system in question.

Thanks,

Ingo