William Lee Irwin III wrote:
Split counters easily resolve the issues with both these approaches
(and apparently your co-workers are suggesting it too, and have
performance results backing it).
On Sat, Nov 20, 2004 at 01:18:22PM +1100, Nick Piggin wrote:
Split counters still require atomic operations though. This is what
Christoph's latest effort is directed at removing. And they'll still
bounce cachelines around. (I assume we've reached the conclusion
that per-cpu split counters per-mm won't fly?).
Split != per-cpu, though it may be. Counterexamples are
as simple as atomic_inc(&mm->rss[smp_processor_id()>>RSS_IDX_SHIFT]);
Furthermore, see Robin Holt's results regarding the performance of the
atomic operations and their relation to cacheline sharing.
And frankly, the argument that the space overhead of per-cpu counters
is problematic is not compelling. Even at 1024 cpus it's smaller than
an ia64 pagetable page, of which there are numerous instances attached
to each mm.