Re: [patch v2 4/5] percpu_counter: use atomic64 for counter in SMP

From: Andrew Morton
Date: Wed May 11 2011 - 11:42:15 EST


On Wed, 11 May 2011 16:10:16 +0800 Shaohua Li <shaohua.li@xxxxxxxxx> wrote:

> With lglock protecting the percpu data, the percpu_counter global lock is only
> used to protect updates to fbc->count. Use an atomic64 for that counter instead,
> because it is cheaper than a spinlock. This doesn't slow the fast path
> (percpu_counter_read): atomic64_read is a plain read of fbc->count on 64-bit
> systems, and spin_lock-read-spin_unlock on 32-bit systems.
>
> Note: originally percpu_counter_read on 32-bit systems did not hold the
> spinlock, which was buggy and could return a wildly wrong value from a torn
> read. This patch fixes that issue.
>
> This can also improve workloads where percpu_counter->lock is heavily
> contended. For example, vm_committed_as sometimes causes such contention.
> We should tune the batch count, but if we can make percpu_counter better,
> why not? On a 24-CPU system, 24 processes each run:
>
>	while (1) {
>		mmap(128M);
>		munmap(128M);
>	}
>
> and we measure how many loop iterations each process completes:
>
>	orig:    1226976
>	patched: 6727264
>
> The atomic method is 5x~6x faster.
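
For reference (not part of the patch text above), a minimal sketch of the scheme
being described, using hypothetical names (my_percpu_counter, my_counter_add,
my_counter_read): per-cpu deltas are folded into an atomic64_t once they exceed
the batch, so the update path takes no spinlock and the read path is a single
atomic64_read():

	struct my_percpu_counter {
		atomic64_t count;		/* global count, updated atomically */
		s32 __percpu *counters;		/* per-cpu deltas */
	};

	static void my_counter_add(struct my_percpu_counter *fbc, s64 amount, s32 batch)
	{
		s64 count;

		preempt_disable();
		count = __this_cpu_read(*fbc->counters) + amount;
		if (count >= batch || count <= -batch) {
			atomic64_add(count, &fbc->count);	/* fold delta into global count */
			__this_cpu_write(*fbc->counters, 0);
		} else {
			__this_cpu_write(*fbc->counters, count);
		}
		preempt_enable();
	}

	/* Fast path: a plain load on 64-bit, lock-read-unlock on 32-bit. */
	static s64 my_counter_read(struct my_percpu_counter *fbc)
	{
		return atomic64_read(&fbc->count);
	}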

How much slower did percpu_counter_sum() become?
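
For context, a rough sketch (again with hypothetical names, modeled on what
mainline percpu_counter_sum() does) of the work an accurate sum has to perform;
under the scheme above this walk would presumably run under the lglock rather
than the old fbc->lock, which is what the question is getting at:

	static s64 my_counter_sum(struct my_percpu_counter *fbc)
	{
		s64 ret = atomic64_read(&fbc->count);
		int cpu;

		/* Add the per-cpu deltas that have not been folded in yet. */
		for_each_online_cpu(cpu)
			ret += *per_cpu_ptr(fbc->counters, cpu);

		return ret;
	}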