mmu_shrink() is effectively single-threaded since the global
kvm_lock is held over the entire function.
I believe its only use here is to synchronize access to the
vm_list. Instead of holding kvm_lock to keep the list
consistent, take a kvm_get_kvm() reference. The held
reference keeps the kvm object on the vm_list while we
shrink it.
Since we no longer need the lock to keep the object on the
list, we can drop it, and reacquire it only when we need to
pick another object off the list.
This leads to a larger number of atomic ops, but reduces
lock hold times: the typical latency vs. throughput tradeoff.
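The pattern above (pin an object with a reference, drop the list lock for the expensive work, reacquire it only to step to the next object) can be sketched in userspace. This is a minimal illustration, not the KVM code: struct vm, vm_get()/vm_put(), and shrink_all() are hypothetical stand-ins for struct kvm, kvm_get_kvm()/kvm_put_kvm(), and mmu_shrink(), and a pthread mutex stands in for kvm_lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvm on vm_list. */
struct vm {
	int refcount;
	int pages_shrunk;
	struct vm *next;
};

static pthread_mutex_t vm_list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct vm *vm_list;

/* Stand-ins for kvm_get_kvm()/kvm_put_kvm(); callers hold vm_list_lock. */
static void vm_get(struct vm *vm) { vm->refcount++; }
static void vm_put(struct vm *vm) { vm->refcount--; }

/*
 * Walk vm_list, doing the expensive per-VM work with the list
 * lock dropped.  The held reference keeps the VM on the list
 * while we work on it; we reacquire the lock only long enough
 * to step to the next object.
 */
static int shrink_all(void)
{
	struct vm *vm;
	int shrunk = 0;

	pthread_mutex_lock(&vm_list_lock);
	vm = vm_list;
	while (vm) {
		struct vm *next;

		vm_get(vm);			/* pin vm on the list */
		pthread_mutex_unlock(&vm_list_lock);

		vm->pages_shrunk++;		/* "expensive" work, lock dropped */
		shrunk++;

		pthread_mutex_lock(&vm_list_lock);
		next = vm->next;		/* safe: vm is still on the list */
		vm_put(vm);
		vm = next;
	}
	pthread_mutex_unlock(&vm_list_lock);
	return shrunk;
}
```

The extra vm_get()/vm_put() pair per object is where the "larger number of atomic ops" comes from; in exchange, the lock is held only for the brief list-step, not for the whole shrink.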
diff -puN kernel/profile.c~optimize_shrinker-3 kernel/profile.c
--- linux-2.6.git/kernel/profile.c~optimize_shrinker-3 2010-06-11 09:09:43.000000000 -0700
+++ linux-2.6.git-dave/kernel/profile.c 2010-06-11 09:12:24.000000000 -0700
@@ -314,6 +314,8 @@ void profile_hits(int type, void *__pc,
 	if (prof_on != type || !prof_buffer)
 		return;
 	pc = min((pc - (unsigned long)_stext) >> prof_shift, prof_len - 1);
+	if ((pc == prof_len - 1) && printk_ratelimit())
+		printk("profile_hits(%d, %p, %d)\n", type, __pc, nr_hits);
 	i = primary = (pc & (NR_PROFILE_GRP - 1)) << PROFILE_GRPSHIFT;
 	secondary = (~(pc << 1) & (NR_PROFILE_GRP - 1)) << PROFILE_GRPSHIFT;
 	cpu = get_cpu();