Re: AIM7 40% regression with 2.6.26-rc1

From: Matthew Wilcox
Date: Wed May 07 2008 - 12:23:42 EST


On Wed, May 07, 2008 at 08:31:05AM -0700, Andrew Morton wrote:
> On Wed, 07 May 2008 16:57:52 +0200 Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
>
> > Or figure out what made the semaphore consolidation slower? As Ingo
> > pointed out earlier 40% is unlikely to be a fast path problem, but some
> > algorithmic problem. Surely that is fixable (even for .26)?
>
> Absolutely. Yanmin is apparently showing that each call to __down()
> results in 1,451 calls to schedule(). wtf?

I can't figure it out either. Unless schedule() is broken somehow ...
but that should have shown up with semaphore-sleepers.c, shouldn't it?

One other difference between semaphore-sleepers and the new generic code
is that in effect, semaphore-sleepers does a little bit of spinning
before it sleeps. That is, if up() and down() are called more-or-less
simultaneously, the increment of sem->count will happen before __down
calls schedule(). How about something like this:

diff --git a/kernel/semaphore.c b/kernel/semaphore.c
index 5c2942e..ef83f5a 100644
--- a/kernel/semaphore.c
+++ b/kernel/semaphore.c
@@ -211,6 +211,7 @@ static inline int __sched __down_common(struct semaphore *sem, long state,
waiter.up = 0;

for (;;) {
+ int i;
if (state == TASK_INTERRUPTIBLE && signal_pending(task))
goto interrupted;
if (state == TASK_KILLABLE && fatal_signal_pending(task))
@@ -219,7 +220,15 @@ static inline int __sched __down_common(struct semaphore *sem, long state,
goto timed_out;
__set_task_state(task, state);
spin_unlock_irq(&sem->lock);
+
+ for (i = 0; i < 10; i++) {
+ if (waiter.up)
+ goto skip_schedule;
+ cpu_relax();
+ }
+
timeout = schedule_timeout(timeout);
+ skip_schedule:
spin_lock_irq(&sem->lock);
if (waiter.up)
return 0;

Maybe it'd be enough to test it once ... or maybe we should use
spin_is_locked() ... Ingo?

--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/