Re: AIM7 40% regression with 2.6.26-rc1

From: Linus Torvalds
Date: Wed May 07 2008 - 14:19:00 EST




On Wed, 7 May 2008, Matthew Wilcox wrote:
> >
> > So it doesn't look buggy, but it looks like it might cause longer
> > latencies than strictly necessary. And if somebody depends on
> > cond_resched() to avoid some bad livelock situation, that would obviously
> > not work (but that sounds like a fundamental bug anyway, I really hope
> > nobody has ever written their code that way).
>
> Funny you should mention it; locks.c uses cond_resched() assuming that
> it ignores the BKL. Not through needing to avoid livelock, but it does
> presume that other higher priority tasks contending for the lock will
> get a chance to take it. You'll notice the patch I posted yesterday
> drops the file_lock_lock around the call to cond_resched().

Well, this would only be noticeable with CONFIG_PREEMPT.

If you don't have preempt enabled, it looks like everything should work
ok: the kernel lock wouldn't increase the preempt count, and
_cond_resched() works fine.

If you're PREEMPT, then the kernel lock would increase the preempt count,
and _cond_resched() would refuse to re-schedule it, *but* with PREEMPT
you'd never see it *anyway*, because PREEMPT will disable cond_resched()
entirely (because preemption takes care of normal scheduling latencies
without it).
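A rough user-space sketch of the two cases above, for illustration only.
None of this is kernel code: struct toy_task, the toy_* functions and the
config_preempt flag are made-up names, and the model covers only the
preempt-count gate in _cond_resched(), not PREEMPT compiling cond_resched()
away.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one task's scheduling state (hypothetical, not kernel code). */
struct toy_task {
    int preempt_count;  /* nonzero means "do not schedule here" */
    int lock_depth;     /* BKL nesting depth, 0 means not held */
    bool need_resched;  /* something else wants the CPU */
};

/*
 * Model of the spinlock-style BKL as described above: only when
 * preemption is configured does taking the lock bump the preempt count.
 */
static void toy_lock_kernel(struct toy_task *t, bool config_preempt)
{
    if (t->lock_depth++ == 0 && config_preempt)
        t->preempt_count++;
}

static void toy_unlock_kernel(struct toy_task *t, bool config_preempt)
{
    assert(t->lock_depth > 0);
    if (--t->lock_depth == 0 && config_preempt)
        t->preempt_count--;
}

/*
 * Model of the _cond_resched() gate: refuse to reschedule while the
 * preempt count is nonzero. Returns true if a reschedule "happened".
 */
static bool toy_cond_resched(struct toy_task *t)
{
    if (t->preempt_count)
        return false;
    if (!t->need_resched)
        return false;
    t->need_resched = false;  /* pretend we scheduled away and came back */
    return true;
}
```

With config_preempt false, a task holding the toy BKL still reschedules in
toy_cond_resched(). With config_preempt true, the same call returns false
until toy_unlock_kernel() drops the count back to zero.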

And I'm also sure that this all worked fine at some point, and the
breakage is largely a result of the multiple different variations of BKL
preemption, coupled with some of them getting removed entirely, so the code
that used to handle it just bit-rotted over time. See commit 02b67cc3b,
for example.

.. Hmm ... Time passes. Linus looks at git history.

It does look like "cond_resched()" has not worked with the BKL since 2005,
and hasn't taken the BKL into account. Commit 5bbcfd9000:

    [PATCH] cond_resched(): fix bogus might_sleep() warning

    +       if (unlikely(preempt_count()))
    +               return;

which talks about the BKS, i.e. it only took the *semaphore* implementation
into account. Never the spinlock-with-preemption-count one.

Or am I blind?

Linus