Re: [patch] voluntary-preempt-2.6.9-rc3-mm1-S9

From: Peter Williams
Date: Tue Oct 05 2004 - 00:34:50 EST


Rui Nuno Capela wrote:
Ingo wrote:

i've released the -S9 VP patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-S9



Me again, we bad humor :(

My SMP/HT box is (again) terribly in that uglyness of being quite
unfriendly to -mm1, -mm2, and indirectly to -S8 and -S9 labeled kernels.

It works quite well with vanilla 2.6.9-rc3, though.

But very, very bad with those -mm1 or -mm2 patches. To get it straight,
almost all the time it hangs, randomly, but not as completely as to a
dramatic cold-reboot. It stalls on the the most administrative tasks. Name
one, and it stalls! I can hardly feel lucky if it sometimes reaches the
login prompt, while boot/initing.

I know you remember this story. Yeah. This seems quite similar to some of
earlier problems, but (un/fortunately) it doesn't seem related to VP at
all. Just having -mm1 or -mm2 is enough to make this machine go astray.

However, as usual, this seems to be ix86 SMP/HT specific. On my laptop, I
get to run full 2.6.9-rc3-mm2-S9 UP very happily.

Sorry if I can't get any real or useful debug data for now. The bad
behavior I'm referring to, is terribly non-deterministic, so I couldn't
get a pattern yet.

I just wanted to let you know ;)

This may be my fault. I made changes in the SMH code as part of the ZAPHOD patch but was unable to test them on a hyperthreaded machine due to a lack thereof (I have one on order). I've attached a patch that reverts the changes that I made. Can you give them a try please and let me know if they fix your problem?

Thanks
Peter
--
Peter Williams pwil3058@xxxxxxxxxxxxxx

"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
--- linux-2.6.9-rc3-mm2/kernel/sched.c 2004-10-05 15:17:09.157119322 +1000
+++ linux-2.6.9-rc3-mm2-mod/kernel/sched.c 2004-10-05 15:27:05.355561830 +1000
@@ -3240,18 +3240,21 @@ need_resched:
put_task_in_sinbin(prev);

cpu = smp_processor_id();
- if (unlikely(needs_idle_balance(rq))) {
+ if (unlikely(!rq->nr_running)) {
go_idle:
idle_balance(cpu, rq);
- /* This code should get optimised away when CONFIG_SCHED_SMT
- * is not defined
- */
- if (dependent_idle(rq))
+ if (!rq->nr_running) {
+ next = rq->idle;
wake_sleeping_dependent(cpu, rq);
+ /*
+ * wake_sleeping_dependent() might have released
+ * the runqueue, so break out if we got new
+ * tasks meanwhile:
+ */
+ if (!rq->nr_running)
+ goto switch_tasks;
+ }
} else {
- /* This code should all get optimised away when CONFIG_SCHED_SMT
- * is not defined
- */
if (dependent_sleeper(cpu, rq)) {
schedstat_inc(rq, sched_goidle);
next = rq->idle;
@@ -3262,7 +3265,7 @@ go_idle:
* lock, hence go into the idle loop if the rq went
* empty meanwhile:
*/
- if (unlikely(recheck_needs_idle_balance(rq)))
+ if (unlikely(!rq->nr_running))
goto go_idle;
}