[PATCH] Workaround for rq->lock deadlock

From: Gregory Haskins
Date: Tue Aug 07 2007 - 14:08:38 EST


The following patch converts double_lock_balance() to a full DP algorithm to
work around a deadlock in the scheduler when running on an 8-way SMP system.
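
For anyone who wants to try the trylock/back-off loop outside the kernel,
here is a minimal user-space sketch using pthread mutexes. It is only an
illustration, not the kernel code: struct rq is a hypothetical stand-in,
double_lock_backoff(), worker() and run_backoff_demo() are made-up names,
and the sched_yield() call is an assumption added to be polite in user
space (the patch itself spins without it).

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

#define NR_RQS 2
#define ITERS  100000

/* Hypothetical user-space stand-in for the scheduler's struct rq. */
struct rq {
	pthread_mutex_t lock;
	long nr_running;
};

/* One array, so comparing element addresses is well defined in C. */
static struct rq rqs[NR_RQS];

/*
 * Caller holds this_rq->lock; on return both locks are held.
 * Returns 1 if this_rq->lock was dropped along the way, else 0.
 */
static int double_lock_backoff(struct rq *this_rq, struct rq *busiest)
{
	struct rq *rq_l = busiest < this_rq ? busiest : this_rq;
	struct rq *rq_h = busiest < this_rq ? this_rq : busiest;

	if (pthread_mutex_trylock(&busiest->lock) == 0)
		return 0;		/* uncontended fast path */

	pthread_mutex_unlock(&this_rq->lock);
	for (;;) {
		if (pthread_mutex_trylock(&rq_l->lock) == 0) {
			if (pthread_mutex_trylock(&rq_h->lock) == 0)
				return 1;
			pthread_mutex_unlock(&rq_l->lock);	/* back off */
		}
		sched_yield();	/* assumption: not in the patch, which spins */
	}
}

static void *worker(void *arg)
{
	struct rq *mine = arg;
	struct rq *other = (mine == &rqs[0]) ? &rqs[1] : &rqs[0];
	int i;

	for (i = 0; i < ITERS; i++) {
		pthread_mutex_lock(&mine->lock);
		double_lock_backoff(mine, other);
		mine->nr_running++;	/* both locks are held here */
		other->nr_running++;
		pthread_mutex_unlock(&other->lock);
		pthread_mutex_unlock(&mine->lock);
	}
	return NULL;
}

/* Runs two workers that lock each other's rq; returns the counter sum. */
static long run_backoff_demo(void)
{
	pthread_t tid[NR_RQS];
	int i, err;

	for (i = 0; i < NR_RQS; i++) {
		err = pthread_mutex_init(&rqs[i].lock, NULL);
		assert(err == 0);
		rqs[i].nr_running = 0;
	}
	for (i = 0; i < NR_RQS; i++) {
		err = pthread_create(&tid[i], NULL, worker, &rqs[i]);
		assert(err == 0);
	}
	for (i = 0; i < NR_RQS; i++)
		pthread_join(tid[i], NULL);
	(void)err;
	return rqs[0].nr_running + rqs[1].nr_running;
}
```

Built with -pthread and called from a main(), run_backoff_demo() should
return 4 * ITERS, since every increment happens with both locks held.
Because neither lock is ever waited on while the other is held, this form
cannot deadlock on its own, though it can spin for a while under
contention.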

I think the original algorithm in this function is technically correct, so
this patch is really just plastering over another lurking issue. However, it
does fix the observed deadlock on our systems here, so I thought I would at
least share the discovery.
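
The reason the original looks correct is lock ordering: whenever it has to
block on a second rq lock, it makes sure the lower-addressed lock is taken
first, so no cycle of waiters can form as long as every path follows the
same rule. A rough user-space sketch of that rule with pthread mutexes
follows. The struct rq here is a hypothetical stand-in, and
double_lock_ordered(), worker() and run_ordered_demo() are made-up names
for illustration.

```c
#include <assert.h>
#include <pthread.h>

#define NR_RQS 3
#define ITERS  100000

/* Hypothetical user-space stand-in for the scheduler's struct rq. */
struct rq {
	pthread_mutex_t lock;
	long nr_running;
};

/* One array, so comparing element addresses is well defined in C. */
static struct rq rqs[NR_RQS];

/*
 * Caller holds this_rq->lock; on return both locks are held.
 * Returns 1 if this_rq->lock had to be dropped to restore the
 * low-address-first order, else 0.
 */
static int double_lock_ordered(struct rq *this_rq, struct rq *busiest)
{
	if (pthread_mutex_trylock(&busiest->lock) == 0)
		return 0;		/* uncontended fast path */

	if (busiest < this_rq) {
		/* Wrong order to block in: drop ours, retake low then high. */
		pthread_mutex_unlock(&this_rq->lock);
		pthread_mutex_lock(&busiest->lock);
		pthread_mutex_lock(&this_rq->lock);
		return 1;
	}
	/* We already hold the lower lock, so blocking here is safe. */
	pthread_mutex_lock(&busiest->lock);
	return 0;
}

static void *worker(void *arg)
{
	struct rq *mine = arg;
	/* Each worker targets its right-hand neighbour, forming a ring. */
	struct rq *other = &rqs[(mine - rqs + 1) % NR_RQS];
	int i;

	for (i = 0; i < ITERS; i++) {
		pthread_mutex_lock(&mine->lock);
		double_lock_ordered(mine, other);
		mine->nr_running++;	/* both locks are held here */
		other->nr_running++;
		pthread_mutex_unlock(&other->lock);
		pthread_mutex_unlock(&mine->lock);
	}
	return NULL;
}

/* Runs the ring of workers to completion; returns the counter sum. */
static long run_ordered_demo(void)
{
	pthread_t tid[NR_RQS];
	long sum = 0;
	int i, err;

	for (i = 0; i < NR_RQS; i++) {
		err = pthread_mutex_init(&rqs[i].lock, NULL);
		assert(err == 0);
		rqs[i].nr_running = 0;
	}
	for (i = 0; i < NR_RQS; i++) {
		err = pthread_create(&tid[i], NULL, worker, &rqs[i]);
		assert(err == 0);
	}
	for (i = 0; i < NR_RQS; i++)
		pthread_join(tid[i], NULL);
	for (i = 0; i < NR_RQS; i++)
		sum += rqs[i].nr_running;
	(void)err;
	return sum;
}
```

The ring of three workers is the classic shape that deadlocks without an
ordering rule. With the rule in place, run_ordered_demo() should finish
and return 6 * ITERS. The guarantee only holds if every path that takes
two of these locks obeys the same order, which is why a path that bypasses
it would be a plausible culprit.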

The actual problem is probably related to a code path that takes
task_rq_locks without using the balancer code. It might also be a race
between an rq_lock and something else. TBD.

Signed-off-by: Gregory Haskins <ghaskins@xxxxxxxxxx>

---

kernel/sched.c | 19 ++++++++++++-------
1 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 6f2cf6a..e946e3f 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2507,14 +2507,19 @@ static int double_lock_balance(struct rq *this_rq, struct rq *busiest)
 		BUG_ON(1);
 	}
 	if (unlikely(!spin_trylock(&busiest->lock))) {
-		if (busiest < this_rq) {
-			spin_unlock(&this_rq->lock);
-			spin_lock(&busiest->lock);
-			spin_lock(&this_rq->lock);
+		struct rq *rq_l = busiest < this_rq ? busiest : this_rq;
+		struct rq *rq_h = busiest > this_rq ? busiest : this_rq;

-			return 1;
-		} else
-			spin_lock(&busiest->lock);
+		spin_unlock(&this_rq->lock);
+
+		while (1) {
+			if (spin_trylock(&rq_l->lock)) {
+				if (spin_trylock(&rq_h->lock))
+					return 1;
+				else
+					spin_unlock(&rq_l->lock);
+			}
+		}
 	}
 	return 0;
 }
