Re: 2.6.33-rc1 unusable due to scheduler issues, circular locking, WARNs and BUGs

From: Xiaotian Feng
Date: Tue Dec 22 2009 - 02:41:47 EST


On Tue, Dec 22, 2009 at 3:19 PM, Américo Wang <xiyou.wangcong@xxxxxxxxx> wrote:
> [Fix top-posting]
>
> On Tue, Dec 22, 2009 at 1:42 PM, Xiaotian Feng <xtfeng@xxxxxxxxx> wrote:
>>
>> On Tue, Dec 22, 2009 at 8:17 AM, Eric Paris <eparis@xxxxxxxxxx> wrote:
>>> Trying to build a kernel on a 48 core x86_64 box using make -j 64 and
>>> I'm exploding in the scheduler. I'm running (and building) kernel
>>> f7b84a6ba7eaeba4e1df8feddca1473a7db369a5. There are three distinct
>>> signatures of problems. Some boots I'll see all 3 of these failures,
>>> sometimes only 1 or 2 of them. That's the reason they are kinda split
>>> up in dmesg.
>>>
>>> 1) gcc/3141 is trying to acquire lock:
>>>  (&(&sem->wait_lock)->rlock){......}, at: [<ffffffff81223234>] __down_read_trylock+0x13/0x46
>>>
>>> but task is already holding lock:
>>>  (&rq->lock){-.-.-.}, at: [<ffffffff8103dd2d>] task_rq_lock+0x51/0x83
>>>
>>> 2) WARN() in kernel/sched_fair.c:1001 hrtick_start_fair()
>>>
>>> 3) NULL pointer dereference at 0000000000000168 in check_preempt_wakeup
>>>    kernel/sched_fair.c
>>>
>>> Full backtraces are in the attached dmesg.
>>>
>> Does a revert of cd29fe6f2637cc2ccbda5ac65f5332d6bf5fa3c6 fix this problem?
>
>
> I don't think so...
>
> I think the most suspicious commit here is ab19cb23. It kicked
> "local_irq_save()" out, which means if the task is selected to run on
> another cpu which doesn't disable irq, we will have a page fault, then we
> will try to hold mm->mmap_sem while we are holding rq->lock already.

The page fault comes from the kernel NULL pointer dereference. You should
connect the lockdep warning and the kernel BUG together.
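
To make that connection concrete, here is a rough userspace sketch of the
chain as I read it: the scheduler path holds rq->lock, dereferences a NULL
pointer, and the resulting fault path tries the mmap_sem trylock (which
takes sem->wait_lock) while rq->lock is still held. Everything named toy_*
and the lock_class enum are made up for illustration; this is not kernel
code and it only flags the one nesting from the report rather than doing
real dependency tracking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock classes for this toy model; not kernel types. */
enum lock_class { LOCK_RQ, LOCK_SEM_WAIT, LOCK_CLASS_COUNT };

struct held_locks {
	bool held[LOCK_CLASS_COUNT];
};

/*
 * Record an acquisition. Returns false when sem->wait_lock is taken
 * while rq->lock is held, the nesting shown in the lockdep report.
 */
static bool toy_acquire(struct held_locks *h, enum lock_class c)
{
	bool ok = !(c == LOCK_SEM_WAIT && h->held[LOCK_RQ]);

	h->held[c] = true;
	return ok;
}

static void toy_release(struct held_locks *h, enum lock_class c)
{
	assert(h->held[c]);	/* releasing a lock we never took is a bug */
	h->held[c] = false;
}

/* Fault path: models the mmap_sem trylock, which takes sem->wait_lock. */
static bool toy_page_fault(struct held_locks *h)
{
	bool ok = toy_acquire(h, LOCK_SEM_WAIT);

	toy_release(h, LOCK_SEM_WAIT);
	return ok;
}

/*
 * Scheduler path: take rq->lock, then touch a pointer. A NULL pointer
 * "faults" while rq->lock is still held. Returns false if the bad
 * nesting was observed.
 */
static bool toy_wakeup(struct held_locks *h, const int *se)
{
	bool ok = true;

	toy_acquire(h, LOCK_RQ);
	if (se == NULL)
		ok = toy_page_fault(h);
	toy_release(h, LOCK_RQ);
	return ok;
}
```

With a valid pointer toy_wakeup() reports no problem; with NULL it reports
the bad nesting. In this model the lockdep splat is a consequence of the
NULL dereference, not an independent bug.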

>
> Does the following untested patch fix the problem?
>
> NOT-signed-off-by: WANG Cong <xiyou.wangcong@xxxxxxxxx>
>
> ------
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 87f1f47..221ab59 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -2408,13 +2408,13 @@ static int try_to_wake_up(struct task_struct *p, unsigned int state,
>         if (p->sched_class->task_waking)
>                 p->sched_class->task_waking(rq, p);
>
> -       __task_rq_unlock(rq);
> +       task_rq_unlock(rq);
>
>         cpu = select_task_rq(p, SD_BALANCE_WAKE, wake_flags);
>         if (cpu != orig_cpu)
>                 set_task_cpu(p, cpu);
>
> -       rq = __task_rq_lock(p);
> +       rq = task_rq_lock(p);
>         update_rq_clock(rq);
>
>         WARN_ON(p->state != TASK_WAKING);
>