Re: sched: spinlock recursion in sched_rr_get_interval

From: Jovi Zhangwei
Date: Wed Sep 17 2014 - 05:13:46 EST


On Tue, Jul 8, 2014 at 4:05 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Mon, Jul 07, 2014 at 09:55:43AM -0400, Sasha Levin wrote:
>> I've also had this one, which looks similar:
>>
>> [10375.005884] BUG: spinlock recursion on CPU#0, modprobe/10965
>> [10375.006573] lock: 0xffff8803a0fd7740, .magic: dead4ead, .owner: modprobe/10965, .owner_cpu: 15
>> [10375.007412] CPU: 0 PID: 10965 Comm: modprobe Tainted: G W 3.16.0-rc3-next-20140704-sasha-00023-g26c0906-dirty #765
>
> Something's fucked; so we have:
>
> debug_spin_lock_before()
> SPIN_BUG_ON(lock->owner == current, "recursion");
>
> Causing that, _HOWEVER_ look at .owner_cpu and the reporting cpu!! How
> can the lock owner, own the lock on cpu 15 and again contend with it on
> CPU 0. That's impossible.
>
> About when-ish did you start seeing things like this? Lemme go stare
> hard at recent changes.

Peter, is there any update on this issue?

We recently hit a similar deadlock on one of our boxes, but with a
3.4-stable kernel.

<0>[177064.149832] BUG: spinlock recursion on CPU#11, current: IVS_RcvReq1/16444
<0>[177064.149840] lock: <NULL>/0xffff88017afd5640, .magic: dead4ead, .owner: IVS_RcvReq1/16444, .owner_cpu: 14
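
Both reports in this thread show the same mismatch: the CPU that prints
the BUG (CPU#0, CPU#11) is not the CPU recorded in .owner_cpu (15, 14),
even though .owner is the current task. A rough userspace sketch of that
comparison follows. It is not kernel code; struct recursion_report,
parse_report() and report_is_inconsistent() are hypothetical names made up
for illustration, and the parsing assumes the two log line layouts quoted
in this thread.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type, not a kernel structure: the two CPU numbers
 * that appear in a "BUG: spinlock recursion" report. */
struct recursion_report {
	int reporting_cpu;	/* N from "CPU#N" in the "BUG:" line */
	int owner_cpu;		/* value of ".owner_cpu:" in the "lock:" line */
};

/* Extract both CPU numbers from the two report lines.
 * Returns true only if both fields were found and parsed. */
bool parse_report(const char *bug_line, const char *lock_line,
		  struct recursion_report *out)
{
	const char *p = strstr(bug_line, "CPU#");
	const char *q = strstr(lock_line, ".owner_cpu:");

	if (!p || !q)
		return false;
	if (sscanf(p, "CPU#%d", &out->reporting_cpu) != 1)
		return false;
	if (sscanf(q, ".owner_cpu: %d", &out->owner_cpu) != 1)
		return false;
	return true;
}

/* A task that really re-took a spinlock it already holds would be
 * reporting from the CPU it took the lock on, so differing CPU numbers
 * mean the report is self-contradictory. */
bool report_is_inconsistent(const struct recursion_report *r)
{
	return r->reporting_cpu != r->owner_cpu;
}
```

Feeding it the two lines of the 3.4-stable report above yields
reporting_cpu 11 and owner_cpu 14, so report_is_inconsistent() returns
true. That flags the same impossibility Peter described, but it says
nothing about the cause.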

Thanks.

Jovi