Re: sched: spinlock recursion in sched_rr_get_interval

From: Li Bin
Date: Sat Dec 27 2014 - 04:03:51 EST


On 2014/12/26 15:01, Sasha Levin wrote:
> On 12/26/2014 01:45 AM, Li Bin wrote:
>> On 2014/7/8 4:05, Peter Zijlstra wrote:
>>>> On Mon, Jul 07, 2014 at 09:55:43AM -0400, Sasha Levin wrote:
>>>>>> I've also had this one, which looks similar:
>>>>>>
>>>>>> [10375.005884] BUG: spinlock recursion on CPU#0, modprobe/10965
>>>>>> [10375.006573] lock: 0xffff8803a0fd7740, .magic: dead4ead, .owner: modprobe/10965, .owner_cpu: 15
>>>>>> [10375.007412] CPU: 0 PID: 10965 Comm: modprobe Tainted: G W 3.16.0-rc3-next-20140704-sasha-00023-g26c0906-dirty #765
>>>>
>>>> Something's fucked; so we have:
>>>>
>>>> debug_spin_lock_before()
>>>> SPIN_BUG_ON(lock->owner == current, "recursion");
>>>>
>> Hello,
>> Would ACCESS_ONCE() help with this issue? I have no evidence that its absence
>> is responsible for the issue, but I think it is needed here. Is that right?
>>
>> SPIN_BUG_ON(ACCESS_ONCE(lock->owner) == current, "recursion");
>
> Could you explain a bit more why you think it's needed?
>

Oh, just adding ACCESS_ONCE may not be enough. I think reading lock->owner
without any lock protection is a risk. In short, the issue looks more like an
artifact of the spinlock debug mechanism than a real spinlock recursion.

...
// no lock protection here
if (lock->owner == current)            // first read of lock->owner
  |- spin_dump(lock, "recursion");
       |- if (lock->owner && lock->owner != SPINLOCK_OWNER_INIT)  // lock->owner read again
              owner = lock->owner;
...
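
To make the concern concrete, here is a minimal user-space sketch of reading
the owner field once and using that same snapshot for both the check and the
report. It is only an illustration, not the kernel code and not a proposed
patch: struct task, struct dbg_lock, OWNER_INIT and check_recursion() are
hypothetical stand-ins, and the volatile cast merely imitates what
ACCESS_ONCE() does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for task_struct and raw_spinlock_t. */
struct task {
	const char *comm;
	int pid;
};

struct dbg_lock {
	struct task *owner;
};

/* Hypothetical sentinel, playing the role of SPINLOCK_OWNER_INIT. */
#define OWNER_INIT ((struct task *)-1L)

/*
 * Load lock->owner exactly once and derive everything from that snapshot.
 * Returns 1 if the snapshot equals 'me' (apparent recursion), else 0.
 * '*seen' receives the owner to report, or NULL if the snapshot was
 * NULL or the init sentinel.
 */
static int check_recursion(struct dbg_lock *lock, struct task *me,
			   struct task **seen)
{
	struct task *owner;

	assert(lock != NULL && seen != NULL);

	/* Single load through a volatile-qualified pointer. */
	owner = *(struct task * volatile *)&lock->owner;

	*seen = (owner && owner != OWNER_INIT) ? owner : NULL;
	return owner == me;
}
```

With this shape, the value that is compared and the value that is reported
cannot differ. It does not make the unlocked read itself safe against
concurrent writers.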

Right, or am I missing something?
Thanks,
Li Bin

>
> Thanks,
> Sasha
>
>

