Re: ipc,sem: sysv semaphore scalability

From: Sasha Levin
Date: Tue Mar 26 2013 - 14:08:33 EST


On 03/26/2013 01:51 PM, Davidlohr Bueso wrote:
> On Tue, 2013-03-26 at 13:33 -0400, Sasha Levin wrote:
>> On 03/20/2013 03:55 PM, Rik van Riel wrote:
>>> This series makes the sysv semaphore code more scalable,
>>> by reducing the time the semaphore lock is held, and making
>>> the locking more scalable for semaphore arrays with multiple
>>> semaphores.
>>
>> Hi Rik,
>>
>> Another issue that came up is:
>>
>> [ 96.347341] ================================================
>> [ 96.348085] [ BUG: lock held when returning to user space! ]
>> [ 96.348834] 3.9.0-rc4-next-20130326-sasha-00011-gbcb2313 #318 Tainted: G W
>> [ 96.360300] ------------------------------------------------
>> [ 96.361084] trinity-child9/7583 is leaving the kernel with locks still held!
>> [ 96.362019] 1 lock held by trinity-child9/7583:
>> [ 96.362610] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff8192eafb>] SYSC_semtimedop+0x1fb/0xec0
>>
>> It seems that we can leave semtimedop without releasing the rcu read lock.
>>
>> I'm a bit confused by what's going on in semtimedop with regards to rcu read lock, it
>> seems that this behaviour is actually intentional?
>>
>> rcu_read_lock();
>> sma = sem_obtain_object_check(ns, semid);
>> if (IS_ERR(sma)) {
>> if (un)
>> rcu_read_unlock();
>> error = PTR_ERR(sma);
>> goto out_free;
>> }
>>
>> When I've looked at that it seems that not releasing the read lock was (very)
>> intentional.
>
> This logic was from the original code, which I also found to be quite
> confusing.

I wasn't getting this warning with the old code, so there was probably something
else that triggers this now.

>>
>> After that, the only code path that would release the lock starts with:
>>
>> if (un) {
>> ...
>>
>> So we won't release the lock at all if un is NULL?
>>
>
> Not necessarily, we do release everything at the end of the function:
>
> out_unlock_free:
> sem_unlock(sma, locknum);

Ow, there's a rcu_read_unlock() in sem_unlock()? This complicates things even
more I suspect. If un is non-NULL we'll be unlocking rcu lock twice?

if (un->semid == -1) {
rcu_read_unlock();
goto out_unlock_free;
}
[...]
out_unlock_free:
sem_unlock(sma, locknum);


Thanks,
Sasha
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/