Re: [PATCH -mm -next] ipc,sem: fix lockdep false positive

From: Rik van Riel
Date: Tue Mar 26 2013 - 11:19:37 EST


On 03/26/2013 10:27 AM, Peter Zijlstra wrote:
On Tue, 2013-03-26 at 06:40 -0700, Michel Lespinasse wrote:

sem_nsems is user provided as the array size in some semget system
call. It's the size of an ipc semaphore array.

So we're basically adding a random (big) number to preempt_count
(obviously while preemption is disabled), seems rather costly and
undesirable.
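
To put a number on the preempt_count concern, here is a minimal user-space sketch. It assumes the layout used by kernels of this era, where the low 8 bits of preempt_count hold the preempt-disable depth and the next 8 bits hold the softirq count; the helper names are made up for illustration and are not kernel APIs.

```c
/* Assumed layout: bits 0-7 preempt depth, bits 8-15 softirq count. */
#define PREEMPT_BITS	8
#define PREEMPT_MASK	((1u << PREEMPT_BITS) - 1)
#define SOFTIRQ_SHIFT	PREEMPT_BITS
#define SOFTIRQ_MASK	(0xffu << SOFTIRQ_SHIFT)

/*
 * Hypothetical helper: counter value after taking nlocks nested
 * spinlocks, each of which bumps the counter by one.
 */
static unsigned int count_after_locks(unsigned int nlocks)
{
	unsigned int preempt_count = 0;
	unsigned int i;

	for (i = 0; i < nlocks; i++)
		preempt_count++;	/* models preempt_disable() in spin_lock() */
	return preempt_count;
}

/* Preempt-disable depth as seen through the assumed 8-bit field. */
static unsigned int preempt_depth(unsigned int count)
{
	return count & PREEMPT_MASK;
}

/* Whatever spilled over into the assumed softirq field. */
static unsigned int softirq_field(unsigned int count)
{
	return (count & SOFTIRQ_MASK) >> SOFTIRQ_SHIFT;
}
```

Under these assumptions, an array of 256 semaphores leaves preempt_depth() at 0 and softirq_field() at 1, so the counter would claim that preemption is enabled and that a softirq is being processed.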

Complex semop operations take the array's lock plus every semaphore's
lock; simple semop operations (operating on a single semaphore) take
only that one semaphore's lock.

Right, standard global/local lock like stuff. Is there a way we can add
a r/o test to the 'local' lock operation and avoid doing the above?

That makes me wonder, how did mm_take_all_locks used to work before
we turned the anon_vma lock into a mutex?

The code used to use spin_lock_nest_lock, but still has the potential
to overflow the preempt counter. How did that ever work right?

Maybe something like:

void sma_lock(struct sem_array *sma) /* global */
{
	int i;

	sma->global_locked = 1;
	smp_wmb(); /* can we merge with the LOCK ? */
	spin_lock(&sma->global_lock);

	/* wait for all local locks to go away */
	for (i = 0; i < sma->sem_nsems; i++)
		spin_unlock_wait(&sma->sem_base[i].lock);
}

void sma_lock_one(struct sem_array *sma, int nr) /* local */
{
	smp_rmb(); /* pairs with wmb in sma_lock() */
	if (unlikely(sma->global_locked)) { /* wait for global lock */
		while (sma->global_locked)
			spin_unlock_wait(&sma->global_lock);
	}
	spin_lock(&sma->sem_base[nr].lock);
}
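
For anyone who wants to experiment outside the kernel, here is a rough user-space sketch of the same global/local idea using C11 atomics and pthreads. It is not the kernel code: the spinlock is a toy stand-in, the global_locked flag is folded into the global lock word itself, and the local path re-checks the global lock after taking its own lock. That re-check is my addition, since, as far as I can tell, a check-then-lock sequence leaves a window in which the global locker can slip in between the check and the spin_lock(). NSEMS, ITERS, the worker functions and run_demo() are all made up for the demonstration.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define NSEMS	4
#define ITERS	10000

/* Toy test-and-set spinlock, a stand-in for the kernel's spinlock_t. */
typedef atomic_int spinlock;

static void spin_lock(spinlock *l)
{
	int expected = 0;

	while (!atomic_compare_exchange_weak(l, &expected, 1)) {
		expected = 0;		/* CAS overwrote it on failure */
		sched_yield();
	}
}
static void spin_unlock(spinlock *l) { atomic_store(l, 0); }
static int spin_is_locked(spinlock *l) { return atomic_load(l); }
static void spin_unlock_wait(spinlock *l)
{
	while (spin_is_locked(l))
		sched_yield();
}

struct sem { spinlock lock; long val; };
struct sem_array { spinlock global_lock; struct sem sem_base[NSEMS]; };

static void sma_lock(struct sem_array *sma)		/* global */
{
	int i;

	spin_lock(&sma->global_lock);
	for (i = 0; i < NSEMS; i++)	/* wait for local holders to leave */
		spin_unlock_wait(&sma->sem_base[i].lock);
}
static void sma_unlock(struct sem_array *sma) { spin_unlock(&sma->global_lock); }

static void sma_lock_one(struct sem_array *sma, int nr)	/* local */
{
	for (;;) {
		spin_unlock_wait(&sma->global_lock);
		spin_lock(&sma->sem_base[nr].lock);
		if (!spin_is_locked(&sma->global_lock))
			return;		/* no global locker: we own it */
		spin_unlock(&sma->sem_base[nr].lock);	/* back off, retry */
	}
}
static void sma_unlock_one(struct sem_array *sma, int nr)
{
	spin_unlock(&sma->sem_base[nr].lock);
}

static struct sem_array demo_sma;	/* static storage: all locks start at 0 */

static void *local_worker(void *arg)
{
	int nr = *(int *)arg, i;

	for (i = 0; i < ITERS; i++) {
		sma_lock_one(&demo_sma, nr);
		demo_sma.sem_base[nr].val++;	/* plain, non-atomic update */
		sma_unlock_one(&demo_sma, nr);
	}
	return NULL;
}

static void *global_worker(void *arg)
{
	int i, j;

	(void)arg;
	for (i = 0; i < ITERS; i++) {
		sma_lock(&demo_sma);
		for (j = 0; j < NSEMS; j++)
			demo_sma.sem_base[j].val++;
		sma_unlock(&demo_sma);
	}
	return NULL;
}

/* Returns 1 if no increment was lost, 0 otherwise. Call it once. */
static int run_demo(void)
{
	pthread_t t[3];
	int nr = 0, i;

	pthread_create(&t[0], NULL, local_worker, &nr);
	pthread_create(&t[1], NULL, local_worker, &nr);
	pthread_create(&t[2], NULL, global_worker, NULL);
	for (i = 0; i < 3; i++)
		pthread_join(t[i], NULL);
	if (demo_sma.sem_base[0].val != 3L * ITERS)
		return 0;
	for (i = 1; i < NSEMS; i++)
		if (demo_sma.sem_base[i].val != ITERS)
			return 0;
	return 1;
}
```

Two local workers hammer semaphore 0 while one global worker updates every semaphore, so any hole in the exclusion would show up as a lost increment. Note that the global path still spins over all NSEMS locks with no way to sleep.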

That is essentially a read-only version of the global rwlock that
I originally proposed, where the global lock takes the lock for
write and the single version takes the global lock for read, and
then one of the semaphore spinlocks.

I could certainly implement and test the above, unless Linus
thinks it's too ugly to live :)

This still has the problem of a non-preemptible section of O(sem_nsems)
(with the avg wait-time on the local lock). Could we make the global
lock a sleeping lock?

Not without breaking your scheme above :)

I suppose making things into a sleeping lock should be possible,
but that is another major change in this code. I would rather do
things in smaller steps...

--
All rights reversed.