PROBLEM: SysV semaphore race vs SIGSTOP

From: Ove Kaaven
Date: Fri Jan 28 2005 - 17:57:28 EST


There seems to be a race when SIGSTOP-ing a process waiting for a SysV
semaphore. Even if it could not possibly have owned the semaphore when
the signal was sent (because the sender of the signal owned it at the
time), it still occasionally happens that it both stops execution *and*
acquires the semaphore, with a deadlocked application as the result.
This is a problem for some of the high-performance stuff I'm working on.

A sample test program exhibiting the problem is available at
http://www.ping.uio.no/~ovehk/sembug.c

For me, it will show "ACQUIRE FAILED!! DEADLOCK!!" almost every time I
run it. Occasionally it will run fine; if it does for you, just try
again a couple of times.
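
For readers without access to the URL, here is a minimal sketch of the
scenario described above. It is not the actual sembug.c. The function
names (sem_adjust, sem_stop_trial), the 100 ms settle delay, the 2 s
timeout and the use of semtimedop() to detect the deadlock are all
assumptions made for illustration. The parent "owns" the semaphore (value
0), the child blocks on it, the parent sends SIGSTOP, releases, and then
tries to re-acquire.

```c
#define _GNU_SOURCE            /* semtimedop() is a Linux extension */
#include <errno.h>
#include <signal.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* On Linux the caller has to define union semun (see semctl(2)). */
union semun { int val; struct semid_ds *buf; unsigned short *array; };

/* Add delta to semaphore 0; wait at most timeout_secs if it is > 0,
 * otherwise wait indefinitely. (Hypothetical helper.) */
static int sem_adjust(int semid, short delta, int timeout_secs)
{
    struct sembuf op = { .sem_num = 0, .sem_op = delta, .sem_flg = 0 };
    struct timespec ts = { .tv_sec = timeout_secs, .tv_nsec = 0 };
    return semtimedop(semid, &op, 1, timeout_secs > 0 ? &ts : NULL);
}

/* One trial. Returns 0 if the parent re-acquired the semaphore,
 * 1 if the re-acquire timed out (the stopped child seems to hold it),
 * -1 on any other error. (Hypothetical function.) */
int sem_stop_trial(void)
{
    union semun arg;
    int semid, rc, result;
    pid_t child;

    semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid < 0)
        return -1;
    arg.val = 0;                       /* 0 = currently owned by parent */
    if (semctl(semid, 0, SETVAL, arg) < 0) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    child = fork();
    if (child < 0) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    if (child == 0) {
        /* Child: block until the semaphore is free, then give it back. */
        if (sem_adjust(semid, -1, 0) == 0)
            sem_adjust(semid, 1, 0);
        _exit(0);
    }

    usleep(100000);                    /* crude: let the child block first */
    kill(child, SIGSTOP);              /* stop the waiter ...             */
    sem_adjust(semid, 1, 0);           /* ... release the semaphore ...   */
    rc = sem_adjust(semid, -1, 2);     /* ... and try to take it back     */
    if (rc == 0)
        result = 0;
    else if (errno == EAGAIN)          /* semtimedop() timed out */
        result = 1;
    else
        result = -1;

    kill(child, SIGKILL);              /* clean up child and semaphore */
    waitpid(child, NULL, 0);
    semctl(semid, 0, IPC_RMID);
    return result;
}
```

A caller would run sem_stop_trial() in a loop and report a deadlock
whenever it returns 1. Whether and how often that happens depends on the
kernel and on scheduling, so a single clean run proves nothing.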

The kernel I currently use is:

Linux version 2.4.27-1-k7 (horms@xxxxxxxxxxxxxxxxxxxxxxxxxxx) (gcc
version 3.3.5 (Debian 1:3.3.5-2)) #1 Wed Dec 1 20:12:01 JST 2004

and I run it on a uniprocessor system (AMD Athlon, 1.9GHz) with Debian
"sid" installed.

I'm not a kernel hacker, but from a quick perusal of the 2.4 code, it
didn't seem to me that the semaphore code in the kernel (ipc/sem.c) even
tries to handle suspended threads (though I wouldn't know how to do so).
The 2.6 semaphore code looked almost the same to me, too, so it might be
a problem there as well.

Please Cc me on any questions or comments, since I am too wimpy to
subscribe yet.
