Re: Why does Linux not implement pthread_suspend() and pthread_resume()?

From: Theodore Ts'o
Date: Sun Jan 14 2024 - 21:35:47 EST


On Sun, Jan 14, 2024 at 12:20:04PM +0100, Dr. Henning Kopp wrote:
>
> I found one answer on stackoverflow [1] that mentions that pthread_suspend
> and pthread_resume_np is in the "Unix specification"

This is not correct. It is *not* part of the Single Unix
Specification. The list of what is in the SUS can be found here:

https://pubs.opengroup.org/onlinepubs/7908799/xsh/threads.html

HP-UX seems to have implemented pthread_suspend, but it is not a formal
part of the Posix or Single Unix Specification's definition of Posix
Threads.

> I read "man 7 pthreads". It mentions that there are two Linux
> implementations of Posix threads, that differ in some details from the Posix
> spec. However, it does not mention suspending or resuming threads at all.

It states that LinuxThreads was shipped as part of glibc until 2.4.
Since 2.4, NPTL is the implementation that has been shipped. Note
that glibc 2.4 was shipped in 2006. It's just that the man page is
quite old, and there is some information which is mostly ancient
history that hasn't been removed yet. There are also comments about
various aspects of NPTL that weren't fully POSIX compliant until
various 2.6.x kernels --- well the Linux 3.0 kernel was released in
2011, so again, there's just a lot of stuff there which can be safely
ignored as no longer relevant.

> So my question is: What is the reason that Linux does not implement
> functions for suspending and resuming threads?

Quoting from the Linux Threads FAQ:

E.4: How can I suspend and resume a thread from another thread?
Solaris has the thr_suspend() and thr_resume() functions to do
that; why don't you?

The POSIX standard provides no mechanism by which a thread A can
suspend the execution of another thread B, without cooperation from
B. The only way to implement a suspend/restart mechanism is to have
B check periodically some global variable for a suspend request and
then suspend itself on a condition variable, which another thread
can signal later to restart B.

Notice that thr_suspend() is inherently dangerous and prone to race
conditions. For one thing, there is no control on where the target
thread stops: it can very well be stopped in the middle of a
critical section, while holding mutexes. Also, there is no
guarantee on when the target thread will actually stop. For these
reasons, you'd be much better off using mutexes and conditions
instead. The only situations that really require the ability to
suspend a thread are debuggers and some kind of garbage collectors.

https://www.enseignement.polytechnique.fr/profs/informatique/Leo.Liberti/public/computing/parallel/threads/linuxthreads/linuxthreads-FAQ.html#E

Yes, LinuxThreads has been obsolete since 2006. But the rationale
that (a) suspending threads is dangerous, and handing a footgun to
application writers might not be wise, and (b) therefore the Posix
specification does not include the capability to suspend a thread, is
still true.
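
For what it's worth, the cooperative scheme the FAQ describes (B
polls a flag and parks itself on a condition variable) can be
sketched roughly as below. The names here (struct suspend_ctl,
suspend_request(), suspend_release(), suspend_checkpoint()) are made
up for illustration and are not part of any Posix or glibc API; treat
it as a minimal sketch, not a definitive implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper type (not part of any pthreads API): one
 * instance per worker thread that should be suspendable. */
struct suspend_ctl {
    pthread_mutex_t lock;
    pthread_cond_t  resume;            /* signalled when the request is cleared */
    bool            suspend_requested; /* the flag that thread B polls */
};

#define SUSPEND_CTL_INIT \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false }

/* Thread A: ask B to park at its next checkpoint.  This returns
 * immediately; it does not wait for B to actually stop. */
void suspend_request(struct suspend_ctl *c)
{
    assert(c != NULL);
    pthread_mutex_lock(&c->lock);
    c->suspend_requested = true;
    pthread_mutex_unlock(&c->lock);
}

/* Thread A: clear the request and wake B if it is parked. */
void suspend_release(struct suspend_ctl *c)
{
    assert(c != NULL);
    pthread_mutex_lock(&c->lock);
    c->suspend_requested = false;
    pthread_cond_broadcast(&c->resume);
    pthread_mutex_unlock(&c->lock);
}

/* Thread B: call periodically, at points where B holds no other
 * locks.  Blocks for as long as a suspend request is pending. */
void suspend_checkpoint(struct suspend_ctl *c)
{
    assert(c != NULL);
    pthread_mutex_lock(&c->lock);
    while (c->suspend_requested)       /* loop guards against spurious wakeups */
        pthread_cond_wait(&c->resume, &c->lock);
    pthread_mutex_unlock(&c->lock);
}
```

The worker would call suspend_checkpoint() once per loop iteration,
outside any critical section, so it never parks while holding a mutex.
Note that suspend_request() only posts the request; as the FAQ says,
there is no guarantee on when the target thread will actually stop.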

There is a non-standard way that you can suspend an individual thread
--- you need to get the Linux tid (note: *not* the pthread id; that's
different) and then you can send a SIGSTOP signal to the Linux thread
which will cause the kernel to suspend the thread, and you can send it
a SIGCONT signal that will cause the kernel to resume the thread.
This is not something that can be intercepted by the application, and
so there's nothing the Posix Thread library can do to make things
better for the application. If the application happens to be in a
critical section, holding some mutex when it is no longer allowed to
run, it might cause your program to wedge until the SIGCONT signal
is sent to the thread, and this might not be what you want.

Cheers,

- Ted