Re: Slow pthread_create() under high load

From: Linus Torvalds (torvalds@transmeta.com)
Date: Wed Mar 29 2000 - 14:13:12 EST


On Wed, 29 Mar 2000, Christopher Smith wrote:
>
> The main problem is, assuming this is being done in userland (as you
> pointed out it's silly to constantly do syscalls to get your PID
> unless you REALLY expect it to change a lot and it's hard to trap the
> points where it changes) while internally the application may believe
> it's all the same PID, to external programs there are still multiple
> PIDs. This is particularly relevant when it comes time to use
> signals, but I imagine has implications beyond this.

It does have implications beyond this, but that is both a feature and a
downside.

The feature, right now, is that external programs _can_ see individual
threads, and can send signals to them individually (and manipulate them
individually in other ways too - ptrace() etc). That is important, and in
a very real way you should think of the "kernel pid" as a global thread
ID. Something that uniquely identifies a thread (not just within a
process, but in general). And such a global thread ID is important for
exactly the reasons mentioned.
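
To make the "global thread ID" idea concrete, here is a minimal sketch that
assumes a present-day Linux system and Python 3.8 or later, neither of which
is part of the original discussion. threading.get_native_id() reports the
kernel's ID for the calling thread; the helper name collect_native_ids is
made up for this illustration.

```python
import os
import threading


def collect_native_ids(n_threads=4):
    """Start n_threads threads and return the kernel-level ID each one reports."""
    ids = []
    lock = threading.Lock()
    # The barrier keeps every thread alive until all have recorded their ID,
    # so the kernel cannot recycle an ID within this run.
    barrier = threading.Barrier(n_threads)

    def worker():
        tid = threading.get_native_id()  # kernel thread ID (gettid() on Linux)
        with lock:
            ids.append(tid)
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids


if __name__ == "__main__":
    print("process ID:", os.getpid())
    print("main thread native ID:", threading.get_native_id())
    print("worker native IDs:", collect_native_ids())
```

Each worker reports a different ID, and none of them matches the main
thread's, which is the sense in which the ID identifies a thread "in
general" rather than only within its process.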

The downside is that it looks quite ugly in "ps" listings etc, and there
would certainly be advantages to hiding the "subthreads" in some sense, so
that while the global thread ID _exists_, you don't normally see it. One
way of doing that would be to only show unique VMs in /proc/<pid>/xxx,
and the "subthreads" would show up as /proc/<masterpid>/<subpid>/xxx or
something.
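
For comparison, current Linux kernels expose a layout in this spirit: the
threads of a process appear under /proc/<pid>/task/<tid>/. The sketch below
reads that directory. It describes today's systems rather than the kernel
being discussed here, and list_thread_ids is a hypothetical helper name.

```python
import os


def list_thread_ids(pid):
    """Return the kernel thread IDs listed under /proc/<pid>/task, sorted.

    Returns an empty list where that directory does not exist (non-Linux
    systems, or a process that has already exited).
    """
    task_dir = f"/proc/{pid}/task"
    try:
        entries = os.listdir(task_dir)
    except OSError:
        return []
    return sorted(int(name) for name in entries if name.isdigit())


if __name__ == "__main__":
    print(list_thread_ids(os.getpid()))
```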

The way these things should work, in my opinion, is that when you
externally send a signal to the "main thread" (the one whose 'pid' the
thread collection sees), that signal gets distributed in the POSIX signal
sense to all the subthreads. But you should still be able to send a signal
to a specific subthread by using _its_ "native pid" aka "thread id".
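
The "signal one specific thread" half of this can be sketched inside a
single process with the POSIX thread-directed signal calls that Python
exposes (signal.pthread_kill and signal.sigwait, POSIX systems only). This
is an in-process analogue, not the external kill-by-pid interface described
above, and the function name signal_one_thread is an assumption made for
the example.

```python
import signal
import threading


def signal_one_thread():
    """Send SIGUSR1 to thread "B" only and report which waiters woke, in order."""
    sig = signal.SIGUSR1
    # Block the signal before starting threads; new threads inherit the mask,
    # so the signal stays pending for its target until sigwait() collects it.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {sig})
    woke = []

    def waiter(name):
        signal.sigwait({sig})  # returns once a pending sig is available here
        woke.append(name)

    try:
        a = threading.Thread(target=waiter, args=("A",))
        b = threading.Thread(target=waiter, args=("B",))
        a.start()
        b.start()

        signal.pthread_kill(b.ident, sig)  # thread-directed: only B is targeted
        b.join()
        a_still_waiting = a.is_alive()

        signal.pthread_kill(a.ident, sig)  # release A so the demo can finish
        a.join()
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return woke, a_still_waiting


if __name__ == "__main__":
    print(signal_one_thread())
```

Thread A stays blocked after B has been signalled, which shows that the
first signal reached only the thread it was addressed to.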

> > - things like the above are just so much better and more easily done in
> > user space anyway.
>
> While true, there are unfortunately system wide issues to threads
> (such as what I pointed out) which require at least some minimal
> kernel support to make them possible. I think the kernel's clone()
> functionality provides almost all, if not all, of what you need. I
> will look at it more closely (to date I've mostly been looking at the
> userland interface as that's what I actually care about).

Note that when I started doing clone(), I basically said: "this is how I
think threads should be done". I added a few example flags to show the
concept, without really having a firm plan on what the final situation
would be. Some of those flags got expanded upon (CLONE_PARENT is only the
latest addition), while some ended up not being very useful at all
(CLONE_PID is basically useless - the only use for it is to start up the
original idle threads under SMP, and that code is so specialized anyway
that it could basically do the CLONE_PID logic by hand).

There are bound to be more issues. I've seen patches floating around that
expand it, and especially in signal handling SOMETHING has to be done. I
don't think "share all signal queues" is the right answer: I suspect
the right answer to the signal handling issue is to have a "private" queue
(the regular one) along with a separate method of handling "shared" queues
and a way to attach to a shared signal queue.

Shared signals are potentially useful outside pure threading models too,
and I'm looking for something more generic. I suspect that what I'm
looking for is more like a message list, along with some thin
compatibility code that makes it look like signals, to keep pthreads
emulation easy.
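
As a rough illustration of the "private queue plus attachable shared queue"
idea, here is a toy user-space model. Everything in it is an assumption made
for the sketch: the class names Task and SharedQueue, the attach() call, and
the rule that a task drains its private queue before looking at shared ones.
None of it reflects an actual kernel interface.

```python
from collections import deque


class SharedQueue:
    """Hypothetical shared signal queue that tasks can attach to."""

    def __init__(self):
        self.pending = deque()

    def attach(self, task):
        # Attaching just makes this queue visible to the task.
        task.shared.append(self)

    def post(self, sig):
        self.pending.append(sig)


class Task:
    """Hypothetical task: one private queue plus any attached shared queues."""

    def __init__(self, name):
        self.name = name
        self.private = deque()
        self.shared = []

    def send(self, sig):
        # Task-directed signal: lands in the private queue only.
        self.private.append(sig)

    def next_signal(self):
        """Take the next signal: private queue first (assumed), then shared."""
        if self.private:
            return self.private.popleft()
        for queue in self.shared:
            if queue.pending:
                return queue.pending.popleft()
        return None


if __name__ == "__main__":
    a, b = Task("a"), Task("b")
    group = SharedQueue()
    group.attach(a)
    group.attach(b)
    a.send("SIGUSR1")        # only task a can see this one
    group.post("SIGTERM")    # whichever attached task asks first takes it
    print(b.next_signal(), a.next_signal())
```

A signal posted to the shared queue is consumed by exactly one attached
task, while a signal sent to a task never becomes visible to the others.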

That's kind of my gripe in general - I think there is a bigger picture
than just plain pthreads. Like clone(), let's do this right.

                Linus
