Re: Slow pthread_create() under high load

From: Albert D. Cahalan (acahalan@cs.uml.edu)
Date: Thu Mar 30 2000 - 04:13:24 EST


Linus Torvalds writes:
> On Wed, 29 Mar 2000, Christopher Smith wrote:

> The downside is that it looks quite ugly in "ps" listings etc, and there
> would certainly be advantages to hiding the "subthreads" in some sense, so
> that while the global thread ID _exists_, you don't normally see it. One
> way of doing that would be to only show unique VM's in /proc/<pid>/xxx,
> and the "subthreads" would show up as /proc/<masterpid>/<subpid>/xxx or
> something.

I don't think "unique VM's" is right for Linux. Tasks that share
files are related just as much as those that share memory.

I'll be glad to fix "ps".

Would you accept a patch that assigned a new XID for every new
clone(0) child and also for every exec when the XID is shared?
This would let "ps" handle both POSIX and non-POSIX threads.

Would you accept such a patch that used a very large ID space?
If the XID were to suffer from wrap-around, "ps" would suffer
from a dangerous race condition that would make unrelated tasks
get shown as one process. I think I'd use a separate 64-bit
counter for each CPU, initialized to (this_cpu_number<<48).

Would you accept a patch that exposed identifiers for every
shareable part of a task? (kernel pointer values would work)
Example: a group of 8 related tasks might have 5 address
spaces total, which "ps" must use to determine total memory.

Would you accept a patch that made /proc listings output all
related tasks (those with the same XID) one after the other?
Without this, "ps" performance on large systems drops greatly.
(note that "ps" is unsorted by default for this reason)

Would you accept a patch that, for any group of related tasks,
prevented reuse of the initial PID by an unrelated task?
While a well-behaved thread package would not let the initial
task exit early, an evil thread package may well seek to
confuse the system administrator.

Would you accept a patch that prevented such reuse by returning
error codes?

Would you accept a patch that prevented such reuse by turning the
initial task into a zombie that waits for all related tasks
before it itself can be waited for?

Would you accept a patch that prevented such reuse by ripping
apart the task group, assigning new XID values to every task
in the group? (thus causing messy "ps" output again)

Would you accept a patch that prevented such reuse by simply
sending SIGKILL to any related task?

> The way these things should work, in my opinion, is that when you
> externally send a signal to the "main thread" (the one whose 'pid' the
> thread collection sees), that signal gets distributed in the POSIX signal
> sense to all the subthreads. But you should still be able to send a signal
> to a specific subthread by using _its_ "native pid" aka "thread id".

Would you accept a patch that implemented this ability by adding
a second kill() system call, perhaps with a scope flag?

Would you accept a patch that allowed kill-by-XID, so that the
PID wrap-around race condition would be eliminated? This could
be unrelated to the above, or it could be just a different scope.

> There are bound to be more issues. I've seen patches floating around that
> expand it, and especially in signal handling SOMETHING has to be done. I
> don't think the "share all signal queues" is the right answer: I suspect
> the right answer to the signal handling issue is to have a "private" queue
> (the regular one) along with a separate method of handling "shared" queues
> and a way to attach to a shared signal queue.
>
> Shared signals are potentially useful outside pure threading models too,
> and I'm looking for something more generic. I suspect that what I'm
> looking for is more like a message list, along with some thin
> compatibility code to make it easy for pthreads emulation that looks like
> signals..

Eh, not async SysV message queues I hope.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Mar 31 2000 - 21:00:26 EST