Re: SCO: "thread creation is about a thousand times faster than on native Linux"

From: Linus Torvalds (torvalds@transmeta.com)
Date: Thu Aug 24 2000 - 11:23:34 EST


On Thu, 24 Aug 2000, Andi Kleen wrote:
>
> Wouldn't it make more sense to extend the current process group concept ?
> A process could be in two groups, the thread group and the process group
> with the pid of the session group leader.

The problem is that I can certainly see a process group leader that is
_also_ a "tgid leader".

Basically, imagine any process group or session leader that wants to be
threaded itself. Now there is a lot of confusion about what "-tgid" means,
as "tgid == pgid" and thus obviously "-tgid == -pgid", making sys_kill()
have a really hard time deciding which to kill.

> > Now, the problem is that the thread group kill thing for true POSIX
> > threads signal behaviour probably has to do some strange magic to get the
> > pthreads signal semantics right. I don't even know the exact details here,
> > so somebody who _really_ knows pthreads needs to look long and hard at
> > this (efficiency here may require that we have a circular list of each
> > "thread ID group" - ie that we add the proper process pointer list that
> > gets updated at fork() and exit() so that we can easily walk every process
> > in the process group list).
>
> POSIX wants to send the signal to the first thread in the group who
> doesn't have it blocked.
>
> Several signals are special cased in POSIX, e.g. SIGSTOP, and need to
> handled by all threads in the group.

Right, which is why we probably do need the list for efficiency.

Note that the "first thread that doesn't block it" decision is the really
nasty one. If all threads have it blocked, we need to have it pending. But
we can only have it pending for _one_ thread. I think POSIX allows this,
but I'm not sure.

> For good behaviour you need a shared sigprocmask(). (I just ran into a
> situation where shared signal blocking would have been very useful on Linux).
> You basically want to protect your data structures that could be accessed
> by signals against signals send to any thread, otherwise sigprocmask
> are pretty useless.

I _really_ really want to avoid this. I think POSIX is vague on the
requirement, and quite frankly, a shared sigprocmask() is a horror. It
really doesn't fit in, at all.

We can add a tgid really easily. The code already exists, and we just need
to extend it in the straightforward way with one more test.

We can add a new process list really easily. Maintaining the circular list
of processes that share the same tgid is trivial. We have all the
infrastructure in place, and it basically becomes something like

 - fork.c:

        INIT_LIST_HEAD(&p->thread_group);
        p->tgid = p->pid;
        if (flags & CLONE_PID) {
                p->tgid = current->pid;
                list_add(&p->thread_group, &current->thread_group);
        }

 - exit.c:

        list_del(&current->thread_group);

and you're pretty much done (the only additional small detail is to make
the above happen in the right place: together with the other list handling
so that it is all protected by the "tasklist_lock" and is SMP-safe).

Basically, we can add thread groups with about 10 lines worth of diffs,
and they will be "obviously correct". That means that if this helps
LinuxThreads, it can happen before 2.4.0-final.

In contrast, a shared sigprocmask() simply isn't going to happen. That is
a 2.5.x issue, and even in 2.5.x I'd really like to avoid it, because it
would be a design mistake, I suspect.

> [Earlier there were proposals to add a CLONE_WAITPID for that, but I think
> controlling it via the tid and prctl would be more elegant and flexible]

I'd prefer CLONE_WAITPID, I think.

> Another thing would be shared credentials. I'm sure there are portd
> programs who have security bugs on Linux because they expect setuid() to be
> process global, and it is local.

Again, this is not going to happen. At least not quickly.

I also think that anybody who depends on global credentials is just buggy.
You _have_ to synchronize anyway, otherwise you have race city with
horrible security issues. Basically, it's a bad idea.

> To solve the problem of system management tools (top) etc. counting a single
> shared mm_struct multiple times [threaded staroffice looks really funny in
> gtop]

I'd rather have this just look at the tgid, and be done with it.

Basically, a shared tgid says: "this is a thread group". Regardless of
whether it shares it's VM space or not. So I think system management tools
should honour that, and not care about CLONE_VM.

Yes, you can make it do strange things, and it means that top would look
nice only after people upgrade to a newer library with newer pthreads()
stuff etc, but hey, I'm interested in the _design_, not in making top look
pleasant right now.

                Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Aug 31 2000 - 21:00:13 EST