From: Albert D. Cahalan (
Date: Thu Aug 24 2000 - 21:34:50 EST

Linux Torvalds writes: [various messages]

> The current sys_kill() logic is not going away. The current "p->pid" is
> not changing. None of this changes existing behaviour in the setup you
> describe.

To reduce confusion, I'd like to change the name.

pid becomes task_id
tgid becomes posix_pid

The terms "pid" and "thread" get eliminated from the kernel.

> (An independent issue is then whether to decide to say "ok, we'll actually
> put the new sys_tgkill() in the same position as the old sys_kill() in the
> system call table, so that old user binaries automatically get the new
> pthreads compatible kill capability". Note that this is also backwards
> compatible, because sys_tgkill() and sys_kill() are actually 100% the same
> as long as CLONE_PID isn't used ;)

This is good to do.

> So basically the only way to trigger the "thread-group-wide" signals would
> be by doing so explicitly. Which we can choose to do on a case-by-case
> basis inside the kernel, of course (so the tty layer may decide to use the
> thread-group version of signal handling, while the SIGIO layer probably
> really should _not_ do that).

The tty layer stuff is defined... this isn't a "may".

> We could make it more dynamic (ie make the exact behaviour a per-signal
> flag or whatever), but that's beyond the scope of any 2.4.x "Let's get
> LinuxThreads working well quickly" kind of discussion.

If signals get fixed for 2.4.x, then this is a trivial extra.

> Oh, the signal handling changes are fine as outlined in "sys_tgkill()".
> That's something simple and straightforward, and has no impact on anything
> but sys_tgkill().
> That is true as long as signals are still a per-thread thing. Which
> appears to be ok by POSIX.
> It's the shared signals stuff that simply cannot happen right now.

Per-thread alone isn't allowed. For 2.5.x at least, there must be
a common queue in addition to the per-task ones.

> HOWEVER, I suspect that pthreads compatibility with signals may require us
> to have that thread root process (even if it isn't used for anything
> else), because I think that makes our signal handling be POSIX-conformant:
> if I remember correctly POSIX does allow the notion of having signals
> handled in a special thread that doesn't do anything else.

This might not need kernel support.

>> POSIX wants to send the signal to the first thread in the group who
>> doesn't have it blocked.
>> Several signals are special cased in POSIX, e.g. SIGSTOP, and need to
>> handled by all threads in the group.
> Right, which is why we probably do need the list for efficiency.
> Note that the "first thread that doesn't block it" decision is the really
> nasty one. If all threads have it blocked, we need to have it pending. But
> we can only have it pending for _one_ thread. I think POSIX allows this,
> but I'm not sure.

Nope, the first thread to unblock will get the signal. If this gets
too messy it can be saved for 2.5.x of course.

>> For good behaviour you need a shared sigprocmask(). (I just
>> ran into a situation where shared signal blocking would have
>> been very useful on Linux). You basically want to protect your
>> data structures that could be accessed by signals against
>> signals send to any thread, otherwise sigprocmask are pretty
>> useless.
> I _really_ really want to avoid this. I think POSIX is vague
> on the requirement, and quite frankly, a shared sigprocmask()
> is a horror. It really doesn't fit in, at all.

>From a quick glance at SUSv2, I'd say n+1 are required.
One per thread, plus another one for the process.

I'll have to study the SUSv3 draft a bit to be sure.

>> Another thing would be shared credentials. I'm sure there are portd
>> programs who have security bugs on Linux because they expect setuid()
>> to be process global, and it is local.
> Again, this is not going to happen. At least not quickly.

Truly shared, with locking and all, would be a problem...
but how about pushing out changes to all the other tast structs?
This ought to be SMP-friendly I think.

>> To solve the problem of system management tools (top) etc.
>> counting a single shared mm_struct multiple times [threaded
>> staroffice looks really funny in gtop]
> I'd rather have this just look at the tgid, and be done with it.

That isn't quite enough, because these tools need to know if they
should be adding up numbers or not.

Example: you have four tasks and two address spaces of 80 MB each.
The "top" command should display that 160 MB is in use.

VM identifiers and/or pre-calculated values must be supplied.
I suppose you'd accept whatever can be done without locking?

> Acceptable solution:
> - add "tgid" (thread group ID) to "struct task_struct"
> - CLONE_PID will leave tgid unchanged.
> - non-CLONE_PID will set "tgid" to the same as pid
> - get_pid() checks that the new pid is not the tgid of any process.

OK. Existing kernel users of CLONE_PID get CLONE_SMP instead?
(that is, they get their own bit)

Might it be OK to severely change get_pid() for speed?
I'm thinking that a per-CPU pool of free PIDs would be good,
with free bits migrating in large blocks as needed. Also it
might be nice to have (pid & SLOT_MASK) be usable to look up
a pid number.

> [...] I can't say that I've seen any really convincing arguments
> for "performance" and "Java" yet.

The arguments for "performance" and "Transmeta" should apply.
An OS with good Java support would share translated code across
users and would not give Java processes unique address spaces.


To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
Please read the FAQ at

This archive was generated by hypermail 2b29 : Thu Aug 31 2000 - 21:00:15 EST