Re: newly released clone() based pthreads package

Stephen C. Tweedie
Wed, 24 Jul 1996 11:43:25 +0100


On Mon, 22 Jul 1996 18:30:53 -0400 (EDT), joot
(Peeter Joot) said:

> I haven't tried out your clone based POSIX threads package yet, but I
> noticed in the README there was a comment about not being able
> to share pid's. I was looking in the kernel source ( 2.0.0 )
> last night, and it seems to me that all you have to do in order
> to share the pid's is set CLONE_PID in the clone flags.
> Have you tried this?

That won't work --- you can't send signals reliably between threads if
you use CLONE_PID, since the signal will just end up being sent to a
single one of the threads at random.

> I haven't really looked into the kernel scheduler code, but was
> wondering about ways to get rid of some of the extra overhead
> incurred context switching between two clone() based threads of the
> same process.

There is very little overhead. Threads share the same mmu context,
and we avoid switching that context when we switch threads as far as
possible.

> There are probably some changes that could be made to the kernel to
> lower the overhead of switching between two threads of the same
> process. The one that I can think of is sharing of all thread invariant
> task_struct data.

Please look at the task_struct. :) We already do this!

> When the clone() ( do_fork() ) routine is called a new struct
> task_struct is allocated, and the clone()'ing process's entire
> task_struct is copied. Perhaps it would be a good idea to rework
> task_struct with threading in mind -- have it contain a pointer to a
> structure that contains the process information that is thread
> invariant,

We do. Have a look at linux/include/linux/sched.h; the task_struct
contains things which are thread-specific, such as pids, signal
information and so on. Shared information, including the file
descriptors, signal handlers (if shared) and mmu context, are
contained in separate structs which are pointed to by the task_struct
and which are shared by all task_structs in the thread group.
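
As a rough illustration of that layout, here is a minimal sketch in C.
It is not the definition from sched.h: every name in it (sk_task, sk_mm,
sk_files, the SK_SHARE_* flags) is hypothetical, and the real structures
carry far more state. It only shows the idea of per-thread fields living
in the task itself while shared state sits behind reference-counted
pointers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the shared structures; not the kernel's types. */
struct sk_mm    { int count; };   /* users of this address space */
struct sk_files { int count; };   /* users of this descriptor table */

/* Hypothetical sharing flags for this sketch only. */
enum { SK_SHARE_VM = 1, SK_SHARE_FILES = 2 };

struct sk_task {
    int pid;                 /* per-thread */
    unsigned long pending;   /* per-thread pending-signal mask */
    struct sk_mm *mm;        /* shared when SK_SHARE_VM is set */
    struct sk_files *files;  /* shared when SK_SHARE_FILES is set */
};

/* Build a first task that owns fresh copies of everything. */
struct sk_task *sk_task_new(int pid)
{
    struct sk_task *t = calloc(1, sizeof *t);
    struct sk_mm *mm = calloc(1, sizeof *mm);
    struct sk_files *files = calloc(1, sizeof *files);

    if (!t || !mm || !files) {
        free(t);
        free(mm);
        free(files);
        return NULL;
    }
    mm->count = 1;
    files->count = 1;
    t->pid = pid;
    t->mm = mm;
    t->files = files;
    return t;
}

/* Clone: per-thread fields are fresh; each shared struct is either
 * referenced again (count bumped) or left as a private copy. */
struct sk_task *sk_task_clone(struct sk_task *parent, int pid, int flags)
{
    struct sk_task *t;

    assert(parent != NULL);
    t = sk_task_new(pid);
    if (!t)
        return NULL;
    if (flags & SK_SHARE_VM) {
        free(t->mm);
        t->mm = parent->mm;
        t->mm->count++;
    }
    if (flags & SK_SHARE_FILES) {
        free(t->files);
        t->files = parent->files;
        t->files->count++;
    }
    return t;
}
```

Cloning with both flags set gives a new task whose mm and files pointers
equal the parent's, so only the small per-thread part is duplicated.
Cloning with no flags gives a task with its own private copies, which is
the ordinary fork case.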

> If the overhead between switching such threads could be reduced, then
> perhaps the scheduler could give a thread context switch a higher
> priority than a process context switch since it wouldn't take as much
> work to do. I don't know how feasible this is, however.

Very easy. The "goodness" function in linux/kernel/sched.c already
does some weighting to, for example, give preference to running the
current process over context switching and (on SMP) to try to keep
processes running on the same CPU as far as possible. The lines

	/* .. and a slight advantage to the current process */
	if (p == prev)
		weight += 1;

could easily be changed to

	/* .. and a slight advantage to the current mmu context */
	if (p->mm == prev->mm)
		weight += 1;

Linus --- thoughts? This ought to work OK, and we could even increase
the weight bias without much affecting the fairness of the scheduler
(since, ultimately, we will still be exhausting the counter credit of
all runnable processes before recrediting, so we just change the order
of scheduling the running processes and not the amount of time we
allocate to them).
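
The fairness argument can be checked with a toy model. The sketch below
is not the code from sched.c: sk_task, sk_goodness and sk_run_epoch are
hypothetical names, the address space is reduced to an integer id, and
one "tick" simply consumes one unit of counter credit. It assumes, as
described above, that a task with no credit left gets no bonus and that
an epoch ends once every runnable task has exhausted its counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy task: mm_id stands in for the p->mm pointer. */
struct sk_task {
    int mm_id;    /* which address space the task uses */
    int counter;  /* remaining credit in this epoch */
    int ran;      /* ticks actually received */
};

/* Toy goodness: remaining credit, plus a bonus for staying in the
 * address space of the task that ran last. No credit, no bonus. */
int sk_goodness(const struct sk_task *p, const struct sk_task *prev,
                int mm_bonus)
{
    int weight = p->counter;

    if (weight > 0 && prev != NULL && p->mm_id == prev->mm_id)
        weight += mm_bonus;
    return weight;
}

/* Run until all credit is used up; return the number of times the
 * chosen task's address space differed from the previous one. */
int sk_run_epoch(struct sk_task *tasks, size_t n, int mm_bonus)
{
    struct sk_task *prev = NULL;
    int mm_switches = 0;

    for (;;) {
        struct sk_task *next = NULL;
        int best = 0;
        size_t i;

        for (i = 0; i < n; i++) {
            int w = sk_goodness(&tasks[i], prev, mm_bonus);
            if (w > best) {
                best = w;
                next = &tasks[i];
            }
        }
        if (next == NULL)
            break;              /* every counter is exhausted */
        if (prev != NULL && next->mm_id != prev->mm_id)
            mm_switches++;
        next->counter--;        /* one tick of credit consumed */
        next->ran++;
        prev = next;
    }
    return mm_switches;
}
```

Running the same task set once with mm_bonus set to 0 and once with a
positive bonus leaves each task's ran total equal to its starting
counter in both cases. Only the order changes, and with it the number of
address-space switches.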


Stephen Tweedie
Department of Computer Science, Edinburgh University, Scotland.