Re: kernel thread support - LWP's

Larry McVoy (lm@bitmover.com)
Thu, 15 Jul 1999 11:36:36 -0600


: At 12:31 AM 7/15/99 -0600, Larry McVoy wrote:
: >Second, for those rare cases where they actually do cost too much,
: >that's only on crappy operating systems. The last time I checked, Linux
: >process context switches were faster than Solaris LWP context switches.
: >So much for that argument.
:
: Since you've obviously talked to a lot of good people on this, I was
: wondering if you could talk about the only issue I haven't heard you bring
: up which is frequently brought up by the LWP/user-thread-scheduler folks.
: What about kernel run-queue length? It seems that I've heard the argument
: made that LWP's keep you from spending a long time in the kernel scheduler,
: which I could see might actually be a good thing.

I just ran some benchmarks to make sure that Linux does the right thing.
I was on Linux 2.2.9, on a 400MHz Celeron. The context switch time
doesn't vary over the range of 40..200 processes. What this means
is that once we have 40 active processes, adding another 40, or
another 160 for that matter, makes no difference. The actual numbers are
8 usecs for zero-sized processes (just context switching) and 54 usecs for
processes that also touch 16KB between each context switch.

So once again, on a decent OS, we just aren't seeing any data which
supports the idea that user level is measurably better. Yeah, I'll
bet those user level schedulers can context switch in 0.1 usecs or
something like that, but given that a cache miss is closer to 0.2 usecs,
the context switch time is just noise - it's the memory subsystem which
will dominate.
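
A rough way to get a feel for numbers of this kind is to bounce a single
byte between two processes over a pair of pipes and time the round trips.
The sketch below is only that: it is not the benchmark used for the figures
above, it uses just two processes rather than 40..200, and the name
pingpong_usecs() is made up for this illustration. The result includes pipe
read/write overhead, so treat it as an upper bound on switch cost.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Bounce one byte between a parent and a child over two pipes and return
 * the average round-trip time in microseconds, or a negative value on
 * error. When both processes share a CPU, each round trip forces the
 * kernel to switch to the child and back again.
 */
double pingpong_usecs(int rounds)
{
    int to_child[2], to_parent[2];
    char token = 'x';
    struct timeval start, end;
    pid_t pid;
    int i;

    assert(rounds > 0);
    if (pipe(to_child) < 0 || pipe(to_parent) < 0)
        return -1.0;

    pid = fork();
    if (pid < 0)
        return -1.0;
    if (pid == 0) {
        /* Child: echo every byte back until the parent closes its end. */
        close(to_child[1]);
        close(to_parent[0]);
        while (read(to_child[0], &token, 1) == 1)
            if (write(to_parent[1], &token, 1) != 1)
                break;
        _exit(0);
    }

    /* Parent: keep only the ends it uses. */
    close(to_child[0]);
    close(to_parent[1]);

    gettimeofday(&start, NULL);
    for (i = 0; i < rounds; i++) {
        if (write(to_child[1], &token, 1) != 1 ||
            read(to_parent[0], &token, 1) != 1)
            break;
    }
    gettimeofday(&end, NULL);

    close(to_child[1]);     /* EOF makes the child leave its loop */
    close(to_parent[0]);
    waitpid(pid, NULL, 0);

    if (i != rounds)
        return -1.0;
    return ((end.tv_sec - start.tv_sec) * 1e6 +
            (end.tv_usec - start.tv_usec)) / rounds;
}
```

Calling pingpong_usecs(100000) from a main() and printing the result gives
a per-round-trip figure. On a multiprocessor the two processes may land on
different CPUs, in which case the number says more about wakeups than about
context switches.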

I've heard all these arguments and I've also spent a lot of time thinking
about this because I sort of want to believe that people who I respect
at Sun weren't nuts. But in spite of that, there is no technical evidence
that they weren't - the whole threading thing was just overblown and
they added WAY too much code and WAY too many interfaces to do something
that Rob Pike and Linus each managed to do in one interface (rfork() and
clone(), respectively). If you ever go look at the Linux code for all this
stuff, it's wonderful. It's like this:

clone(... int flags ...)
{
        clone_vm(... flags ...);
        clone_signals(... flags ...);
        clone_pwd(... flags ...);
}

and then each of the resources is handled like this:

clone_vm()
{
        if (flags & CLONE_VM) {
                vm->refcount++;
                return (0);
        }
        /* otherwise create a new VM and set it up to be COW */
}

It's so damn clean, it's clearly the right way to go about doing this.
I just about fell out of my chair in admiration when I read that code -
say what you will about various parts of the Linux kernel, this part is
just beautiful.
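
Seen from user space, the same flag-driven design means one call covers
both the "thread" case and the "process" case. Here is a minimal sketch,
assuming Linux and glibc's clone() wrapper. The names counter, child_fn()
and counter_after_clone() are made up for this illustration, and the 64KB
stack size is arbitrary.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Written by the child, read by the parent. */
static volatile int counter;

static int child_fn(void *arg)
{
    (void)arg;
    counter = 42;
    return 0;
}

/*
 * Run child_fn() in a new task created with clone(2) and return the
 * value of counter that the parent sees afterwards, or -1 on error.
 */
int counter_after_clone(int flags)
{
    const size_t stack_size = 64 * 1024;
    char *stack;
    pid_t pid;

    /* waitpid() below needs a separate process, not a CLONE_THREAD task. */
    assert((flags & CLONE_THREAD) == 0);

    stack = malloc(stack_size);
    if (stack == NULL)
        return -1;
    counter = 0;

    /*
     * The stack grows down on most architectures, so pass its top.
     * SIGCHLD as the exit signal lets a plain waitpid() reap the child.
     */
    pid = clone(child_fn, stack + stack_size, flags | SIGCHLD, NULL);
    if (pid < 0) {
        free(stack);
        return -1;
    }
    if (waitpid(pid, NULL, 0) < 0) {
        free(stack);
        return -1;
    }
    free(stack);
    return counter;
}
```

With CLONE_VM the child shares the parent's address space, so
counter_after_clone(CLONE_VM) returns 42. With no sharing flags the child
gets its own copy-on-write VM, its write stays private, and
counter_after_clone(0) returns 0.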

And the icing on the cake is that Linux is tight and small enough that
doing things the right way actually works - you can have just one process
abstraction and it works. The other OS's are giving you two abstractions
because they have slow processes. The Rob Pike quote on this is great:
``If you think you need threads then your processes are too fat.''

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/