Re: [RFC] observe and act upon workload parallelism: PERF_TYPE_PARALLELISM (Was: [RFC][PATCH] sched_wait_block: wait for blocked threads)

From: Linus Torvalds
Date: Mon Nov 16 2009 - 13:04:12 EST




On Mon, 16 Nov 2009, Ingo Molnar wrote:
>
> Regarding the API and your patch, i think we can and should do something
> different and more capable - while still keeping your basic idea:

Actually, I'd suggest exactly the reverse.

Yes, do something different, but _less_ capable, and much simpler:
introduce the notion of "grouped thread scheduling", where a _group_ of
threads gets scheduled as one thread.

Think of it like a classic user-level threading package, where one process
implements multiple threads entirely in user space, and switches between
them. Except we'd do the exact reverse: create multiple threads in the
kernel, but only run _one_ of them at a time. So as far as the scheduler
is concerned, it acts as just a single thread - except it's a single
thread that has multiple instances associated with it.

And every time the "currently active" thread in that group runs out of CPU
time - or any time it sleeps - we'd just go on to the next thread in the
group.
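
[Editor's sketch] The rotation described above can be modelled in a few
lines of userspace C. This is a toy model under stated assumptions, not
kernel code: `struct member`, `struct thread_group`, `group_switch()` and
`GROUP_MAX` are hypothetical names invented for illustration, and plain
round-robin order is assumed because the mail only says "the next thread
in the group".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define GROUP_MAX 8

/* Hypothetical model: one member of a thread group. */
struct member {
    bool runnable;              /* false while the member sleeps (e.g. on IO) */
};

/* Hypothetical model: the scheduler sees only the group, never a member. */
struct thread_group {
    struct member members[GROUP_MAX];
    size_t count;               /* number of members in use */
    size_t active;              /* index of the one member allowed to run */
};

/*
 * Called when the active member sleeps or uses up its timeslice.
 * Rotates round-robin to the next runnable member and returns its index.
 * The active member itself is considered last, so a lone runnable member
 * simply keeps running. Returns -1 if every member is asleep, in which
 * case the whole group looks like one sleeping thread.
 */
int group_switch(struct thread_group *g)
{
    assert(g->count > 0 && g->count <= GROUP_MAX);

    for (size_t step = 1; step <= g->count; step++) {
        size_t next = (g->active + step) % g->count;

        if (g->members[next].runnable) {
            g->active = next;
            return (int)next;
        }
    }
    return -1;
}
```

In this model the group never has more than one member running, which is
the whole point: however many threads are in the group, it consumes at
most one CPU's worth of scheduling.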

There are potentially lots of cases where you want to use multiple threads
not because you want multiple CPUs, but because you want to have "another
thread ready" for when one thread sleeps on IO. Or you may use threads as
a container - again, you may not need a lot of CPU, but you split your
single load up into multiple execution contexts just because you had some
independent things going on (think UI threads).

As far as I can tell, that is pretty much what Stijn Devriendt wanted: he
may have lots of threads, but he effectively really just wants "one CPU"
worth of processing.

It's also what we often want with AIO-like threads: we don't want CPU
parallelism. If the data is in caches, we'd like to run the IO thread
immediately, not switch CPUs at all, and actually do it all
synchronously. It's just that _if_ the AIO thread blocks, we'd like to
resume the original thread, which may have better things to do.

No "observe CPU parallelism" or anything fancy at all. Just a "don't
consider these X threads to be parallel" flag to clone (or a separate
system call).

Imagine doing async system calls with the equivalent of

 - create such an "affine" thread in kernel space
 - run the IO in that affine thread - if it runs to completion without
   blocking and within its timeslice, never schedule at all.

where these "affine" threads would be even more lightweight than regular
threads, because they don't even act as real scheduling entities, they are
grouped together with the original one.
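
[Editor's sketch] The async-call flow above, modelled in userspace C. This
is only an illustration under stated assumptions: `submit_affine()`,
`struct async_call`, `io_fn` and the two sample operations are
hypothetical names, and "blocking" is simulated by a return value rather
than by a real sleep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simulated outcome of an IO attempt. */
enum io_status { IO_DONE, IO_WOULD_BLOCK };

/* Hypothetical IO operation: stores its result and reports its outcome. */
typedef enum io_status (*io_fn)(void *arg, long *result);

struct async_call {
    bool completed_inline;      /* finished synchronously, no scheduling */
    bool pending;               /* blocked; the original thread resumes */
    long result;                /* valid only if completed_inline is set */
};

/*
 * Run the IO directly in the "affine" context. If it finishes without
 * blocking, the caller has the result at once and nothing was scheduled.
 * Otherwise the call is marked pending and control returns to the
 * original thread, which may have better things to do.
 */
void submit_affine(struct async_call *call, io_fn fn, void *arg)
{
    assert(call != NULL && fn != NULL);

    call->completed_inline = false;
    call->pending = false;
    call->result = 0;

    if (fn(arg, &call->result) == IO_DONE)
        call->completed_inline = true;
    else
        call->pending = true;
}

/* Sample operation: the data is already cached, so it never blocks. */
enum io_status cached_read(void *arg, long *result)
{
    *result = *(long *)arg;
    return IO_DONE;
}

/* Sample operation: the data is not cached, so it would have to sleep. */
enum io_status blocking_read(void *arg, long *result)
{
    (void)arg;
    (void)result;
    return IO_WOULD_BLOCK;
}
```

The cached case costs a function call and nothing more; only the blocking
case involves handing control back to the original thread.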

Linus