Re: RT patch acceptance

From: Ingo Molnar
Date: Fri May 27 2005 - 02:22:26 EST



* Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:

> Presumably your RT tasks are going to want to do some kind of *real*
> work somewhere along the line - so how is that work provided
> guarantees?

there are several layers to this. The primary guarantee we can offer is
to execute userspace code within N usecs. Most code that needs hard
guarantees is quite simple and uses orthogonal mechanisms.
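(to illustrate what such a userspace hard-RT loop typically looks like -
this is just standard userspace RT practice, not part of the patch; the
priority value of 80 and the 1 msec period are arbitrary illustration
values:)

    #include <sched.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <time.h>

    #define PERIOD_NS 1000000	/* 1 msec cycle - illustration value */

    int main(void)
    {
            struct sched_param sp = { .sched_priority = 80 };
            struct timespec next;

            /* lock current and future pages: the RT path must
             * never take a page fault */
            if (mlockall(MCL_CURRENT | MCL_FUTURE) < 0) {
                    perror("mlockall");
                    return 1;
            }
            if (sched_setscheduler(0, SCHED_FIFO, &sp) < 0) {
                    perror("sched_setscheduler");
                    return 1;
            }

            clock_gettime(CLOCK_MONOTONIC, &next);
            for (;;) {
                    /* advance the absolute deadline by one period */
                    next.tv_nsec += PERIOD_NS;
                    if (next.tv_nsec >= 1000000000) {
                            next.tv_nsec -= 1000000000;
                            next.tv_sec++;
                    }
                    clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                                    &next, NULL);
                    /* the 'real work' goes here - the guarantee is
                     * that we get here within N usecs of the expiry */
            }
    }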

The secondary guarantee we will eventually offer is that as long as a
process uses orthogonal resources, other tasks will not delay it. This
is not quite true right now, but it is realistically achievable. If an
RT and a non-RT task share e.g. the same file descriptor, or if the RT
task does not use mlockall() to exclude the VM, then we cannot
guarantee latencies. There might be more subtle kinds of 'sharing', but
as long as the primary APIs are fundamentally O(1) [and they are], the
worst-case overhead will be deterministic. What controls latencies is
the worst-case latency of the kernel facility an RT task makes use of.
Under normal PREEMPT what controls latencies is the _system-wide_
worst-case latency - which is a very different thing.
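(to make the 'orthogonal resources' point concrete, here is a sketch of
the usual pre-faulting trick; BUF_SIZE and alloc_rt_buffer() are
made-up names, purely illustrative:)

    #include <stdlib.h>
    #include <string.h>

    #define BUF_SIZE (1 << 20)	/* 1 MB working set - made up */

    static char *alloc_rt_buffer(void)
    {
            char *buf = malloc(BUF_SIZE);

            if (!buf)
                    return NULL;
            /* touch every page now, during init, so that the
             * mlockall()-ed mapping is fully populated and the
             * time-critical loop never enters the VM */
            memset(buf, 0, BUF_SIZE);
            return buf;
    }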

but it's not like hard-RT tasks live in a vacuum: they already have to
be aware of the latencies they themselves cause, and they have to be
consciously aware of which kernel facilities they use. If you do hard-RT
you have to be very aware of every line of code your task may execute.

> So in that sense, if you do hard RT in the Linux kernel, it surely is
> always going to be some subset of operations, dependant on exact
> locking implementation, other tasks running and resource usage, right?

yes. The goal is that latencies will fundamentally depend on what
facilities (and sharing) the RT task makes use of - instead of depending
on what _other_ tasks do in the system.

> Tasklist lock might be a good example off the top of my head - so you
> may be able to send a signal to another process with deterministic
> latency, however that latency might look something like: x + nrproc*y

yes, signals are not O(1).

Fundamentally, the Linux kernel constantly moves towards separating
unrelated functionality, for scalability reasons. So the moment there is
some unexpected sharing, we try to get rid of it - not primarily due to
latencies, but due to performance. (And vice versa - which is one reason
why it's not hard to get latency patches into the kernel.) E.g. the
tasklist lock might be converted to RCU one day. The idea is that a
'perfectly scalable' Linux kernel also has perfect latencies - the two
goals meet.
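(to make the RCU point concrete - this is just the generic read-side
pattern such a conversion would use, with made-up my_task_list/my_entry
names, not the actual tasklist code:)

    #include <linux/list.h>
    #include <linux/rcupdate.h>

    struct my_entry {
            struct list_head	list;
            int			pid;
    };

    static LIST_HEAD(my_task_list);

    /* lockless read side: no global lock is taken, so a task
     * walking the list cannot be delayed by other readers at
     * all, and only minimally by writers */
    static int entry_exists(int pid)
    {
            struct my_entry *e;
            int found = 0;

            rcu_read_lock();
            list_for_each_entry_rcu(e, &my_task_list, list) {
                    if (e->pid == pid) {
                            found = 1;
                            break;
                    }
            }
            rcu_read_unlock();
            return found;
    }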

> It appears to me (uneducated bystander, remember) that a nanokernel
> running a small hard-rt kernel and Linux together might be "better"
> for people that want real realtime.

If your application model can tolerate a total separation of OSs then
that's certainly a viable way. If you want to do everything under one
instance of Linux, and want to separate out some well-controlled RT
functionality, then PREEMPT_RT is good for you.

Note that if you can tolerate a separation of OSs (i.e. no sharing, or
well-controlled sharing) then you can do that under PREEMPT_RT too, here
and today: e.g. run all the non-RT tasks in a UML or QEMU instance.
(The separation of UML needs more work, but it's fundamentally OK.) Or
you can use PREEMPT_RT as the nanokernel [although this is surely
overkill] and put all the RT functionality into a virtual machine. So
instead of a hard choice forced upon you, virtualization becomes an
option. Soft-RT applications can morph towards hard-RT conditions and
vice versa.

So whether it's good enough remains to be seen - maybe nanokernels will
win in the end. As long as PREEMPT_RT does not impose any undue design
burden on the stock kernel (and i believe it does not), it's a win-win
situation: latency improvements will drive scalability, scalability
improvements will drive latencies, and the code can easily be removed
if it becomes unused.

Ingo