Re: 2.2.2: 2 thumbs up from lm

Richard Gooch (rgooch@atnf.csiro.au)
Sun, 28 Feb 1999 19:25:03 +1100


yodaiken@chelm.cs.nmt.edu writes:
> > I have no problem with the RT-Linux concept. I've said before I think
> > it's a fine idea. The reaction I've received from people is that it's
> > an incomplete programming environment (i.e. lack of semaphores). I've
>
> Just for the record. Jerry Epplin provided a fine semaphore package
> that has shipped with RTLinux for over a year.

Good. I've passed this on.

> > Where I do agree with the criticisms is that using RT-Linux is hard,
> > because it doesn't fit seamlessly into the Linux/Unix programming
> > environment. You can't take standard POSIX RT code and have the POSIX
> > RT threads be RT-Linux threads. You have to stuff around with loading
> > modules and separating your code into co-routines: separate
> > programmes. I'd like to see this changed.
>
> Me too.

Although I concede it may be very hard to do.

> > Ideally, a thread could ask to become an RT-Linux thread (using
> > sched_setscheduler()). I acknowledge this is hard to achieve, but it's
> > what would make using RT-Linux so much easier. This is important for
> > people who have a certain reluctance in the first place. [*]
>
> The technical problem here is that the thread may want to use libc
> functions that are incompatible with the RT side. For example, I
> can't see any way for a RT thread to safely "malloc".

I've had some private discussions with Larry (he seems to like the
idea), where I scribbled some ideas on how to solve these
problems. The simplest is to just drop RT priority when entering the
kernel. Slightly better is to drop RT priority when hopping onto a
wait queue or calling schedule(). So RT-Linux would provide functions
like:
rt_on_run_queue()
rt_take_off_run_queue()

You'd need to hack the standard wakeup calls to call
rt_put_on_run_queue() as well, but I think it's basically doable.

In essence: you're only on the RT run queue when in user space or
non-blocking in kernel space (i.e. getpid()). If you go into kernel
space, be prepared to squabble for resources like the masses.

That's probably the only sane approach to RT threads entering the
kernel, if you don't want to bloat the kernel with horrible code to
give better RT support. Note that my RT queue patch *doesn't* fall in
the bloat category. See below.

> What I think would be good would be a "run_as_rt_thread( f,period)"
> call that would send a piece of code into the RT world with some
> safeguards at link time to make sure "f" was ok.

That's exactly the kind of thing I'd like to avoid, if at all
possible. It's no longer POSIX, and it requires all kinds of trickery
to ensure the RT "thread" does safe things. Of course, it's better
than the current RT-Linux implementation (no offence), but I'd first
like to see us pursue the path I outlined above. I may be blowing
smoke, but let's first see if it's possible, eh?

> > Well, I've given some of my suggestions. Lets take that further. The
> > changes I proposed to the standard Linux scheduler do 2 things:
> >
> > - improve RT performance by isolating the run queues
>
> Richard. Can you try the following test:
> Run 1 program that does setcheduler and does this loop
> do 1 million times
> rdtsc
> usleep(50000);
> rtdsc and compute difference
> compute average and worst case
> done
> print result
>
> Then run program 2 while 1 is running
> while(1) write(1,buffer,10000000);
>
> Start netscape, run a tar cvf /dev/null /home or something
> o
>
> What does the sched patch do?

I don't need to run it to tell you what will happen. The latency for
the RT thread will be screwed. But, you see, that's *not* a problem I
was trying to solve with the RT queue patch. I took pains to point out
that the RT queue patch would give more deterministic context switch
latencies *under certain conditions* (namely, a friendly system load).
I think this point got lost amidst the flaming.

What the patch does is:

- remove cruft/special casing for RT tasks from the scheduler, making
it more robust and as fast or faster for the general case

- improve the isolation between RT and normal processes, which is good
in general (if you are stuck with a shared RT/luser system) but
*also* good for embedded applications where some threads are RT and
some are not (I gave an example of this: some real code currently
running under pSOS).

The first point is pretty important. Consider why Linus stated the
patch was done the way he "would have done it"? Have a close look at
that patch. Compare the scheduler code before and after. Now tell me
which version is simpler, neater and easier to follow? And less prone
to silly errors?

I didn't just hack up something to add a separate RT run queue. I
looked at the current scheduler and asked myself how it could be done
better. Sorry to harp on about this, but last year I tried again and
again to draw attention to the design and philosophy of the patch, but
never got responses on this. Anyway.

Of course, if we were to remove support for RT threads entirely, that
would *also* clean up the scheduler, which is why I made that radical
suggestion. I thought it would grab some attention ;-)

> > Of course, we have to support the POSIX RT scheduling classes, so just
> > removing them isn't really an option. But suppose instead my (perhaps
> > unrealistic) idea mentioned above (*) was implemented so that a thread
> > which asked for RT scheduling got *real* RT? Now that would be nice.
>
> I think that this is both possible and useful. I'd like to move RT
> out of Linux proper where possible.

OK, but at the moment we seem to have different ideas about what level
of RT support to give. One strong position I have is that we should
continue to support (somehow) the POSIX RT API. We simply can't remove
that, for comaptibility reasons. I also think it's a reasonable
interface, and I'd like to think we can support it in RT-Linux.

So, I see a few possibilities:

1) move POSIX RT support into RT-Linux along the lines of what I
proposed

2) make RT-Linux more accessible along the lines you suggest, and
retain the soft-RT support we currently have

3) neither because it's too hard

4) add POSIX RT support to RT-Linux like (1).

For either (2) or (3), I think it's worth applying my RT queue patch
because it cleans things up and provides some small gains for some
applications.

I would hope that (1) is possible (I haven't yet seen a reason why it
won't work, but I'm probably overlooking something). If not, then we
should go for (2).

If (1) is possible, then we have another choice to make. As I said to
Larry privately, we need to support SCHED_FIFO and SCHED_OTHER
somehow. If we move that to RT-Linux, then we have the problem of
providing support. Do we merge RT-Linux into the official kernel? I
think we're going to see more applications using RT priority (CD
writers, sound and video players, etc.) coming out of the
woodwork. Requiring that people install RT-Linux to run their soft-RT
applications is not a good option, I think.

The problem with merging RT-Linux into Linux is that it may slow down
the general case (a general point Larry is rightly concerned about).
If the Great Maker of Bits is shining on us, the savings made by
taking RT support out of the Linux scheduler will equal or exceed the
costs of adding RT-Linux to Linux. I'm not counting on this, though.

So we come to option (4). Let's define RT priorities from 101-199 as
hard-RT: you get scheduled by the RT-Linux kernel. The existing RT
priorities of 1-99 remain as they are: soft-RT. RT priority 100 is
conceptually the Linux kernel. When a hard-RT thread enters the
kernel, it bares its breast ready to yield itself for the common
good. Sometime later it will be resurrected to its former glory.

Again, for (4), I think it's a good idea to add my RT queue patch.

Regards,

Richard....

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/