[ANNOUNCE] Adeos nanokernel for Linux kernel

From: Karim Yaghmour (karim@opersys.com)
Date: Mon Jun 03 2002 - 03:35:04 EST


We have released the initial implementation of the Adeos nanokernel.
The following is a complete description of its background, its
implementation, its API, and its potential uses. Please also see the
press release (http://www.freesoftware.fsf.org/adeos/pr-2002-06-03.en.txt)
and the project's workspace (http://freesoftware.fsf.org/projects/adeos/).
The Adeos code is distributed under the GNU GPL.

The Adeos nanokernel is based on research and publications made in the
early '90s on the subject of nanokernels. Our basic method was to
reverse the approach described in most of the papers on the subject.
Instead of first building the nanokernel and then building the client
OSes, we started from a live and known-to-be-functional OS, Linux, and
inserted a nanokernel beneath it. Starting from Adeos, other client
OSes can now be put side-by-side with the Linux kernel.

To this end, Adeos enables multiple domains to exist simultaneously on
the same hardware. None of these domains see each other, but all of
them see Adeos. A domain is most probably a complete OS, but there is
no assumption being made regarding the sophistication of what's in
a domain.

To share the hardware among the different OSes, Adeos implements an
interrupt pipeline (ipipe). Every OS domain has an entry in the ipipe.
Each interrupt that comes in the ipipe is passed on to every domain
in the ipipe. Instead of disabling/enabling interrupts, each domain
in the pipeline only needs to stall/unstall his pipeline stage. If
an ipipe stage is stalled, then the interrupts do not progress in the
ipipe until that stage has been unstalled. Each stage of the ipipe
can, of course, decide to do a number of things with an interrupt.
Among other things, it can decide that it's the last recipient of the
interrupt. In that case, the ipipe does not propagate the interrupt
to the rest of the domains in the ipipe.

Regardless of the operations being done in the ipipe, the Adeos code
does __not__ play with the interrupt masks. The only case where the
hardware masks are altered is during the addition/removal of a domain
from the ipipe. This also means that no OS is allowed to use the real
hardware cli/sti. But this is OK, since the stall/unstall calls
achieve the same functionality.

Our approach is based on the following papers (links to these
papers are provided at the bottom of this message):
[1] D. Probert, J. Bruno, and M. Karzaorman. "Space: a new approach to
operating system abstraction." In: International Workshop on Object
Orientation in Operating Systems, pages 133-137, October 1991.
[2] D. Probert, J. Bruno. "Building fundamentally extensible application-
specific operating systems in Space", March 1995.
[3] D. Cheriton, K. Duda. "A caching model of operating system kernel
functionality". In: Proc. Symp. on Operating Systems Design and
Implementation, pages 179-194, Monterey CA (USA), 1994.
[4] D. Engler, M. Kaashoek, and J. O'Toole Jr. "Exokernel: an operating
system architecture for application-specific resource management",
December 1995.

If you don't want to go fetch the complete papers, here's a summary.
The first 2 discuss the Space nanokernel, the 3rd discussed the cache
nanokernel, and the last discusses exokernel.

The complete Adeos approach has been thoroughly documented in a whitepaper
published more than a year ago entitled "Adaptive Domain Environment
for Operating Systems" and available here: http://www.opersys.com/adeos
The current implementation is slightly different. Mainly, we do not
implement the functionality to move Linux out of ring 0. Although of
interest, this approach is not very portable.

Instead, our patch taps right into Linux's main source of control
over the hardware, the interrupt dispatching code, and inserts an
interrupt pipeline which can then serve all the nanokernel's clients,
including Linux.

This is not a novelty in itself. Other OSes have been modified in such
a way for a wide range of purposes. One of the most interesting
examples is described by Stodolsky, Chen, and Bershad in a paper
entitled "Fast Interrupt Priority Management in Operating System
Kernels" published in 1993 as part of the Usenix Microkernels and
Other Kernel Architectures Symposium. In that case, cli/sti were
replaced by virtual cli/sti which did not modify the real interrupt
mask in any way. Instead, interrupts were defered and delivered to
the OS upon a call to the virtualized sti.

Mainly, this resulted in increased performance for the OS. Although
we haven't done any measurements on Linux's interrupt handling
performance with Adeos, our nanokernel includes by definition the
code implementing the technique described in the abovementioned
Stodolsky paper, which we use to redirect the hardware interrupt flow
to the pipeline.

In terms of implementation, the Adeos' code is rather short since we
focused on setting the foundations for sharing interrupts between
domains. Here are the files we added:
kernel/adeos.c: Architecture-independent domain code.
arch/i386/kernel/adeos.c: Architecture-dependent domain code.
arch/i386/kernel/ipipe.c: Interrupt pipeline code.
include/asm-i386/adeos.h: Arch-dependent Adeos header.
include/linux/adeos.h: Main (arch-independent) Adeos header.

As you can see, only the i386 is currently supported. Nonetheless,
most of the architecture-dependent code is easily portable to other
architectures.

We also modified some files to tap into Linux interrupt dispatching
(all the modifications are encapsulated in #ifdef CONFIG_ADEOS/#endif):
        kernel/ksyms.c
        arch/i386/kernel/apic.c
        arch/i386/kernel/entry.S
        arch/i386/kernel/i386_ksyms.c
        arch/i386/kernel/i8259.c
        arch/i386/kernel/irq.c
        arch/i386/kernel/smp.c
        arch/i386/kernel/time.c
        arch/i386/kernel/traps.c
        arch/i386/mm/fault.c
        include/asm-i386/keyboard.h
        include/asm-i386/system.h

We modified the idle task so it gives control back to Adeos in order for
the ipipe to continue propagation:
arch/i386/kernel/process.c

We modified init/main.c to initialize Adeos very early in the startup.

Of course, we also added the appropriate makefile modifications and
config options so that you can choose to enable/disable Adeos as
part of the kernel build configuration.

Here is Adeos' public API:

    int adeos_register_domain(adomain_t *adp, adattr_t *attr);
    Register a new domain using the properties defined in the attribute.

    void adeos_unregister_domain(adomain_t *adp);
    Remove "adp" domain.

    void adeos_renice_domain(adomain_t *adp, int newpri);
    Change "adp"'s priority in the ipipe.

    void adeos_suspend_domain(void);
    This domain is done dealing with the current interrupt. This signals
    the ipipe to provide the interrupt to the next ipipe stage.

    void adeos_virtualize_irq(unsigned irq, void (*handler)(void),
    int (*acknowledge)(unsigned));
    Provide a handler and an acknowledgment function for "irq".

    void adeos_control_irq(unsigned irq, unsigned clrmask, unsigned
        setmask);
    Change the current domain's handling mask for irq "irq". "clrmask"
    is applied first and then "setmask" is applied.

    void adeos_stall_ipipe(void);
    Stall the current domain's ipipe stage. Alternative to cli.

    void adeos_unstall_ipipe(void);
    Unstall the current domain's ipipe stage. Alternative to sti.

    void adeos_restore_ipipe(unsigned x);
    Restore the ipipe from its saved state. Alternative to
    __restore_flags() and local_irq_restore(). This is used with the
    following defines:
    #define adeos_test_ipipe() \
        test_bit(IPIPE_STALL_FLAG,&adp_current->status)
    #define adeos_test_and_stall_ipipe() \
        test_and_set_bit(IPIPE_STALL_FLAG,&adp_current->status)
    Which replace __save_flags and local_irq_save(), respectively.

In Linux's case, adeos_register_domain() is called very early during
system startup. set_intr_gate() in arch/i386/kernel/traps.c is then
modified to call on adeos_virtualize_irq() so that Linux would tell
the ipipe that it needs the irq passed to set_intr_gate().

To add your domain to the ipipe, you need to:
1) Register your domain with Adeos using adeos_register_domain()
2) Call adeos_virtualize_irq() for all the IRQs you wish to be
notified about in the ipipe.

That's it. Provided you gave Adeos appropriate handlers in step
#2, your interrupts will be delivered via the ipipe.

During runtime, you may change your position in the ipipe using
adeos_renice_domain(). You may also stall/unstall the pipeline
and change the ipipe's handling of the interrupts according to your
needs.

We currently don't support SMP, but we do have APIC support on UP.

Here are some of the possible uses for Adeos (this list is far
from complete):
1) Much like User-Mode Linux, it should now be possible to have 2
Linux kernels living side-by-side on the same hardware. In contrast
to UML, this would not be 2 kernels one ontop of the other, but
really side-by-side. Since Linux can be told at boot time to use
only one portion of the available RAM, on a 128MB machine this
would mean that the first could be made to use the 0-64MB space and
the second would use the 64-128MB space. We realize that many
modifications are required. Among other things, one of the 2 kernels
will not need to conduct hardware initialization. Nevertheless, this
possibility should be studied closer.

2) It follows from #1 that adding other kernels beside Linux should
be feasible. BSD is a prime candidate, but it would also be nice to
see what virtualizers such as VMWare and Plex86 could do with Adeos.
Proprietary operating systems could potentially also be accomodated.

3) All the previous work that has been done on nanokernels should now
be easily ported to Linux. Mainly, we would be very interested to
hear about extensions to Adeos. Primarily, we have no mechanisms
currently enabling multiple domains to share information. The papers
mentioned earlier provide such mechanisms, but we'd like to see
actual practical examples.

4) By incorporating Adeos into the main kernel tree (I know my inbox
is probably going to fill up because of this one), kernel debuggers'
main problem (tapping into the kernel's interrupts) is solved and it
should then be possible to provide patchless kernel debuggers. They
would then become loadable kernel modules.

5) Drivers who require absolute priority and dislike other kernel
portions who use cli/sti can now create a domain of their own
and place themselves before Linux in the ipipe. This provides a
mechanism for the implementation of systems that can provide guaranteed
realtime response.

Of course, we are interested in hearing about comments and suggestions
you have about Adeos.

Best regards,

Philippe Gerum
Karim Yaghmour

----------------------------------------------------------------------
Links to papers:
1-
http://citeseer.nj.nec.com/probert91space.html
ftp://ftp.cs.ucsb.edu/pub/papers/space/iwooos91.ps.gz (not working)
http://www4.informatik.uni-erlangen.de/~tsthiel/Papers/Space-iwooos91.ps.gz

2-
http://www.cs.ucsb.edu/research/trcs/abstracts/1995-06.shtml
http://www4.informatik.uni-erlangen.de/~tsthiel/Papers/Space-trcs95-06.ps.gz

3-
http://citeseer.nj.nec.com/kenneth94caching.html
http://guir.cs.berkeley.edu/projects/osprelims/papers/cachmodel-OSkernel.ps.gz

4-
http://citeseer.nj.nec.com/engler95exokernel.html
ftp://ftp.cag.lcs.mit.edu/multiscale/exokernel.ps.Z
----------------------------------------------------------------------


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Jun 07 2002 - 22:00:14 EST