Re: Kernel support for peer-to-peer protection models...

From: Ivan Godard
Date: Sun Mar 28 2004 - 15:52:05 EST



----- Original Message -----
From: "Andi Kleen" <ak@xxxxxxx>
To: "Ivan Godard" <igodard@xxxxxxxxxxx>
Cc: <linux-kernel@xxxxxxxxxxxxxxx>
Sent: Friday, March 26, 2004 10:29 PM
Subject: Re: Kernel support for peer-to-peer protection models...


> "Ivan Godard" <igodard@xxxxxxxxxxx> writes:
>
> > We're a processor startup with a new architecture that we will be
> > porting Linux to. The bulk of the port will be straightforward (well,
> > you know what I mean), except for the protection model supported by the
> > hardware. How would you extend/mod the kernel if you had hardware that:
> >
> > 1) had a large number of distinguishable address spaces
>
> Large or unlimited? If not unlimited you may still run into
> problems when you give each process such an address space.
> Limiting the number of processes is probably not an option.

Large but not unlimited - tens of thousands. Think PIDs. The number of
threads (sharing common address spaces) is not limited.

> > 2) any running code had two of these (code and data environment) it
> > could use arbitrarily, but access to addresses in others was arbitrarily
> > protected
> > 3) flat, unified virtual addresses (64 bit) so that pointers, including
> > inter-space pointers, have the same representation in all spaces
>
> You mean Address 0 is only accessible from Address space 0, but not
> from Space 1 ?

A field in a (64 bit) pointer gives the address space number. Each space has
an "address 0" which is actually the zeroth byte *in that space* - an actual
pointer is a space number and an index in that space. You (i.e. a process)
run with a native space that determines your privileges. If you are running
as space 0 then you may or may not be able to address space 1 - it depends
on the access you inherited or have obtained. Think mmap. You (almost)
always can address all your native space, although there might not be
anything there.
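
As a rough illustration of the pointer layout described above, here is a
minimal sketch in C. The field widths (16 bits of space number in the top
bits, 48 bits of index) and the names xptr_t, xptr_make, xptr_space and
xptr_offset are assumptions made up for this example; the message only
says that a field in the 64-bit pointer carries the space number.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: 16-bit space field in the high bits, 48-bit index below it.
 * The real field position and width are not stated in the message. */
#define SPACE_BITS  16
#define OFFSET_BITS (64 - SPACE_BITS)
#define OFFSET_MASK ((UINT64_C(1) << OFFSET_BITS) - 1)

typedef uint64_t xptr_t;  /* hypothetical name for a space-qualified pointer */

/* Build a pointer from a space number and an index within that space. */
static inline xptr_t xptr_make(uint16_t space, uint64_t offset)
{
    assert(offset <= OFFSET_MASK);
    return ((xptr_t)space << OFFSET_BITS) | offset;
}

/* Recover the two fields from a pointer. */
static inline uint16_t xptr_space(xptr_t p)  { return (uint16_t)(p >> OFFSET_BITS); }
static inline uint64_t xptr_offset(xptr_t p) { return p & OFFSET_MASK; }
```

Under these assumptions, "address 0" of space 0 and "address 0" of space 1
are distinct 64-bit values, yet either one has the same representation no
matter which space the running code is native to.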

> Maybe you can give each process a different address range, but AFAIK
> the only people who have done this before are users of non MMU
> architectures.
> It will probably require some changes in the portable part of the code.
> Also porting glibc's ld.so to this will be likely no-fun.

Each process gets a different range because each process gets a different
native space. Within that space processes can use the same offsets, and
typically will so as to avoid pointless relocation.

> > 4) no "supervisor mode"
> > 5) inter-space references require grant of access (transitive) by the
> > accessed space; grants can be entire space or any contiguous subspace
>
> Sounds like the only sane way to handle (4) would be to give the kernel
> an own address space with the necessary grants to access everything.
> However this will require an address space switch for every system call.
> But there is no way around it, linux requires a "shared" kernel mapping
> at least for part of the kernel memory ("lowmem")

We could (trivially) emulate a monolithic kernel in a single space. But that
loses the reliability improvement available if the kernel subsystems ran in
their own spaces with grants of access to those common structures they
individually needed. BTW, there's nothing to be gained by minimizing address
space switches - they are done in hardware, and inter-space references and
calls run at the same speed as same-space references and calls.
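
To make the grant rule in (5) concrete, here is a small software model in
C of "a grant covers an entire space or a contiguous subspace". It is only
a sketch: the hardware performs this check itself, the struct grant layout
and the may_access name are invented for illustration, and the transitive
passing-on of grants is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical grant record: `owner` lets `grantee` touch the bytes
 * [base, base + len) of the owner's space. A whole-space grant is simply
 * base = 0 with len = the size of the space. */
struct grant {
    uint16_t owner;
    uint16_t grantee;
    uint64_t base;
    uint64_t len;
};

/* Return true if code native to space `who` may access `n` bytes starting
 * at `offset` in space `target`, given a table of grants. */
static bool may_access(const struct grant *tab, size_t ngrants,
                       uint16_t who, uint16_t target,
                       uint64_t offset, uint64_t n)
{
    if (who == target)
        return true;  /* simplification: native space is always addressable */

    for (size_t i = 0; i < ngrants; i++) {
        const struct grant *g = &tab[i];
        if (g->owner != target || g->grantee != who)
            continue;
        /* Containment test written so that it cannot overflow. */
        if (offset >= g->base && n <= g->len &&
            offset - g->base <= g->len - n)
            return true;
    }
    return false;
}
```

In this model a kernel subsystem in its own space would hold one narrow
grant per shared structure it needs, rather than one grant covering
everything.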

> Overall it sounds like your architecture is not very well suited to
> run Linux.

We believe we can adopt the Linux protection model (i.e. the 386 protection
model) with no more work than any other port to a new architecture (ahem).
But the result would also be as prone to bugs and exploits as a 386.

> > 10) Drivers can have their own individual space(s) distinct from those
> > of the kernel and the apps. Buggy drivers cannot crash the kernel.
>
> At least you would need to use your own drivers (I believe the IBM
> iSeries and s390/VM port does it kind of). If your CPU has generic PCI
> slots this will be a lot of work. Without it it will be lots of work too,
> but at least the number of drivers required is limited.

So long as a driver 1) has a region of working data space defined at driver
load time; 2) has a defined code region; 3) gets its buffer addresses etc. as
arguments; 4) calls defined OS APIs; and 5) never touches anything except
its private code and data, its arguments, and syscalls, then it can run in
one of our protected environments and be none the wiser. That is, if the
driver has been coded to look like a well behaved server process then all is
well. If it has hard references to shared kernel data structures then it
will break, because those shared spaces are not visible and must be accessed
through a service call to someone who owns that structure.
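
The five conditions above might look like this for a toy driver, sketched
in C. Every name here (drv_state, os_api, account_io, drv_read and the
stub) is hypothetical and chosen for the example; none is an existing
Linux interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Private working data, sized when the driver is loaded (condition 1). */
struct drv_state {
    uint8_t  scratch[64];
    uint64_t bytes_copied;
};

/* Defined OS entry points handed to the driver (condition 4). Shared
 * structures are reached only through calls like this, never by pointer. */
struct os_api {
    void (*account_io)(uint64_t nbytes);
};

/* Stand-in for the service that owns a shared accounting structure. */
static uint64_t accounted;
static void stub_account_io(uint64_t nbytes) { accounted += nbytes; }

/* Buffers arrive as arguments (condition 3). The driver touches only its
 * own state, its arguments, and the API table (condition 5). */
static size_t drv_read(struct drv_state *st, const struct os_api *os,
                       uint8_t *dst, size_t len)
{
    size_t n = len < sizeof st->scratch ? len : sizeof st->scratch;
    memcpy(dst, st->scratch, n);
    st->bytes_copied += n;
    os->account_io(n);  /* service call instead of a hard reference */
    return n;
}
```

A driver that instead updated the accounting counter directly would be the
"hard reference" case: in a separate space that store would fault instead
of silently corrupting someone else's data.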

> > Is this model so alien to the existing Kernel that the best approach is
> > to
>
> It is definitely alien.

You don't know the half of it! This is just the protection model :-)

Ivan

