Re: Multi Kernel SMP(WAS: Some of your Ideas)

From: Zachary Amsden (zamsden@cthulhu.engr.sgi.com)
Date: Mon Feb 14 2000 - 18:33:32 EST


 
> I am thinking that there may be another protocol that would fit the need
> and could be implemented vitually (tcp/udp) to fill the communication
> need between the individual kernels. But maybe this is the cart before
> the horse sort of thing.
>
> Where do we have to start to get this up and running on an intel
> platform. That is probably most common (although I would bet its the least
> common smp implementation.
>
> The first challenge I see is to get the "intial" kernel to do the system
> analysis and load its sibling.
>
> Thoughts?

If you had a memory architecture where each sibling has local memory, with
it's other siblings local mappable into it's address space, it could probe by
doing memory detection for each possible sibling.

Initial kernel boot could copy it's image to local memory at each sibling,
fill in a "start up data area", and use whatever hardware mechanism to call
the main entry point for the initial processor at each sibling. The start up
data area would have any info needed to boot, i.e. maybe a node number and
hardware list. After this, the "initial" kernel would set up it's own start
up data area, and call it's main kernel entry point.

So this would really be bootstrapping code, and kernel entry and setup
wouldn't really be affected. This would be pretty clean, but I'm assuming a
specialized architecture that helps you quite a bit here.

To do it on generic SMP, you could partition system memory into N segments,
copy the kernel image to each segment, and call the particular entry point on
each sibling. (Each sibling would have a different physical memory entry
point and would need the ability to boot from any specified physical memory
address).

If you can only boot from a fixed physical address, you need to play more
games. You have all siblings but the master re-map their portion of physical
memory to "local" (a fixed address), wait for the masters command (when it
finishes probing hardware are dividing memory). Then they can all jump to
their local entry point, and they all have a uniform memory structure.

Of course with generic SMP, you have a pretty major problem trying to share a
peripheral bus between isolated kernels, unless you divy up hardware between
siblings. When you did this, you put more intelligence behind interrupt
routing. However, each sibling would need it's own disk/network, unless your
sibling communication protocol provides sharing of those resources.

You wouldn't need a communication protocol as heavy as TCP, I would expect
data loss/corruption to be dealt with in hardware for whatever communication
method you are using (likely shared memory).

  VM layout of physical memory mapping for each sibling (this changes a bit
depending on architecture).

|----------------------|
| Sibling i+3 (mod 4) |
| physical memory |
|----------------------|
| Sibling i+2 (mod 4) |
| physical memory |
|----------------------|
| Sibling i+1 (mod 4) |
| physical memory |
|----------------------|
| Local (Sibling i) |
| physical memory |
|----------------------|

The only part of the kernel that needs to be aware of the different sibling
physical mappings is the communication layer itself, so that wouldn't pose to
great of a problem to the kernel.

Some of the things you might want to do with the communication protocol:

cross-sibling UNIX domain sockets/shm
FS sharing
Hardware arbitration/sharing
borrow/return free pages

I think it is feasible to do something like this with the current Linux base.
The bootstrapping/VM initialization will need some major overhaul, but I don't
think the main kernel startup will be affected too drastically.

Just some food for thought.

-- 
Zachary Amsden  zamsden@engr.sgi.com  (650) 933-6919  09U-510  Core Protocols

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Feb 15 2000 - 21:00:28 EST