Re: [PATCH 1/2] OLPC: Add support for calling into Open Firmware

From: Mitch Bradley
Date: Sun Apr 20 2008 - 14:52:57 EST




Andres Salomon wrote:
> On Sun, 20 Apr 2008 08:07:55 -0400
> "H. Peter Anvin" <hpa@xxxxxxxxx> wrote:
>
>> Yinghai Lu wrote:
>>> On Sat, Apr 19, 2008 at 10:39 AM, Andres Salomon <dilinger@xxxxxxxxxx> wrote:
>>>> This adds 32-bit support for calling into OFW from the kernel. It's useful
>>>> for querying the firmware for misc hardware information, fetching the device
>>>> tree, etc.
>>>>
>>>> There's potentially no reason why other platforms couldn't use this, but
>>>> currently OLPC is the main user of it.
>>>>
>>>> This work was originally done by Mitch Bradley.
>>
>> Hm. This interface seems more than a bit ad hoc. In particular, I *really* don't like the swapper_pg_dir hack.
>>
>> "There must be a better way."
>>
>> -hpa
>
> I'm certainly open to suggestions.. Otherwise, I'll poke around and
> see if I can come up w/ something.

The x86 architecture doesn't make this problem easy.

The conventional solution is to have the BIOS operate in real mode. When the kernel calls into the BIOS, it has to do a grotesque dance that involves jumping through a chain of several segments of different flavors, thus gradually shutting down the multi-tiered address translation mechanism. Then, if the BIOS is actually operating in protected mode (which is necessary if it is larger than 64K, as all modern BIOSes are), it has to perform the inverse process, do the requested work, then go back into real mode to return to the kernel. The net result is that a "call" into the BIOS involves:

a) Copying the arguments to a real-mode register shadow array
b) Saving all the registers - general ones and a few special ones too
c) Far call to a linear-mapped code segment with an execution address in the first 1M of memory
d) Switching to a different stack
e) Turning off page translation
f) Switching from protected mode to real mode (or in some cases, V86 mode instead, which requires an additional Task State Segment dance to set the IO permission mask)
g) Switching to a real-mode interrupt descriptor table
h) Executing an INT instruction
i) Performing the inverse of a-g inside the BIOS
j) Doing the requested work
k) Performing a-g again to get back into real mode
l) Executing an "iret" instruction
m) Performing the inverse of a-g to return to normal operation

The machinery that you need to do all that is predictably complex - extra segment descriptors that are set up just-so, several little code fragments that must be at special addresses in the first meg, additional stacks, a real-mode interrupt table at a fixed address, and several data save arrays. That machinery has to be in assembly language, spanning several different instruction set modes.

Compared to that, I think that sharing one or two page directory entries at the very top of the virtual address space is pretty clean and simple. With that sharing, the BIOS call is just an ordinary subroutine call. (The setup code copies the entire page directory, but only a couple of entries are actually needed. The whole thing is copied because determining the exact number of entries needed is rather more work than copying everything and then letting Linux replace the ones it uses.)
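
As a rough illustration of the sharing described above, here is a minimal sketch that copies only the top page directory entries instead of the whole directory. It assumes the classic 32-bit non-PAE layout (1024 entries, each covering 4 MiB) and works on plain arrays in user space, not on a live page directory. The names pde_t, pde_index, share_top_entries and fw_base are made up for this example and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PDE_COUNT 1024u   /* entries in a 32-bit non-PAE page directory */
#define PDE_SHIFT 22      /* each entry maps 4 MiB of virtual space */

typedef uint32_t pde_t;   /* hypothetical name for one directory entry */

/* Index of the page directory entry that covers a virtual address. */
unsigned pde_index(uint32_t vaddr)
{
    return vaddr >> PDE_SHIFT;
}

/*
 * Copy the entries covering [fw_base, 4 GiB) from the firmware's page
 * directory into the kernel's, leaving every other kernel entry alone.
 * fw_base is an assumed parameter: the lowest virtual address the
 * firmware occupies. Returns the number of entries copied.
 */
unsigned share_top_entries(pde_t *kernel_pgd, const pde_t *fw_pgd,
                           uint32_t fw_base)
{
    assert(kernel_pgd != NULL && fw_pgd != NULL);

    unsigned first = pde_index(fw_base);
    unsigned count = PDE_COUNT - first;

    memcpy(&kernel_pgd[first], &fw_pgd[first], count * sizeof(pde_t));
    return count;
}
```

With a hypothetical firmware base of 0xff800000, share_top_entries() copies just the last two entries (indices 1022 and 1023). Once those mappings are present in the kernel's directory, the firmware entry point is reachable with an ordinary call.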

Every other solution that I know of requires some sort of heroic dance, either from the OS or from the BIOS or (usually) both.
