Re: [PATCH 1/2] binfmt_elf: FatELF support in the binary loader.

From: Ryan C. Gordon
Date: Fri Oct 23 2009 - 20:20:36 EST



> I have made very similar patch but it's quite small and do not require
> deep hacks.

Wow, competing ideas! :)

Here are my notes on your idea. Ego compels me to prefer my approach, but
I strove to be objective here, as there is a tradeoff of benefits in each
of our approaches.

> It should works with "setarch" too to force selection of binary.

How does setarch work? Does it reorder the file before launching or copy
out one of the ELF records?

If reordering:
What does this do to binaries you can't write to? Regular users couldn't
rewrite /bin/ls before launching, for example.

If copying:
What does this do to programs that rely on the value of argv[0]? If
setarch mangles up argv[0] in its exec*() call to match the original
binary's path, what does this do to programs that rely on /proc/self/exe?
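
To make the argv[0] versus /proc/self/exe concern concrete, here is a
minimal, Linux-only sketch. It is not taken from either patch, and the
helper names self_exe_path() and report_identity() are made up for
illustration. It shows the two independent ways a program can ask "which
file am I?", which is where a copy-out launcher could make them disagree.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Copy the target of /proc/self/exe into buf and NUL-terminate it
 * (readlink() does not do that itself, and silently truncates long paths).
 * Returns the path length, or -1 on error.
 */
ssize_t self_exe_path(char *buf, size_t buflen)
{
    ssize_t n;

    if (buflen == 0)
        return -1;
    n = readlink("/proc/self/exe", buf, buflen - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return n;
}

/*
 * Print both identities side by side. argv0 is whatever string the
 * caller of exec*() chose to pass; the kernel does not verify it.
 */
void report_identity(const char *argv0)
{
    char exe[4096];

    printf("argv[0]        = %s\n", argv0);
    if (self_exe_path(exe, sizeof(exe)) < 0)
        printf("/proc/self/exe = (unavailable)\n");
    else
        printf("/proc/self/exe = %s\n", exe);
}
```

Calling report_identity(argv[0]) from main() would be one way to check
what a given launcher does: if the binary was copied out to a temporary
file and argv[0] was rewritten to the original path, the two lines would
be expected to differ.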


The most compelling feature of this approach is that a "truearch" binary
(is that the correct name?) could work with any existing Linux system, on
the condition that the architecture you want is the first one in the file.
If you put, say, x86 first in the file and you want to run it on an x86_64
system, you're either out of luck or going to be running the 32-bit
version. In this same scenario, if you put x86_64 first, it just won't run
at all on an unpatched x86 box. So, it's a cool trick, but it's not all
that beneficial. We have to assume that either approach requires kernel
patches to be truly useful. For unpatched boxes, FatELF provides a simple
command line app, fatelf-extract, which can be used to get the original
ELF binary you want out of the FatELF file, both for stripping unwanted
bits and as a measure of last resort if the kernel and dynamic loader
can't handle FatELF. I assume setarch works somewhat the same.

I'm concerned about using the padding bits in e_ident, too. A lot of
manpower went into the ELF specification and I felt it was presumptuous
for me to personally change the format. A container around them, like
FatELF, was a safer, more future-proof choice. I'd rather those that
control the ELF spec decide what those padding bits should be used for in
the future.

The truearch method requires the kernel to seek throughout the whole file
to decide if it can use it at all. FatELF uses the 128 bytes at the front
of the file, which binfmt_elf reads anyhow, and then seeks to the right
record from there, so disk bandwidth overhead is extremely small (one
extra read of 128 bytes if we can use the file, zero extra reads if not).
In truearch's favor, that approach allows for an unlimited number of ELF
binaries to reside in a single file below the four gigabyte mark (which is
really, for all intents and purposes, a LOT of binaries). On the other
hand, the FatELF limit of 255 records is probably way more than you could
ever hope to reasonably cram into a file, and if it's not, we can raise it
to 64k (we have reserved bits in the header still). FatELF can store ELF
binaries above 4 gigabytes, unlike truearch, but I'm not sure that's
really ever going to be valuable.
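
As a rough sketch of the "read a small table up front, then seek once"
idea, the lookup might look like the following. The struct below is a
HYPOTHETICAL, simplified record layout invented for this example; it is
not the real FatELF on-disk format, and fat_pick_record() is not a
function from the patch. It assumes the record table has already been
parsed out of the leading bytes of the file.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified record; NOT the actual FatELF layout. */
struct fat_record {
    uint16_t machine;   /* ELF e_machine value, e.g. 3 = EM_386 */
    uint8_t  word_size; /* 1 = ELFCLASS32, 2 = ELFCLASS64 */
    uint64_t offset;    /* where the embedded ELF starts in the file */
    uint64_t size;      /* length of the embedded ELF in bytes */
};

/*
 * Return the index of the first record matching the wanted machine and
 * word size whose byte range lies inside the file, or -1 if none does.
 * Only the in-memory table is examined; no further I/O happens here.
 */
int fat_pick_record(const struct fat_record *recs, size_t count,
                    uint16_t want_machine, uint8_t want_word_size,
                    uint64_t file_size)
{
    size_t i;

    for (i = 0; i < count; i++) {
        const struct fat_record *r = &recs[i];

        if (r->machine != want_machine || r->word_size != want_word_size)
            continue;
        /* Skip empty records and ranges that run past end of file;
         * written this way so offset + size cannot overflow. */
        if (r->size == 0 || r->offset > file_size ||
            r->size > file_size - r->offset)
            continue;
        return (int)i;
    }
    return -1;
}
```

With a table in hand, the loader would seek to recs[i].offset and hand
that region to the normal ELF path. The 64-bit offset and size fields in
this sketch are what would let a record sit above the four gigabyte mark.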

Both approaches have zero disk overhead if a normal ELF file is loaded,
which is good.


In terms of this patch itself, I'd be concerned about using gotos for the
retry_* blocks when a loop would be easy enough to incorporate. I saw you
have a test for personality() that I didn't do; I might have to check into
that, but the binfmt_elf_compat code is definitely catching x86 binaries
on x86_64 here, so I'm not sure it's necessary.
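
For the goto-versus-loop point, a generic sketch of the loop shape is
below. It is not code from either patch: try_load_fn,
load_first_usable() and the demo callback accept_only() are hypothetical
names, and the convention that a candidate reports -ENOEXEC when it is
simply the wrong architecture is an assumption made for the example.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical per-candidate loader: returns 0 on success, -ENOEXEC if
 * this candidate is not usable here, or another negative errno value
 * for a hard failure.
 */
typedef int (*try_load_fn)(size_t index, void *ctx);

/*
 * Try each candidate in order instead of jumping back to a retry label.
 * Returns the index that loaded, -ENOEXEC if none was usable, or the
 * first hard error encountered.
 */
int load_first_usable(size_t count, try_load_fn try_load, void *ctx)
{
    size_t i;

    for (i = 0; i < count; i++) {
        int rc = try_load(i, ctx);

        if (rc == 0)
            return (int)i;
        if (rc != -ENOEXEC)
            return rc; /* hard error: stop and report it */
    }
    return -ENOEXEC;
}

/* Demo callback: accepts only the index stored in *ctx. */
int accept_only(size_t index, void *ctx)
{
    return index == *(const size_t *)ctx ? 0 : -ENOEXEC;
}
```

The loop keeps the "try the next one" policy in a single place, and the
exit conditions (success, hard error, nothing usable) are all visible in
one function body.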

Anyhow, I hope this was useful commentary, and not seen as a battle of
egos. I'm glad to see other approaches, though, as it suggests there
really is a genuine desire for this sort of functionality!

--ryan.
