Re: Should we automatically generate a module signing key at all?

From: Mimi Zohar
Date: Tue May 19 2015 - 11:38:12 EST


On Tue, 2015-05-19 at 07:36 -0700, Andy Lutomirski wrote:
> On Tue, May 19, 2015 at 1:53 AM, David Howells <dhowells@xxxxxxxxxx> wrote:
> > Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> >
> >> I think we should get rid of the idea of automatically generated signing keys
> >> entirely. Instead I think we should generate, at build time, a list of all
> >> the module hashes and link that into vmlinux.
> >
> > Just in Fedora 21:
> >
> > warthog>rpm -ql kernel-modules | grep [.]ko | wc -l
> > 3604
> > warthog>rpm -ql kernel-modules-extra | grep [.]ko | wc -l
> > 480
> >
> > So that's >4000 modules, each signed with a SHA256 sum (32 bytes). That's
> > more than 125K of unswappable memory. And it's uncompressible as Dave pointed
> > out. And that doesn't include any metadata to match a module to a digest, but
> > rather assumes we just scan through the entire list comparing against each
> > SHA256 sum until we find one that matches.
>
> Let's go through the numbers. There are two main things that matter,
> I think: non-swappable memory and disk space. For simplicity and
> because it doesn't really matter, I'll ignore things like the
> filesystem block size.
>
> I'll assume that everyone uses a 256-bit hash. (This is charitable to
> the status quo, since hash size doesn't really matter for public-key
> signatures, and the default is SHA-1.) I'll further assume that there
> are 4096 modules or so.
>
> The current kernel uses 4096-bit RSA. The kernel text needed for
> verification seems to be around 21kB (9kB asymmetric_keys + 12kB MPI).
> The public key is tiny, and the signature is 512 bytes per module.
> (Actually, it's probably more because of PKCS garbage. I'll ignore
> that.) This is a total of ~21kB of non-swappable storage and 2MB of
> disk space for all the signatures.
>
> If the goal were to optimize for size, the kernel should probably use
> a much more compact signature scheme, probably some compressed EC
> signature. Ed25519 is 64 bytes per signature, which seems to be more
> or less optimal. That would reduce disk space used to 64 bytes per
> module or 256kB for 4k modules.
>
> With the hash-based scheme I outlined, the kernel text needed is
> nearly zero. The overhead in each .ko file is zero, and
> module_hashes.ko is 32 bytes per module or 128kB for 4k modules. It
> wins the disk space competition hands down. Naively, though, all of
> that space is non-swappable. Note that any sensible implementation
> would sort the hash list, making hash checks very fast.
>
> One improvement would be to unload module_hashes.ko when you're done
> with it. That's annoying. A different approach would be to use a
> hash tree. For a basic binary hash tree, the root (module_hashes.ko,
> for example) is a single signature, i.e. 32 bytes. (For simplicity,
> we'd store the number of hashes, too. That would add a couple of
> bytes.) Each module needs log2(number of modules) - 1 hashes stored.
> (There's no need for a module to store its own hash, and if the hashes
> are sorted before the hash tree is generated, then the edge directions
> are all implicit.) For 4k modules, that's 11 hashes or 352 bytes per
> module, for a total of 1408kB for 4k modules. The kernel text
> required is almost zero (while efficiently generating hash trees takes
> some thought, verifying them is a very simple loop over the hash
> function). This already beats the status quo in terms of both
> non-swappable memory and disk space. It still loses to Ed25519 or
> similar, though.
>
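
To make the hash-tree idea quoted above concrete, here is a minimal sketch of generating and checking a per-module proof. It is a generic binary Merkle tree, not necessarily the exact layout proposed: the names (build_levels, make_proof, verify) are hypothetical, the leaf count is assumed to be a power of two, and the leaf index is passed explicitly rather than derived. A proof here carries one sibling hash per tree level, so the count may differ by a small constant from the figure quoted above depending on how the tree is laid out.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest (32 bytes), matching the hash size assumed in the thread."""
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return every tree level, leaves first and root last.

    Assumption: len(leaves) is a power of two.
    """
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def make_proof(levels, index):
    """Collect the sibling hash at each level on the path from leaf to root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # sibling of the current node
        index //= 2                     # move up to the parent
    return proof


def verify(module_bytes, index, proof, root):
    """Recompute the root from the module contents and its sibling hashes."""
    node = h(module_bytes)
    for sibling in proof:
        if index & 1:
            node = h(sibling + node)    # current node is a right child
        else:
            node = h(node + sibling)    # current node is a left child
        index //= 2
    return node == root


# Build time: hash every module, sort the hashes, build the tree.
modules = [b"module-%d" % i for i in range(8)]  # stand-ins for .ko contents
leaves = sorted(h(m) for m in modules)
levels = build_levels(leaves)
root = levels[-1][0]                    # the only value the kernel must hold

# Load time: check one module against the root using its stored proof.
target = modules[3]
idx = leaves.index(h(target))
proof = make_proof(levels, idx)
ok = verify(target, idx, proof, root)
```

Only verify() would have to live in the kernel in such a scheme; building the tree and the proofs can stay in the build system or in userspace.
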
> As David Woodhouse pointed out, if kmod were changed, most of the
> overhead could go away. kmod could generate the proof at module load
> time. That reduces the total overhead to just the list of hashes.
>
> In summary, I think that the hash scheme does quite well for space
> efficiency, although the comparison is a bit unfair because the
> current code is unnecessarily inefficient.
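
For reference, the disk-space figures quoted above can be reproduced with a few lines of arithmetic. This is only a sketch using the numbers given in the thread (4096 modules, 32-byte hashes, 512-byte RSA-4096 signatures, 64-byte Ed25519 signatures, and log2(n) - 1 hashes per module for the hash tree); the variable names are my own, and PKCS overhead and filesystem block size are ignored, as in the original estimate.

```python
import math

MODULES = 4096           # module count assumed in the thread
HASH_BYTES = 32          # SHA-256 digest size
RSA4096_SIG_BYTES = 512  # raw RSA-4096 signature, ignoring PKCS wrapping
ED25519_SIG_BYTES = 64   # Ed25519 signature size

# Hashes stored per module in the hash-tree scheme, as stated in the thread.
tree_hashes_per_module = int(math.log2(MODULES)) - 1

sizes_kib = {
    "rsa4096 signatures": MODULES * RSA4096_SIG_BYTES // 1024,
    "ed25519 signatures": MODULES * ED25519_SIG_BYTES // 1024,
    "flat hash list": MODULES * HASH_BYTES // 1024,
    "hash tree proofs": MODULES * tree_hashes_per_module * HASH_BYTES // 1024,
}

for scheme, kib in sizes_kib.items():
    print(f"{scheme:20s} {kib:5d} KiB")
```
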

I'm not sure why you're bringing this up at such a late date. There
was a kernel summit discussion led by Rusty on kernel module
verification. The result of that discussion was to append the signature
to the kernel modules. At the same time, Kees Cook was told to define
an LSM kernel module hook. IMA is on that hook and can be used to
verify kernel module integrity based on either file hashes or
signatures. The choice is yours.

> >
> >> Then, if anyone actually wants to use a public key to verify modules, they can
> >> build the public key into a module as opposed to dragging all of the public
> >> key crud into the main kernel image.
> >
> > A chunk of the 'public key crud' has to be in the kernel for other reasons
> > (the integrity stuff, I think, which has to start before you load any modules)
> > and the public key stuff is used for other things too (such as kexec and may
> > well be used for firmware validation in future) - though that doesn't preclude
> > it being modularised, it does mean that you are likely to load it anyway in
> > future.
>
> What integrity stuff? IIRC dm-verity doesn't use asymmetric crypto at
> all. IMA probably does, though.

IMA can appraise file integrity based on either hashes or signatures.
The difference is that, in addition to file integrity, signatures
provide file provenance. Going forward we'd like to see software come
with the associated file signatures.

> For firmware validation, there's no good reason it couldn't work
> exactly like module signatures. Alternatively, firmware validation
> could still use loadable public key crypto. (Again, it could be
> unloaded after boot, which is currently impossible.)
>
> For kexec, I think that the main use is for crash dumps, in which case
> the hash of the crash kernel could be built in. Alternatively, if the
> crash kernel is identical to the original kernel, it would be
> reasonably straightforward to arrange for the kernel to accept itself
> as a valid kexec image.

Kexec is also used to load a different kernel image.

Mimi
