Re: kmod: add a sanity check on module loading

From: Jessica Yu
Date: Tue Jan 03 2017 - 21:54:53 EST


+++ Luis R. Rodriguez [08/12/16 11:49 -0800]:
kmod has an optimization in place whereby if a some kernel code
uses request_module() on a module already loaded we never bother
userspace as the module already is loaded. This is not true for
get_fs_type() though as it uses aliases.

Additionally kmod <= v19 was broken -- it returns 0 to modprobe calls,
assuming the kernel module is built-in, where really we have a race as
the module starts forming. kmod <= v19 has incorrect userspace heuristics,
a userspace kmod fix is available for it:

http://git.kernel.org/cgit/utils/kernel/kmod/kmod.git/commit/libkmod/libkmod-module.c?id=fd44a98ae2eb5eb32161088954ab21e58e19dfc4

This changes kmod to address both:

o Provides the alias optimization for get_fs_type() so modules already
loaded do not get re-requested.

o Provides a sanity test to verify modprobe's work

This is important given how any get_fs_type() users assert success
means we're ready to go, and tests with the new test_kmod stress driver
reveal that request_module() and get_fs_type() might fail for a few
other reasons. You don't need old kmod to fail on request_module() or
get_fs_type(), with the right system setup, these calls *can* fail
today.

Although this does get us in the business of keeping alias maps in
kernel, the the work to support and maintain this is trivial.
Aditionally, since it may be important get_fs_type() should not fail on
certain systems, this tightens things up a bit more.

The TL;DR:

kmod <= v19 will return 0 on modprobe calls if you are built-in,
however its heuristics for checking if you are built-in were broken.

It assumed that having the directory /sys/module/module-name
but not having the file /sys/module/module-name/initstate
is sufficient to assume a module is built-in.

The kernel loads the inittstate attribute *after* it creates the
directory. This is an issue when modprobe returns 0 for kernel calls
which assumes a return of 0 on request_module() can give you the
right to assert the module is loaded and live.

We cannot trust returns of modprobe as 0 in the kernel, we need to
verify that modules are live if modprobe return 0 but only if modules
*are* modules. The kernel heuristic we use to determine if a module is
built-in is that if modprobe returns 0 we know we must be built-in or
a module, but if we are a module clearly we must have a lingering kmod
dangling on our linked list. If there is no modules there we are *somewhat*
certain the module must be built in.

This is not enough though... we cannot easily work around this since the
kernel can use aliases to userspace for modules calls. For instance
fs/namespace.c uses fs-modulename for filesystesms on get_fs_type(), so
these need to be taken into consideration as well.

Using kmod <= 19 will give you a NULL get_fs_type() return even though
the module was loaded... That is a corner case, there are other failures
for request_module() though -- the other failures are not easy to
reproduce though but fortunately we have a stress test driver to help
with that now. Use the following tests:

# tools/testing/selftests/kmod/kmod.sh -t 0008
# tools/testing/selftests/kmod/kmod.sh -t 0009

You can more easily see this error if you have kmod <= v19 installed.

You will need to install kmod <= v19, be sure to install its modprobe
into /sbin/ as by default the 'make install' target does not replace
your own.

This test helps cure test_kmod cases 0008 0009 so enable them.

Reported-by: Martin Wilck <martin.wilck@xxxxxxxx>
Reported-by: Randy Wright <rwright@xxxxxxx>
Signed-off-by: Luis R. Rodriguez <mcgrof@xxxxxxxxxx>

Back from travel today, apologies for the delay. Will be able to give
this a proper look this week.

Jessica