[PATCH v2 1/9] sysctl: add a new register_sysctl_init() interface

From: Luis Chamberlain
Date: Tue Nov 23 2021 - 15:24:46 EST


From: Xiaoming Ni <nixiaoming@xxxxxxxxxx>

The kernel/sysctl.c is a kitchen sink where everyone leaves
their dirty dishes, this makes it very difficult to maintain.

To help with this maintenance let's start by moving sysctls to
places where they actually belong. The proc sysctl maintainers
do not want to know what sysctl knobs you wish to add for your own
piece of code, we just care about the core logic.

Today though folks heavily rely on tables on kernel/sysctl.c so
they can easily just extend this table with their needed sysctls.
In order to help users move their sysctls out we need to provide a
helper which can be used during code initialization.

We special-case the initialization use of register_sysctl() since
it *is* safe to fail, given all that sysctls do is provide a dynamic
interface to query or modify at runtime an existing variable. So the
use case of register_sysctl() on init should *not* stop if the sysctls
don't end up getting registered. It would be counter productive to
stop boot if a simple sysctl registration failed.

Provide a helper for init then, and document the recommended init
levels to use for callers of this routine. We will later use this
in subsequent patches to start slimming down kernel/sysctl.c tables
and moving sysctl registration to the code which actually needs
these sysctls.

Signed-off-by: Xiaoming Ni <nixiaoming@xxxxxxxxxx>
Reviewed-by: Kees Cook <keescook@xxxxxxxxxxxx>
[mcgrof: major commit log and documentation rephrasing
also moved to fs/proc/proc_sysctl.c ]
Signed-off-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
---
fs/proc/proc_sysctl.c | 33 +++++++++++++++++++++++++++++++++
include/linux/sysctl.h | 3 +++
2 files changed, 36 insertions(+)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 5d66faecd4ef..b4950843d90a 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -16,6 +16,7 @@
#include <linux/module.h>
#include <linux/bpf-cgroup.h>
#include <linux/mount.h>
+#include <linux/kmemleak.h>
#include "internal.h"

static const struct dentry_operations proc_sys_dentry_operations;
@@ -1384,6 +1385,38 @@ struct ctl_table_header *register_sysctl(const char *path, struct ctl_table *tab
}
EXPORT_SYMBOL(register_sysctl);

+/**
+ * __register_sysctl_init() - register sysctl table to path
+ * @path: path name for sysctl base
+ * @table: This is the sysctl table that needs to be registered to the path
+ * @table_name: The name of sysctl table, only used for log printing when
+ * registration fails
+ *
+ * The sysctl interface is used by userspace to query or modify at runtime
+ * a predefined value set on a variable. These variables however have default
+ * values pre-set. Code which depends on these variables will always work even
+ * if register_sysctl() fails. If register_sysctl() fails you'd just loose the
+ * ability to query or modify the sysctls dynamically at run time. Chances of
+ * register_sysctl() failing on init are extremely low, and so for both reasons
+ * this function does not return any error as it is used by initialization code.
+ *
+ * Context: Can only be called after your respective sysctl base path has been
+ * registered. So for instance, most base directories are registered early on
+ * init before init levels are processed through proc_sys_init() and
+ * sysctl_init().
+ */
+void __init __register_sysctl_init(const char *path, struct ctl_table *table,
+ const char *table_name)
+{
+ struct ctl_table_header *hdr = register_sysctl(path, table);
+
+ if (unlikely(!hdr)) {
+ pr_err("failed when register_sysctl %s to %s\n", table_name, path);
+ return;
+ }
+ kmemleak_not_leak(hdr);
+}
+
static char *append_path(const char *path, char *pos, const char *name)
{
int namelen;
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 1fa2b69c6fc3..d3ab7969b6b5 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -199,6 +199,9 @@ struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
void unregister_sysctl_table(struct ctl_table_header * table);

extern int sysctl_init(void);
+extern void __register_sysctl_init(const char *path, struct ctl_table *table,
+ const char *table_name);
+#define register_sysctl_init(path, table) __register_sysctl_init(path, table, #table)
void do_sysctl_args(void);

extern int pwrsw_enabled;
--
2.33.0