Re: [PATCH] sysctl: Add a group of macro functions to initcall the sysctl table of each feature

From: Xiaoming Ni
Date: Mon Dec 06 2021 - 22:10:05 EST


On 2021/12/7 9:38, Andrew Morton wrote:
On Tue, 7 Dec 2021 09:13:20 +0800 Xiaoming Ni <nixiaoming@xxxxxxxxxx> wrote:

To avoid duplicated code, add a set of macro functions to initialize the
sysctl table for each feature.

The system initialization process is as follows:

start_kernel () {
...
/* init proc and sysctl base,
* proc_root_init()-->proc_sys_init()-->sysctl_init_bases()
*/
proc_root_init(); /* init proc and sysctl base */
...
arch_call_rest_init();
}

arch_call_rest_init()-->rest_init()-->kernel_init()
kernel_init() {
...
kernel_init_freeable(); /* do all initcalls */
...
do_sysctl_args(); /* Process the sysctl parameter: sysctl.*= */
}

kernel_init_freeable()--->do_basic_setup()-->do_initcalls()
do_initcalls() {
for (level = 0; level < ARRAY_SIZE(initcall_levels) - 1; level++) {
do_initcall_level
}

The sysctl interface of each subfeature should be registered after
sysctl_init_bases() and before do_sysctl_args(). It seems that the sysctl
interface does not depend on initcall_levels. To prevent the sysctl
interface from being initialized before the feature itself. The
lowest-level late_initcall() is used as the common sysctl interface
registration level.

I'm not normally a fan of wrapping commonly-used code sequences into
magical macros, but this one does seem to make sense.

I wonder if it is possible to cook up a checkpatch rule to tell people
to henceforth use the magic macros rather than to open-code things in
the old way. Sounds hard.

--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -943,12 +943,7 @@ static struct ctl_table coredump_sysctls[] = {
{ }
};
-static int __init init_fs_coredump_sysctls(void)
-{
- register_sysctl_init("kernel", coredump_sysctls);
- return 0;
-}
-fs_initcall(init_fs_coredump_sysctls);
+kernel_sysctl_initcall(coredump_sysctls);

But this and several like it are functional changes.

#endif /* CONFIG_SYSCTL */

...

--- a/fs/inode.c
+++ b/fs/inode.c
@@ -132,12 +132,7 @@ static struct ctl_table inodes_sysctls[] = {
{ }
};
-static int __init init_fs_inode_sysctls(void)
-{
- register_sysctl_init("fs", inodes_sysctls);
- return 0;
-}
-early_initcall(init_fs_inode_sysctls);
+fs_sysctl_initcall(inodes_sysctls);
#endif

Here's another, of many.

Someone made the decision to use early_initcall() here (why?) and this
patch switches it to late_initcall()! Worrisome. Each such stealth
conversion should be explained and justified, shouldn't it?


static noinline void __init kernel_init_freeable(void)
{
...
do_pre_smp_initcalls(); /* do early_initcall */
lockup_detector_init();

smp_init();
sched_init_smp();

padata_init();
page_alloc_init_late();
/* Initialize page ext after all struct pages are initialized. */
page_ext_init();

do_basic_setup(); /* do other initcall */
...
}

Between do_pre_smp_initcalls() and do_basic_setup(), no sysctl interface window is configured. In addition, all sysctl data has initial values. Delayed configuration does not affect the behavior after startup.
So I think we can change it to late_initcall().


Thanks
Xiaoming Ni