Re: dynamic sysctl registration (pre2.0).4

Stephen C. Tweedie (sct@dcs.ed.ac.uk)
Fri, 17 May 1996 23:38:06 +0100


Hi,

On Fri, 17 May 96 0:13:36 EDT, Tom Dyas <tdyas@eden.rutgers.edu> said:

> The following patch adds support in the sysctl system for dynamically
> adding and removing sysctl entries from existing sysctl tables.

Umm, take another look at the code --- linux has supported this since
1.3.59. :)

> It is fundamental redesign of how sysctl's are stored.

No such redesign is necessary!!!

> They are now stored as linked lists much as the procfs directory
> entries are stored.

The existing code already stores them in linked lists. However, the
way we do it is to maintain a linked list of separate entire table
hierarchies, so that whole sets of sysctl entries may be registered
and deregistered at once.

> This support is needed because modules such as binfmt_java may want
> to register sysctl's in existing tables. The current system does not
> allow this at all.

Have a look at the "register_sysctl_table" and
"unregister_sysctl_table" functions. The register function takes a
sysctl hierarchy as argument, and adds it to the list of available
tables. Any sysctl() request will be matched against each table on
the list until a match is found.

The register function also inserts all of the entries on the list into
the /proc/sys filesystem. However, there is a problem with this in
current kernels --- we don't cleanly handle the case where a directory
in a new table overlaps an existing directory. Patch 1 below fixes
this, as well as fixing a memory leak when we deregister sysctl tables
and adding some necessary auxiliary functions to the ksyms exported
symbol table.

Patch 2 below updates binfmt_java to add /proc/sys/kernel/java*
entries dynamically when inserted as a module; it's a good example of
how to use this feature.

> Some minor typos in include/asm-i386/unistd.h are corrected. Just some
> extra underscores in the syscall number macros which prevented me from
> using _syscall1 to make a sysctl syscall function the first time
> around.

This is also wrong. The underscores are there for a very good reason.

BSD-4.4 specifies the sysctl() function as

int sysctl (int *name, int nlen,
void *oldval, size_t *oldlenp,
void *newval, size_t newlen);

However, the i386 unistd.h doesn't allow us to pass more than 5
arguments into a system call. To overcome this, we define the actual
system call interface differently:

int sys_sysctl(struct __sysctl_args *args);

In the libc (or wherever else we want to link in the system call), we
define two functions: sysctl(), which is a library function, takes the
full argument list and packs the args into a struct. This struct is
is then passed into the second function, _sysctl(), which is the real
kernel function. The extra underscore in unistd.h is there to allow
us access to _sysctl(), which is in the kernel, rather than sysctl(),
which is not.

The actual syscall interface should look something like:

>>>>
#include <linux/unistd.h>
#include <linux/types.h>
#include <linux/sysctl.h>

_syscall1(int, _sysctl, struct __sysctl_args *, args);

int sysctl (int *name, int nlen, void *oldval, size_t *oldlenp,
void *newval, size_t newlen)
{
struct __sysctl_args args = {name, nlen, oldval, oldlenp,
newval, newlen};
return _sysctl(&args);
}
<<<<

The patches below are against 1.99.4. With the new kernel, I can do:

>>>>
[root@dax modules]# ls /proc/sys/kernel
domainname file-nr inode-max osrelease panic version
file-max hostname inode-nr ostype securelevel
[root@dax modules]# insmod binfmt_java.o
[root@dax modules]# ls /proc/sys/kernel/
domainname inode-max osrelease version
file-max inode-nr ostype
file-nr java-appletviewer panic
hostname java-interpreter securelevel
[root@dax modules]# rmmod binfmt_java
[root@dax modules]# ls /proc/sys/kernel/
domainname file-nr inode-max osrelease panic version
file-max hostname inode-nr ostype securelevel
[root@dax modules]#
<<<<

Cheers,
Stephen.

--
Stephen Tweedie <sct@dcs.ed.ac.uk>
Department of Computer Science, Edinburgh University, Scotland.

Patch 1: fixup sysctl ---------------------------------------------------------------- Index: linux/kernel/sysctl.c =================================================================== RCS file: /home/rcs/CVS/linux/kernel/sysctl.c,v retrieving revision 1.6 diff -u -r1.6 sysctl.c --- sysctl.c 1996/05/15 13:13:32 1.6 +++ sysctl.c 1996/05/17 22:10:59 @@ -384,6 +386,7 @@ #ifdef CONFIG_PROC_FS unregister_proc_table(table->ctl_table, &proc_sys_root); #endif + kfree(table); } /* @@ -396,8 +399,10 @@ static void register_proc_table(ctl_table * table, struct proc_dir_entry *root) { struct proc_dir_entry *de; + int exists; for (; table->ctl_name; table++) { + exists = 0; /* Can't do anything without a proc name. */ if (!table->procname) continue; @@ -426,12 +431,30 @@ } /* Otherwise it's a subdir */ else { - de->ops = &proc_dir_inode_operations; - de->nlink++; - de->mode |= S_IFDIR; + /* First things first --- does the subdirectory already + exist? */ + struct proc_dir_entry *tmp; + + for (tmp = root->subdir; tmp; tmp = tmp->next) { + if (de->namelen == tmp->namelen && + !strncmp(de->name, tmp->name, tmp->namelen) + ) { + exists = 1; + break; + } + } + if (exists) { + kfree(de); + de = tmp; + } else { + de->ops = &proc_dir_inode_operations; + de->nlink++; + de->mode |= S_IFDIR; + } } table->de = de; - proc_register_dynamic(root, de); + if (!exists) + proc_register_dynamic(root, de); if (de->mode & S_IFDIR ) register_proc_table(table->child, de); } @@ -449,6 +472,10 @@ continue; } unregister_proc_table(table->child, de); + /* If this part of the /proc tree still contains + entries, don't delete it! */ + if (de->subdir) + continue; } proc_unregister(root, de->low_ino); kfree(de); Index: linux/kernel/ksyms.c =================================================================== RCS file: /home/rcs/CVS/linux/kernel/ksyms.c,v retrieving revision 1.7 diff -u -r1.7 ksyms.c --- ksyms.c 1996/05/15 13:13:31 1.7 +++ ksyms.c 1996/05/17 19:44:10 @@ -230,6 +230,11 @@ /* sysctl table registration */ X(register_sysctl_table), X(unregister_sysctl_table), + X(proc_dostring), + X(proc_dointvec), + X(proc_dointvec_minmax), + X(sysctl_string), + X(sysctl_intvec), /* interrupt handling */ X(request_irq), ----------------------------------------------------------------

Patch 2: add dynamic /proc/sys/kernel/java* ---------------------------------------------------------------- Index: linux/fs/binfmt_java.c =================================================================== RCS file: /home/rcs/CVS/linux/fs/binfmt_java.c,v retrieving revision 1.1 diff -u -r1.1 binfmt_java.c --- binfmt_java.c 1996/05/15 13:12:19 1.1 +++ binfmt_java.c 1996/05/17 21:04:17 @@ -19,6 +19,26 @@ char binfmt_java_interpreter[65] = _PATH_JAVA; char binfmt_java_appletviewer[65] = _PATH_APPLET; +#ifdef MODULE +#include <linux/sysctl.h> +static struct ctl_table_header *java_sysctl_header; +static ctl_table java_root_table[]; +static ctl_table java_kernel_table[]; + +static ctl_table java_root_table[] = { + {CTL_KERN, "kernel", NULL, 0, 0555, java_kern_table}, + {0} +}; + +static ctl_table java_kern_table[] = { + {KERN_JAVA_INTERPRETER, "java-interpreter", binfmt_java_interpreter, + 64, 0644, NULL, &proc_dostring, &sysctl_string }, + {KERN_JAVA_APPLETVIEWER, "java-appletviewer", binfmt_java_appletviewer, + 64, 0644, NULL, &proc_dostring, &sysctl_string }, + {0} +}; +#endif + static int do_load_script(struct linux_binprm *bprm,struct pt_regs *regs) { char *cp, *interp, *i_name; @@ -179,11 +199,13 @@ #ifdef MODULE int init_module(void) { + java_sysctl_header = register_sysctl_table(java_root_table, 0); return init_java_binfmt(); } void cleanup_module( void) { printk(KERN_INFO "Removing JAVA Binary support...\n"); + unregister_sysctl_table(java_sysctl_header); unregister_binfmt(&java_format); unregister_binfmt(&applet_format); } ----------------------------------------------------------------