Re: [PATCH] Remove BKL from sysctl(2)

From: Andi Kleen
Date: Thu Jan 24 2008 - 02:22:42 EST


> Yes - that's why I am wondering if we want a general 'sysctl' mutex or
> just to fix the specific case. The corename case is nasty as it is timing
> dependant against user activity rather than the less interesting "root
> can be silly" type of problem.

I think for the strings it would be better to just do a kind of copy-on-write.
As in don't use the array directly, but a pointer instead and then sysctl
allocates a new string, switches the pointer around
and then does a RCU delayed free on the old string if it wasn't in .data.

This would also have the advantage that the small upper limits these
strings currently have are gone and actually save a little memory in
the common case of them not being changed.

Only case that wouldn't fix would be someone keeping the string accessed
over a sleep point. And for preemptible kernels it would need rcu_read_lock().
So no free lunch without code auditing. But it would be probably the best
way to go forward longer term.

For multi number arrays the code ideally needs to be robust against temporary
inconsistencies.

BTW while interesting in theory in practice i suspect the actual likelyhood
of a user actually hitting such a race is pretty small because sysctls tend
to be set up a boot only where not much is going on.

-Andi

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/