[rfc] the kernel workflow & trivial "global -> static" patches(was: Re: [2.6 patch] make sched_feat_{names,open} static)

From: Ingo Molnar
Date: Mon May 05 2008 - 16:19:45 EST



* Adrian Bunk <bunk@xxxxxxxxxx> wrote:

> This patch makes the following needlessly global code in kernel/sched.c
> static:
> - sched_feat_names[]
> - sched_feat_open()

Adrian, you've been doing these "make needlessly global code static"
patches for years meanwhile. I'm asking whether you have made any
thoughts about going to the next level and automating (or at least
half-automating) the generation of these trivial patches, so that they
become less disruptive to our patch flow?

My opinion is that the technical benefit of these patches is positive to
Linux: they make the kernel image a tiny bit smaller - for example this
sched.c patch of yours i'm replying to makes the 64-bit kernel 8 bytes
smaller. So i always applied your patches, and will do so in the future
too.

But the per patch benefit is arguably extremely low: for example this
particular patch saves 0.00016% from my particular vmlinux's size. Even
on a Linux-Tiny kernel config, it's just 0.0008% of the vmlinux's size.
We'd need more than 600 such patches (!) to make just 0.1% of a code
size difference to my vmlinux, and even 0.1% is still very, very small.
[ and many of these patches do not even trigger any savings on tiny
configs. ]

The point that many kernel developers make is that if we compare this
small benefit to the disruption to the workflow this many small patches
can cause, they could in fact become a net negative contribution (!). So
it is extremely important to reduce the disruption, as much as we can.

And unlike other kernel developers, my opinion is not that we should
eliminate this disruption by not doing these trivial patches _at all_.
My opinion is that we should make it easier for maintainers to _avoid
introducing_ these problems.

I.e. we need to fight the root of this problem (the steady introduction
of needlessly global symbols), not its symptoms (the needlessly global
symbols themselves).

Let me raise some thoughts about what we could do to achieve these
goals.

Firstly, the methodology: you are using "make namespacecheck" on an
allyesconfig kernel to find these 'needlessly global' sites, correct? Do
you perhaps also generate them via Sparse, or via some other special
scripting? Could you perhaps describe how you generate them? Do you
create the patches manually perhaps?

We could extend "make namespacecheck" to emit a patch that just marks
all symbols static that are needlessly global.

Firstly, the practical problem: today "make namespacecheck" emits way
too many false positives even on an allyesconfig build - this is due to
some symbols only being utilized on certain config variations and on
certain architectures. A healthy set of false positives has accumulated
during the past few years.

Nowhere do we truly document the symbols that are _purposefully_ global
but "make namespacecheck" complains about them. If we documented them we
could improve the documentation of the kernel.

The (rather trivial) annotation would look like this:

__kernel_api int arch_reinit_sched_domains(void)

this marks the arch_reinit_sched_domains symbol in kernel/sched.c as
"intentionally global" - and it would cause "make namespacecheck" to not
emit warnings about that symbol by default. These annotations are for
global symbols that are not always utilized in an allyesconfig build,
but which we still want to keep global nevertheless.

( Note that because we'd only annotate the sites that trigger "make
namespacecheck" on an allyesconfig build the overwhelming majority of
kernel-internal APIs would stay untouched. )

My estimation is that only about 1 out of 10 new symbols that "make
namespacecheck" complains about in a new kernel release needs to be
annotated this way. I.e. we could turn 10 trivial "global => static"
patches into 1 trivial annotation patch => a 10x reduction in the rate
of patches!

Secondly, we could introduce a new "make namespacecheck_fix" check
method. Maintainers could (optionally) use it the following way:

make namespacecheck_fix arch/x86/
make namespacecheck_fix kernel/sched.c

this would take the namespacecheck output and would apply it to the
source - basically emitting a patch.

Subsystem maintainers do not _have to_ run this, it's not in any way
compulsory (they could just opt to wait for your patches), it's just an
optional tool like the many other tools we have in the kernel today that
make life easier for maintainers.

I can see five immediate benefits from all this:

- this way we can push back these trivial patches to the subsystem
maintainers who introduce them, and it would not pollute our global
workflow at all.

- due to the annotation multiple such trivial changes could be collapsed
into a single patch per subsystem - reducing the per patch maintenance
overhead even more. The maintainer only needs to check whether it
contains some new, intentionally-global symbol. (this is relatively
rare - the majority of new global symbols is accidental)

- we'd also document the kernel better, via those "purposefully global"
__kernel_api annotations.

- we could concentrate these changes to a single point in time during
the kernel cycle - just like we do documentation and typo patches in a
single point too. We'd do fewer patches and we'd do them less
frequently. That too reduces disruption and increases the net positive
effects of these patches.

- another benefit would be for newbies: they could run this tool and
could generate patches - saving you and kernel maintainers from having
to spend quality time on such trivial issues, and giving kernel
newbies an easy first introduction to Linux kernel modifications.

What do you think?

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/