Re: [PATCH v2 2/3] lib/vsprintf: Split out sprintf() and friends

From: Petr Mladek
Date: Mon Aug 14 2023 - 11:17:32 EST


On Thu 2023-08-10 11:09:20, Rasmus Villemoes wrote:
> On 10/08/2023 10.15, Petr Mladek wrote:
>
> > Everyone agrees that kernel.h should be removed. But there are always
> > more possibilities where to move the definitions. For this, the use
> > in C files must be considered. Otherwise, it is just a try&hope approach.
> >
> >> Also, please, go through all of them and tell, how many of them are using
> >> stuff from kernel.h besides sprintf.h and ARRAY_SIZE() (which I plan
> >> for a long time to split from kernel.h)?
> >
> > I am all for removing vsprintf declarations from kernel.h.
> >
> > I provided the above numbers to support the idea of moving them
> > into printk.h.
> >
> > The numbers show that the vsprintf function family is used
> > quite frequently. IMHO, creating an extra tiny include file
> > will create more harm than good. By the harm I mean:
> >
> > + churn when updating 1/6 of source files
>
> Well, we probably shouldn't do 5000 single-line patches to add that
> sprintf.h include, and another 10000 to add an array-macros.h include
> (just as an example). Some tooling and reasonable batching would
> probably be required. Churn it will be, but how many thousands of
> patches were done to make i2c drivers' probe methods lose a parameter
> (first converting them all to .probe_new, then another round to again
> assign to .probe when that prototype was changed). That's just the cost
> of any tree-wide change in a tree our size.

OK.

> > + prolonging the list of #include lines in .c files. It will
> > not help with maintainability, which was one of the motivations
> > for this patchset.
>
> We really have to stop pretending it's ok to rely on header a.h
> automatically pulling in b.h, if a .c file actually uses something
> declared in b.h.

Yes, we need to find some balance.

> > + extra work for people using the vsprintf function family in
> > new .c files. People are used to getting them for free,
> > together with printk().
>
> This is flawed. Not every C source file does a printk, or uses anything
> else from printk.h. E.g. a lot of drivers only do the dev_err() family,
> some subsystems have their own wrappers, etc. So by moving the
> declarations to printk.h you just replace the kernel.h with something
> equally bad (essentially all existing headers are bad because they all
> include each other recursively). Also, by not moving the declarations to
> a separate header, you're ignoring the fact that your own numbers show
> that 5/6 of the kernel's TUs would become _smaller_ by not having to
> parse those declarations. And the 1/6 that do use sprintf() may become
> smaller by thousands of lines once they can avoid kernel.h and all that
> that includes recursively.

OK, I did some grepping:

## total number of .c files
pmladek@alley:/prace/kernel/linux> find . -name *.c | wc -l
32319

# printk() usage:

## .c files with printk() calls:
$> git grep "printk(\|pr_\(emerg\|alert\|crit\|err\|warn\|notice\|info\|cont\|debug\)(" | cut -d ":" -f 1 | uniq | grep "\.c$" | wc -l
8966

=> 28% of .c files use printk() directly

## .h files with printk() calls:
$> git grep "printk(\|pr_\(emerg\|alert\|crit\|err\|warn\|notice\|info\|cont\|debug\)(" | cut -d ":" -f 1 | uniq | grep "\.h$" | wc -l
1006

=> the real share of .c files using printk() is probably much higher
because it is also used in 1000+ header files.


# vprintf() usage:

## .c files where vprintf() family functions are used directly
$> git grep sc*n*printf | cut -d : -f1 | uniq | grep "\.c$" | wc -l
5254

=> 16% of .c files use vprintf() functions directly


# unique usage:

## .c files where printk() functions are used without vprintf() functions
$> grep -f printf.list -v printk.list | wc -l
6725

=> 75% of .c files using printk() are not using vprintf()

## .c files where vprintf() functions are used without printk() functions
$> grep -f printk.list -v printf.list | wc -l
3045

=> 58% of .c files using vprintf() are not using printk()
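
A caveat on the "grep -f ... -v" counts: grep treats every line of the
pattern file as a regular expression and matches it anywhere in the line,
so a path like lib/string.c would also exclude tools/lib/string.c. A
minimal sketch of a stricter comparison follows. The two lists are
made-up stand-ins for printk.list and printf.list, not real results, and
the variable names are only illustrative.

```shell
# Made-up file lists for illustration only; these are not real results.
tmp=$(mktemp -d)
printf '%s\n' kernel/fork.c lib/string.c > "$tmp/printk.list"
printf '%s\n' kernel/fork.c mm/slab.c tools/lib/string.c > "$tmp/printf.list"

# Regex/substring matching: the pattern "lib/string.c" also hits
# "tools/lib/string.c", so that file is wrongly dropped from the result.
regex_count=$(grep -c -v -f "$tmp/printk.list" "$tmp/printf.list")

# Exact matching: -F takes the patterns as fixed strings, -x requires a
# whole-line match.
exact_count=$(grep -c -F -x -v -f "$tmp/printk.list" "$tmp/printf.list")

# The same set difference with comm(1); both inputs must be sorted.
# -13 keeps only the lines that are unique to the second file.
comm_count=$(LC_ALL=C comm -13 "$tmp/printk.list" "$tmp/printf.list" | grep -c '')

echo "regex=$regex_count exact=$exact_count comm=$comm_count"
rm -rf "$tmp"
```

With this toy input the substring match reports one file while the exact
variants report two. How much that effect moves the real numbers is not
something this sketch can tell.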


My view:

The overlap will likely be bigger because the vprintf() family is often
used directly in .c files but printk() is quite frequently used
indirectly via .h files.

But still, there seems to be a non-trivial number of .c files which use
vprintf() and not printk().

=> The split might help after all.

In any case, I do not want to discuss this to death, and I will
not block this patch.

Best Regards,
Petr