Perl make depend made faster

Luca Lizzeri (lizzeri@mbox.vol.it)
Fri, 13 Sep 1996 15:18:01 +0200 (MET DST)


Version 2.1 is coming. In view of the staggering number of times that
"make dep" will be run around the world, I implemented a faster version of
it.

Reports on linux-kernel in July told of an order-of-magnitude improvement
in make dep times (*) from simply running depend.awk through a2p and
changing the dep rules.

((*) this on low-end machines)

I tried that. On a P75/16MB the gain was negligible and there were a
couple of subtle bugs. So I dug out the Camel, jumped back and forth for
a while in the perl online docs, and I came up with a reworked version
(the Straightforward revision). With perl 5.003 the timing was nice, with
5.001 more so (the timings printed here are from 5.003), and the bugs were
gone. Such paltry gains, however, were not a worthy gift for the Master,
so I sacrificed some more time to Linux and came up with the Revolution
Revision.

I was pleased by the result.

The Revolution Revision features liberal use of RAM and some perl5isms, so
it will not be to all tastes ( the perl5isms are, however, expendable ).

Most of the time in depend.awk was taken up in keeping track of comments,
so I decided to do away with state and get rid of the comments all in one
go. This requires pre-reading the contents of the current file into a
single string ( yes, all of it ! ) and doing something like an s///g on it.
The rest is: fewer floating point ops (I hope, for the sake of low-end
machines ), fewer pattern matches, some clarification thrown in. Only some.
Perl is not Eiffel.
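
To make the idea concrete, here is a minimal sketch of the slurp-then-strip
approach described above. It is not the code from the patch: the helper
names slurp and strip_comments are made up for illustration, and it uses
the three-argument open and lexical filehandles of later perls rather than
what perl 5.003 offered.

```perl
use strict;
use warnings;

# Hypothetical helper: read an entire file into a single string.
sub slurp {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Can't open $path: $!\n";
    local $/;                  # no record separator: one read gets it all
    my $text = <$fh>;
    close($fh);
    return $text;
}

# Hypothetical helper: drop every /* ... */ comment in one substitution.
# The /s flag lets "." match newlines, so multi-line comments go too;
# ".*?" is non-greedy, so each comment ends at its own "*/".
sub strip_comments {
    my ($text) = @_;
    $text =~ s{/\*.*?\*/}{}gs;
    return $text;
}

my $sample = "#include <linux/a.h>\n"
           . "/* #include <linux/b.h>\n   still a comment */\n"
           . "#include <linux/c.h>\n";

my $clean = strip_comments($sample);
print $clean;
```

Because the comment is gone before any matching starts, the commented-out
include of linux/b.h can never be picked up, and no per-line "am I inside
a comment" state is needed. The sketch does not try to recognise a "/*"
that sits inside a string literal.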

The decision to read the file all at once is in most cases tenable,
considering that, except for a few pathological cases ( dgrs_firmware.c and
advansys.h), most source files in the kernel are way under 100k. Even while
doing drivers/net the size field for perl in top does not go over 3.5 MB
(from the usual 1.2 MB).

Here are some sample timings on an idle system (buffer cache primed by a
previous run in all cases): Wall time will tell.

Pentium 75/16 MB, slow ide

depend.awk

91.89user 9.11system 2:20.45elapsed 71%CPU
92.21user 8.42system 2:18.81elapsed 72%CPU

Straightforward depend.pl ( a2p + bug fixes + massage )

80.06user 9.89system 1:58.85elapsed 75%CPU
80.08user 9.59system 1:58.98elapsed 75%CPU

Revolution depend.pl ( read entire file in one go )

57.57user 11.86system 1:39.00elapsed 70%CPU
57.81user 11.73system 1:38.26elapsed 70%CPU

Pentium 100 / 24 MB fast ide

depend.awk

59.56user 6.26system 1:27.46elapsed 75%CPU
59.85user 5.65system 1:27.14elapsed 75%CPU

Revolution depend.pl

39.34user 7.46system 0:58.76elapsed 79%CPU
39.22user 7.66system 0:58.24elapsed 80%CPU
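
For readers who want the wall-clock figures as ratios, the following small
sketch converts the elapsed times above to seconds and divides them. The
helper names to_seconds and speedup are made up for illustration, and only
the first run of each pair is used. On these numbers the Revolution
Revision comes out between 1.4 and 1.5 times faster than depend.awk in
elapsed time on both machines.

```perl
use strict;
use warnings;

# Convert an elapsed time in the "M:SS.ss" form printed by time to seconds.
sub to_seconds {
    my ($elapsed) = @_;
    my ($minutes, $seconds) = split /:/, $elapsed;
    return $minutes * 60 + $seconds;
}

# Ratio of old elapsed time to new elapsed time (values above 1 are faster).
sub speedup {
    my ($old, $new) = @_;
    return to_seconds($old) / to_seconds($new);
}

# Elapsed times copied from the first run of each measurement above.
printf "P75  awk -> Straightforward: %.2fx\n", speedup('2:20.45', '1:58.85');
printf "P75  awk -> Revolution:      %.2fx\n", speedup('2:20.45', '1:39.00');
printf "P100 awk -> Revolution:      %.2fx\n", speedup('1:27.46', '0:58.76');
```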

Hope you like it

Luca Lizzeri

P.S.

Here are a couple of warnings that, with gawk run under make, don't make
it to stderr ( on both my systems ):

ide_modes.h needs config but has not included config file
g_NCR5380.h needs config but has not included config file
dev_table.h needs config but has not included config file

P.P.S.

Is it a question of policy that only #include directives aligned to the
left margin are processed in make dep ? Or is it a regex typo ?
/^[ ]#/ or /^#/ ( lines 85 and 112 ) ?
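
To show what is at stake in that question, here is a small sketch comparing
a pattern anchored at the left margin with one that tolerates leading
whitespace. The two patterns are illustrative only; they are not copied
from depend.awk, and the sample lines are invented.

```perl
use strict;
use warnings;

my @lines = (
    '#include <linux/config.h>',
    '  #include <linux/sched.h>',     # indented directive
    '# include "local.h"',
);

# Only directives whose '#' sits in the first column.
my $margin_only  = qr/^#\s*include\s*[<"](\S*)[">]/;
# Directives with optional blanks or tabs before the '#'.
my $indent_is_ok = qr/^[ \t]*#\s*include\s*[<"](\S*)[">]/;

my @margin_hits = map { /$margin_only/  ? $1 : () } @lines;
my @indent_hits = map { /$indent_is_ok/ ? $1 : () } @lines;

print "left margin only: @margin_hits\n";
print "indent allowed:   @indent_hits\n";
```

With the first pattern the indented include of linux/sched.h is silently
skipped; with the second it is reported as a dependency like the others.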

Here is the patch ( you must have perl5 in your path and enable the perl
depend in the main Makefile). Yell loudly if anything doesn't work as
it should or if I should really make it perl4 compatible.

diff -u --recursive --new-file linux-2.0.19/Makefile linux/Makefile
--- linux-2.0.19/Makefile Thu Sep 12 14:53:41 1996
+++ linux/Makefile Fri Sep 13 15:00:10 1996
@@ -39,6 +39,11 @@
STRIP =$(CROSS_COMPILE)strip
MAKE =make
AWK =gawk
+PERL =/usr/bin/perl
+# If you want the perl depend uncomment next line (and comment out the
+# following one)
+#DEPEND =$(PERL) $(TOPDIR)/scripts/depend.pl
+DEPEND =$(AWK) -f $(TOPDIR)/scripts/depend.awk

all: do-it-all

@@ -345,7 +350,7 @@
find . -type f -print | sort | xargs sum > .SUMS

dep-files: archdep .hdepend include/linux/version.h
- $(AWK) -f scripts/depend.awk init/*.c > .tmpdepend
+ $(DEPEND) init/*.c > .tmpdepend
set -e; for i in $(SUBDIRS); do $(MAKE) -C $$i fastdep; done
mv .tmpdepend .depend

@@ -385,5 +390,5 @@

.hdepend: dummy
rm -f $@
- $(AWK) -f scripts/depend.awk `find $(HPATH) -name \*.h ! -name modversions.h -print` > .$@
+ $(DEPEND) `find $(HPATH) -name \*.h ! -name modversions.h -print` > .$@
mv .$@ $@
diff -u --recursive --new-file linux-2.0.19/Rules.make linux/Rules.make
--- linux-2.0.19/Rules.make Sat Jul 6 15:47:50 1996
+++ linux/Rules.make Fri Sep 13 14:03:28 1996
@@ -83,7 +83,7 @@
#
fastdep: dummy
if [ -n "$(wildcard *.[chS])" ]; then \
- $(AWK) -f $(TOPDIR)/scripts/depend.awk *.[chS] > .depend; fi
+ $(DEPEND) *.[chS] > .depend; fi
ifdef ALL_SUB_DIRS
set -e; for i in $(ALL_SUB_DIRS); do $(MAKE) -C $$i fastdep; done
endif
diff -u --recursive --new-file linux-2.0.19/scripts/depend.pl linux/scripts/depend.pl
--- linux-2.0.19/scripts/depend.pl Thu Jan 1 01:00:00 1970
+++ linux/scripts/depend.pl Fri Sep 13 14:46:23 1996
@@ -0,0 +1,120 @@
+#!/usr/bin/perl -w
+# depend.pl 1.0, the Revolution Revision
+# This is the old depend.awk drawn, quartered and given new life
+# by Luca Lizzeri ( lizzeri@mbox.vol.it ).
+# Use it, misuse it or modify it as you see fit.
+# If it breaks you get to keep both pieces.
+
+
+# Check sanity of environment
+
+
+die "Environment variable TOPDIR is not set\n" unless $ENV{"TOPDIR"};
+die "Environment variable HPATH is not set\n" unless $ENV{"HPATH"};
+
+
+# Massage HPATH into parray
+
+
+my @parray = split( ' ' , $ENV{"HPATH"} );
+foreach ( @parray ) { s/^I//; s/[\/\s]*$/\//; }
+
+
+# Iterate on input files
+
+
+foreach $file ( @ARGV ) {
+
+
+ # Sensible if boring initializations ..
+
+
+ my ( $hasdep, $hasconfig, $needsconfig ) = ( 0, 0, 0 );
+ my ( $cmd, $depname, $relpath ) = ( '', '', '' );
+ my @includes = ();
+
+
+ # Massage filename into $depname, $relpath, $cmd
+
+
+ ( $depname = $file ) =~ s/\.[cS]$/.o: /;
+ ( $relpath = $file) =~ s/\/?[^\/]*// if ( $file =~ /^\./ );
+
+ if ( $depname eq $file ) { # Unchanged .. should be a .h
+ $cmd = "\n\t\@touch " . $depname;
+ $depname =~ s/\.h$/.h: /;
+ }
+
+
+ # Slurp it all up. $program can become quite long. Memory starved
+ # machines will possibly be better served by putting DEPEND=$(AWK)
+ # in the main Makefile.
+
+
+ open ( FILE, $file ) or die "Can't open $file: $!\n";
+ $program = join '', <FILE>;
+ close FILE;
+
+
+ # Strip comments! Fast !
+
+
+ $program =~ s{/\*.*?\*/}{}gsm;
+
+
+ # Fill @includes array ( if you don't understand see man perlop,
+ # man perlre and the Camel )
+
+
+ while ( $program =~ /^([ \t]*)#\s*include[ \t]*[<"](\S*)[">]/gm ) {
+ next if $1; # we don't want indented includes ?
+ push @includes, $2;
+ }
+
+
+ # Check whether file needs <linux/config.h> and has it included
+
+
+ $needsconfig = 1 if ( $program =~ /^[ \t]*#[ \t]*if.*?\bCONFIG_/m );
+ $hasconfig = 1 if grep ( $_ eq "linux/config.h" , @includes );
+ print STDERR "$file needs config but has not included config file\n" if ( $needsconfig && ! $hasconfig );
+ print STDERR "$file doesn't need config\n" if ( $hasconfig && ! $needsconfig );
+
+
+ # Find exact location of included files
+
+
+ foreach $fname ( @includes ) {
+
+ # First try relative to current directory
+
+ $rfname = $relpath . $fname;
+ if ( -e $rfname ) {
+ print $depname unless $hasdep;
+ $hasdep = 1;
+ print " \\\n ", $rfname;
+ if ( $fname =~ /^\./ ) {
+ $fnd = grep ( $rfname eq $_, @ARGV );
+ push ( @ARGV, $rfname ) unless $fnd;
+ }
+ } else {
+
+ # It was not relative to current dir
+
+ foreach $path ( @parray ) {
+ $rfname = $path . $fname;
+ if ( -e $rfname ) {
+ print $depname unless $hasdep;
+ $hasdep = 1;
+ print " \\\n ", $rfname;
+ last;
+ }
+ }
+ }
+ }
+
+ print $cmd, "\n" if $hasdep;
+
+}
+
+exit (0);