Re: source dependencies cleanup?

Peter T. Breuer
Wed, 4 Dec 1996 15:56:53 +0000 (WET)

"A month of sundays ago Paul Flinders wrote:"
> > From: Peter T. Breuer <>
> > You are right and mkdep is WRONG about genhd.c.
> I'm glad we're beginning to agree :-)

In a sense.

> > here. The implementation does not follow the spec.
> I couldn't find a "spec" for mkdep. Is there one in the source tree, if so
> where (I must have missed it).

The spec is: compute (at least) the dependencies of all given files.
Oh oh! I am dumb! It only needs to compute the primary dependencies!
Make itself will do the recursive bit! So we are fussing about nothing
here at least.

Yes. Once we have made a prior pass with mkdep, then all the .depend
files with the primary dependency info in are included in the
Makefiles, so now make itself can do the recursive tracing that it
usually does.
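To make the primary-vs-recursive distinction concrete, here is a hypothetical Python sketch (file contents and names invented for illustration) of a primary-only scan in the spirit of mkdep: it reports just the files named directly on #include lines and leaves all deeper tracing to make.

```python
import re

# Match a first-level include line; nothing here recurses into the
# included files themselves -- that is left for make to do.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)

def primary_deps(source_text):
    """Return the first-level #include'd names, in order, without recursing."""
    return INCLUDE_RE.findall(source_text)

# Invented stand-in for a source file like genhd.c:
genhd_c = '''
#include <linux/fs.h>
#include <linux/genhd.h>
#include "blk.h"
'''

# Only the three names above are reported; whatever fs.h itself
# includes is discovered by make through the other .depend entries.
print(primary_deps(genhd_c))
```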

Comments (and specs!) are sometimes lacking in the source tree.

> > It is supposed to form the transitive closure of the #include'd files and here
> > we have a case where it does not. Don't know why yet.

I am wrong. It needs to compute only the primary dependencies, since make
itself does the recursive tracing.

> As far as I can see mkdep makes no attempt to follow nested includes
> and that this is a design decision not a bug (that global "filename" variable

Yes. I was wrong. It computes only primary dependencies and that is
correct. Gcc computes too many dependencies in that sense.

> for instance). This is supported by the following comment at the top of
> depend.awk which I presume is mkdep's predecessor

Mkdep.c is a C program. Depend.awk is the original version of mkdep;
it preceded the perl script that (briefly and optionally) preceded the
C version.

> # This is an awk script which does dependencies. We do NOT want it to
> # recursively follow #include directives.

Yes. The comment is correct. They ought to say why!

> I think that not following nested includes is wrong as I've already pointed
> out in the discussion on genhd. Unfortunately in a reasonably complex

It appears to be correct in principle. Are you sure that it is wrong
here? I just decided that I was wrong in crying wolf on the differences
between mkdep and gcc dependency output on genhd.c ...

> mkdep appears to achieve speed-up by only including the "first level"
> of include file in the dependency output.

That, as I think we both realize now, is true AND correct.

> However generating the dependencies as part of the compile makes
> more sense as you have to expand everything anyway and the additional
> housekeeping is very small.

I already gave the math on this.

> > No - at the moment you have (just :-) well done!) found a bug in the mkdep
> > implementation. The math stands and I'll get back to you on that.
> Your math as originally presented seems to be based on the flawed
> assumption that using -MD generated dependencies requires all source
> files to be compiled to get their dependencies.

Yes. The assumption is not flawed. It does so require.

> This is not true -

It is true.

> you don't
> *need* the dependencies if you aren't going to compile a given source

That is altogether _another_ question. Define "need"!

I tell you that I "need" (leaving aside what it means) all the
dependency information for the whole kernel at the moment, even though I
am not presently using all of it. I may be wrong on that, but we can
argue about it once we decide what "need" means. There is a fair chance
that you are right and that we can get away without all the dependency
info, but I need a proof of that, and I need a statement of what the
words mean before we can decide the truth.

> file and later, if you do, the fact that you need to build the object file is enough
> to trigger dependency generation.

This is where I think it is by now too late to find out what the
dependencies really are. This is the point we have to examine. It
could be OK.

> If we assume that a correctly implemented "mkdep" and a correctly
> implemented -MD solution cause the same files to be re-compiled
> then you have a point because if Y is the (total) compilation time and X
> is the fraction added for -MD we have

Well, I was taking X to be the _absolute_ time added for -MD of all files,
but OK ...

> mkdep = %files to re-build * Y

This appears to be the time required to rebuild all files multiplied by
the percentage of files that are actually rebuilt. That should be the
time required to actually do the compilation of the kernel, not
considering the time used to calculate the dependencies. But that is not
what I calculated. I calculated

i] Time required to calculate ALL dependencies and make the kernel.

You have calculated

ii] Time required to make kernel and make SOME dependencies.
(in the case of your mkdep calc above, SOME=NONE)

For the gcc -MD method, these are not the same thing either. Your point is
therefore that we only need to make some dependencies, and I remain to be
convinced about that. It may be so.

> -MD = %files to re-build * Y + %files to re-build * X * Y

This is time to make kernel + time to recompute dependencies for the
files that were recompiled. I.e. ii] above.

What I calculated was time to compute all dependencies plus time to
make kernel. That goes as follows:

i] with mkdep (Y is time to recompile all files)

one pass with mkdep costs just about nothing in time.
+ %files to rebuild according to mkdep * Y

I think I suggested that the percentage here was 60% (I can't remember
exactly), because mkdep in principle should give an overestimate of the
set of dependencies, since it does not evaluate #if's (not sure if gcc
does either!), and that probably the "right" number of files to rebuild is 50%.

So we get

0 + 0.6 * Y = 0.6 * Y

ii] with gcc -MD. Again Y is time to recompile all files.

one pass with gcc -MD over ALL files for dependencies and .o
+ extra cost of dependency building as we go

I assume that the extra cost is nil. The problem here is that we have
to compile all files. The answer here is

Y = 1.0 * Y

iii] with a prior pass by gcc -M.

%time per file scanning for deps in a first pass * Y
+ %files to rebuild according to gcc * Y

here I suggested that the cost was 20% extra per file on the extra
pass to find and chase its dependencies, but that gcc gets the more
exact answer of 50% to rebuild:

0.2 * Y + 0.5 * Y
= 0.7 * Y

So all that is above board. What we have to settle is what "need" means
in terms of dependencies. Clearly, in order to do the task that is
presently being done - calc .o's and ALL deps - mkdep is better.
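The three estimates can be tallied in a few lines of Python. The fractions are the guesses from the discussion above (60%, 50%, 20% extra), not measurements:

```python
# Y is the time to recompile all files; the fractions are the hedged
# estimates used in cases i], ii] and iii] above.
Y = 1.0

rebuild_mkdep = 0.6   # mkdep overestimates deps, so more files rebuild
rebuild_gcc   = 0.5   # gcc's exact dependencies rebuild fewer files
scan_cost     = 0.2   # extra per-file cost of a prior gcc -M pass

cost_i   = 0.0 + rebuild_mkdep * Y            # i]   mkdep pass is ~free
cost_ii  = 1.0 * Y                            # ii]  gcc -MD compiles every file
cost_iii = scan_cost * Y + rebuild_gcc * Y    # iii] gcc -M pass, then build

print(cost_i, cost_ii, cost_iii)
```

On these numbers mkdep wins (0.6 Y) over the prior gcc -M pass (0.7 Y), and compiling everything with -MD is worst (1.0 Y), which is the ordering claimed in the text.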

> However as I've already pointed out X is not easily measurable as it disappears
> into the noise (ie rather less than 1%). Also the total time for a build, from a

That's OK. I agree. It's nothing.

> Measured over repeated builds we have better functionality (complete
> dependencies which are kept automatically up-to-date) at a slight

I disagree - at least I don't agree. Prove to me that you don't "need"
to calculate all dependencies at every upgrade and I'll believe you!

Peter T. Breuer Phd. Ing.,
Area de Ingenieria Telematica E-mail:
Dpto. Ingenieria Tel: +34 1 624 99 47
Universidad Carlos III de Madrid Fax: +34 1 624 94 30/65
Butarque 15, E-28911 Leganes URL: