[CMake] Obtaining improved GNU make performance on Makefiles generated by cmake

Alan W. Irwin irwin at beluga.phys.uvic.ca
Tue Mar 18 07:38:49 EDT 2008


On 2008-03-17 21:22-0400 Bill Hoffman wrote:

> Alan W. Irwin wrote:
>> The GNU make documentation states the following:
>> 
>>    Since it knows that phony targets do not name actual files that
>>    could be remade from other files, make skips the implicit rule search
>>    for phony targets.... This is why declaring a
>>    target phony is good for [make] performance....
>> 
>> Also,
>> 
>>    Using .PHONY is more explicit and more efficient.  However, other
>>    versions of make do not support .PHONY; thus FORCE (an arbitrarily
>>    named rule with no prerequisites or commands) appears in
>>    many makefiles.
>> 
>> As part of another investigation I searched a Linux build tree created by
>> cmake (2.4.8) and was surprised to find no reference to .PHONY. Instead,
>> the makefile generator on Linux uses the same method as the FORCE idea
>> above, i.e., a rule called cmake_force with no prerequisites or commands
>> that serves as a prerequisite to rules that must be run every time.
>> 
>> CMake is missing a bet on Linux systems to reduce Makefile overhead since
>> it is using this cmake_force approach rather than the preferred, more
>> efficient .PHONY approach for rules that must be run every time.  Since the
>> Makefiles generated by cmake have an extremely large number of such rules,
>> Makefile latency may be significantly reduced by switching to .PHONY on
>> Linux (GNU make) systems.
>> 
>> If the cmake developers here like this idea (or at least don't strongly
>> dislike it), I will go ahead and make a feature request so it doesn't get
>> lost.
>> 
>
> CMake is written to generic make, and I don't think we would want to add 
> something that only worked with gmake.  The trouble is the make you are using 
> can change after CMake is run, so we can not even test for the version of 
> make being used.   I guess if it was a big enough performance gain, we could 
> add some sort of option to allow for this.  However, I would want to make 
> sure that there would be a good payback.  So far the cmake makefiles 
> outperform the autotools ones quite well, and performance has not been an 
> issue.   Do you think you are having a performance problem?

autotools runs the libtool shell script for _every_ compile or link to
decide on build options.  That script is huge (it weighs in at 200K or so).
With built-in shell-script latency like that, any added latency from
the autotools-generated Makefile is probably going to be completely
undetectable.

So to answer your last question in general, I am satisfied with CMake build
performance when compared with autotools, but that's not really saying much
because you're comparing with the huge latency of the libtool shell script. Of
course, autotools has been the traditional standard of comparison, but it is
probably time to move on from that.  Instead, the goal should be to
decrease the latency of builds that have been configured with CMake if it is
straightforward to do so.

To get a lot more specific about your question, the PLplot build has a
latency (time to do a second make after the first one is completed with
everything built on a 2.4GHz Linux box) of the following:

time make >& /dev/null

real    0m1.338s
user    0m0.748s
sys     0m0.700s

(I also checked the output that was diverted to /dev/null above to confirm
that nothing was built by this [second] make command.) If I then
touch the source code for our simplest example (a small C executable which
links to our core library), here is the result I get:

software at raven> time make >& /dev/null

real    0m1.413s
user    0m0.704s
sys     0m0.796s

(I also confirmed that the small C executable and only that PLplot software
component was built after the touch command.) In this case, the build time
without latency is 1.413-1.338 = 0.075 seconds. Another way to say this is
that the latency completely dominates, using up 95 per cent of the total time.
If the latency weren't there, you would get roughly a factor of 20 improvement
in speed.  (This result presumes perfectly repeatable timing, which is not the
case, but nevertheless it is clear the total build time for the PLplot build
system is dominated by latency for this case of a simple executable with
rather a small number of lines of code that links to our core library.)

I then tried the experiment of touching one of our bigger C source code
files that is built for our core library.  It involves a file with many more
lines of code than the simple example and also involves the relinking of the
library and relinking of the many examples and other libraries that link to
our core library.  So this is a much more realistic test of a typical PLplot
developer cycle.

software at raven> time make >& /dev/null

real    0m8.670s
user    0m6.080s
sys     0m2.612s

So in this case the build time without latency is 8.670-1.338 = 7.332 seconds,
or another way to say this is that the latency is roughly 15 per cent of the
total time and if it were not there you would get close to a 20 per cent
improvement in speed.
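
For anyone who wants to redo this arithmetic with their own timings, here is a
minimal Python sketch. The helper name latency_share is made up for
illustration; the three "real" times are the ones reported above, and the
calculation assumes (as noted earlier) that the timings are repeatable.

```python
def latency_share(null_build_s, total_s):
    """Split a rebuild time into real work and fixed make latency.

    null_build_s: "real" time of a make run with nothing to rebuild
    total_s:      "real" time of a make run after touching a source file
    Returns (work_seconds, latency_fraction, speedup_if_latency_removed).
    """
    work_s = total_s - null_build_s
    return work_s, null_build_s / total_s, total_s / work_s


NULL_BUILD = 1.338  # second make, nothing to rebuild

for label, total in [("small example", 1.413), ("core library file", 8.670)]:
    work, fraction, speedup = latency_share(NULL_BUILD, total)
    print(f"{label}: work {work:.3f}s, "
          f"latency {fraction:.0%} of total, "
          f"speedup without latency x{speedup:.2f}")
```

For the small example this gives a latency share of about 95 per cent and a
speedup of roughly 19; for the core library file the same arithmetic gives a
latency share of about 15 per cent and a speedup of about 1.18.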

I personally hate latency since you are always paying that overhead no
matter how simple the task.  Anyhow, if you asked most developers here, I
think they would be quite happy with anywhere from 20 per cent to a factor
of 20 speedup each time they tweaked source code for their project.
Furthermore, I believe CMake developers will be sympathetic to this
anti-latency point of view since they have already gone to a lot of trouble
to avoid latency at the cmake stage by caching information.

Bill, if you want to take this further, would it be possible to send me a
test CMake patch to always use just the recommended .PHONY special target
rather than the present cmake_force?  For such a test patch you don't have to
worry about the non-GNU make case, which should considerably simplify the
patch. If that change proved to significantly reduce the above latency, that
proof of concept would motivate working on a more general patch which could
deal properly with the non-GNU make case.

To summarize, the above quote from the GNU make documentation shows they were
concerned about the latency of the "force" alternative to .PHONY. If you
have ever run GNU make in debug mode, you will have seen that it goes through
a lot of work looking for implicit rules that fit each target.  So for the
very large number of Makefile targets that CMake generates that are not files,
there is a lot of unneeded implicit rule checking going on.  That work would be
completely eliminated by appropriate use of .PHONY, and because so many
rules are involved, I suspect that would give a significant reduction in
latency.
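
To make the two idioms concrete, and to give a way of trying the comparison
without patching CMake, here is a minimal Python sketch that writes out a
synthetic Makefile in either style. Everything in it is an assumption for
illustration: the function name synthetic_makefile, the target names stepN,
and the empty rule force_rule (standing in for cmake_force). It is not what
the CMake generator actually emits.

```python
def synthetic_makefile(n_targets, use_phony):
    """Return Makefile text with n_targets always-run, non-file targets."""
    names = [f"step{i}" for i in range(n_targets)]
    lines = ["all: " + " ".join(names), ""]
    if use_phony:
        # GNU make idiom: declare the targets phony.  Per the GNU make
        # documentation quoted above, make then skips the implicit rule
        # search for them.
        lines += [".PHONY: all " + " ".join(names), ""]
        dep = ""
    else:
        # Portable idiom: an empty rule (no prerequisites, no commands)
        # used as a prerequisite so that dependents always run.
        lines += ["force_rule:", ""]
        dep = " force_rule"
    for name in names:
        # Recipe lines must start with a tab; "@true" does nothing quietly.
        lines += [f"{name}:{dep}", "\t@true", ""]
    return "\n".join(lines)


if __name__ == "__main__":
    # Show a tiny instance of the .PHONY variant.
    print(synthetic_makefile(2, use_phony=True))
```

Writing both variants to files with a large n_targets and running
"time make -f <file> >& /dev/null" on each would give a rough first look at
the difference. Whether that difference is measurable is exactly the open
question here, so the sketch makes no claim about the outcome.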

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy,
University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state implementation
for stellar interiors (freeeos.sf.net); PLplot scientific plotting software
package (plplot.org); the libLASi project (unifont.org/lasi); the Loads of
Linux Links project (loll.sf.net); and the Linux Brochure Project
(lbproject.sf.net).
__________________________

Linux-powered Science
__________________________

