ID: 0008561
Project: CMake
Category: CMake
View Status: public
Date Submitted: 2009-02-19 08:42
Last Update: 2009-03-17 13:09
Reporter: JussiP
Assigned To: Brad King
Priority: normal
Severity: minor
Reproducibility: always
Status: closed
Resolution: no change required
Platform:
OS:
OS Version:
Product Version: CMake-2-6
Target Version:
Fixed in Version:
Summary: 0008561: Dependency scanner is too slow for OpenOffice.org
Description: I have tried building OpenOffice.org with CMake. Currently CMake's dependency scanner is too slow for CMake to be practical for this build. As a test, we compiled only the word processor component on Linux using the Makefile generator. After a full build, we built it again without changing a single source file.

CMake takes over a minute just to determine that nothing has changed. Processor load is over 90%. OpenOffice.org's current build system takes less than five seconds.
Additional Information: The benchmark results are here: http://tools.openoffice.org/servlets/ReadMsg?list=dev&msgNo=6839

This test has been run on two different machines. I used 2.6.2 from Debian Unstable as well as the newest binary from cmake.org.

I can send the CMakeLists and source, but I am not attaching them yet because they are huge and quite tricky to get working.
Tags: No tags attached.
Attached Files: cmakes.zip (45,317 bytes) 2009-02-19 09:43

  Notes
(0015205)
Bill Hoffman (manager)
2009-02-19 09:06

At least the actual build was 4 minutes faster. One thing to try would be to use include_regular_expression to limit the depend information that is checked. Since the build is complicated, is there a way to get a login to a machine that already has the build? What sort of depend information does dmake generate? Does it have header-level depends?
(0015206)
JussiP (reporter)
2009-02-19 09:14

I'm going to be away from the Internet for a week starting tomorrow and don't really have that much knowledge about dmake. But I'll ask some of the OOo devs to give details.
(0015207)
Bill Hoffman (manager)
2009-02-19 09:34

Well, if before you left you could put the build instructions and build files some place where we could get at them, we might be able to look at the issue.
(0015208)
Jan Holesovsky (reporter)
2009-02-19 09:35

OOo build system uses an external tool for dependencies, mkdepend, originally of the XFree86 fame, but patched to perform better with deep header dependencies (the original version's time grew exponentially with the depth).

Unfortunately, I have no machine where I could set up a login for you :-( But if you have a machine with openSUSE 11.1, it contains OpenOffice_org-*-devel packages that would allow you to build only Writer. If that is possible for you, please come to IRC, irc.freenode.net, channel #go-oo, and we'll try our best to get you set up.
(0015209)
JussiP (reporter)
2009-02-19 09:51

Here are the CMakeLists I used. To get it working, I did the following. YMMV

Download Debian's OOo source version 3.0.1~rc1-2.
Install build dependencies.
Do the bootstrap step and configure the source. There are docs for this on the OOo site.
When it starts compiling, hit ctrl-c.
The directory that has Writer is in ooo-build/build/<milestone number, don't remember exactly what>/sw
I copied this directory outside the source dir. Should not be necessary, though.
Extract the zip in sw.
Run CMake and build.
(0015210)
JussiP (reporter)
2009-02-19 09:54

I forgot one step: edit kludges.cmake so the include path points to wherever you have your main source.
(0015213)
Brad King (manager)
2009-02-19 10:27

"Install build dependencies." is a big step. On my debian testing I ran

  apt-get build-dep openoffice.org-writer

and got a large number of packages. However, my distro uses openoffice.org-writer version "1:2.4.1-17+b1". Will its build deps be new enough for source version 3.0?
(0015215)
JussiP (reporter)
2009-02-19 10:34

A few weeks ago even Unstable was not new enough to build OOo 3 so probably not. OpenOffice.org is a monster in lots of different ways. :)
(0015237)
Brad King (manager)
2009-02-19 16:37

Can you please post a build log from "make" when run on the up-to-date tree?
(0015241)
JussiP (reporter)
2009-02-20 04:33

I don't have access to it, but from memory it goes like this:

[ 0%] built target foo1
*** a long pause here with no output ***
[ 3%] built target foo2
*** another long pause here ***
[ 7%] built target foo3

And so on and so on.

Also, I might have given this bug a slightly misleading title. The actual scanning of dependencies (when CMake prints 'scanning dependencies of target foo') is not the issue. The problem comes when those deps are traversed, or walked, or inspected, or whatever the correct term is.

Another thing is that there are actually two different OOo codebases with slightly different build methods. The first one is the official source from Sun, which is documented at openoffice.org. The second one is the go-oo.org version, which is used by Debian and probably most other distros as well.
(0015242)
Brad King (manager)
2009-02-20 08:42

When you say the processor load is 90%, what process is it? CMake or make?
(0015243)
JussiP (reporter)
2009-02-20 08:46

I did not check that. Sorry.
(0015244)
Bill Hoffman (manager)
2009-02-20 09:11

Can you tar up one of the foo directories out of the build tree? In particular I am looking for the depend.make files. Sounds like the problem is at make time, and may just be too many stats to do. Although that would not cause high cpu???
(0015245)
JussiP (reporter)
2009-02-20 09:19

I can, but only after about 9 days from now. Kendy, can you attach yours?
(0015246)
Bill Hoffman (manager)
2009-02-20 09:44

Can I just check out OO from an svn or cvs repository and build it? Looking at the cmake files, it does not seem to depend on tons of outside stuff, just gtk.
(0015249)
Jan Holesovsky (reporter)
2009-02-20 09:59

I guess that it's just that the cmake files are incomplete wrt. checking all the outside stuff. But I've heard "Debian" somewhere in these comments, and "apt-get build-dep openoffice.org-writer" will get you the build dependencies. Then you can easily follow the advice on http://go-oo.org/developers/ on how to get a build.

I tried to tar the more or less interesting stuff here: http://labs.suse.cz/kendy/cmake/ , but most probably there will be missing pieces. The cmake lists are in sw/ , and sw/builddir is where I started cmake. It even contains the built .o's [x86-64]; if you don't want them, and want something smaller than 155M, please let me know and I'll try to produce something smaller ;-)
(0015257)
Brad King (manager)
2009-02-20 13:01

Thanks for the tarball. Look at the depend.make and depend.internal files in it. They are HUGE. CMake's implicit dependency scanning is bringing in a very large number of header files for every object (over 2000). Most of them are in a path like "/local/ooxml/ooxml.m39/solver/300/unxlngx6.pro" which is outside the main tree. My guess is that dmake is not getting those dependencies.

What are all those files? Does OOo really ask the compiler to include all of them? How large is the preprocessed output of SwGrammarContact.cxx, for example?

  cd core/txtnode && make SwGrammarContact.cxx.i
(0015258)
Jan Holesovsky (reporter)
2009-02-20 13:15

SwGrammarContact.cxx.i is now in http://labs.suse.cz/kendy/cmake/ . You can also see there the dependencies generated by mkdepend for dmake as s_SwGrammarContact.dpcc.
(0015259)
Brad King (manager)
2009-02-20 13:29

Clearly mkdepend is not getting as many header files as CMake does. The question is why CMake gets all the extra files. Is there a lot of conditional inclusion involved? CMake does not actually preprocess files to get dependencies, so it conservatively follows all possible includes even if they would be skipped by the preprocessor.
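
To illustrate the difference, here is a minimal Python sketch of a scanner that follows every `#include` line and ignores `#ifdef` guards. It is only a model of the behaviour described above, not CMake's actual implementation. The file names, their contents, and the `ENABLE_PCH` macro are hypothetical.

```python
import re

# Matches '#include "x"' and '#include <x>' at the start of a line.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def scan_conservatively(name, sources, seen=None):
    """Return every header reachable from `name`, ignoring #if/#ifdef guards."""
    if seen is None:
        seen = set()
    for header in INCLUDE_RE.findall(sources.get(name, "")):
        if header not in seen:
            seen.add(header)
            scan_conservatively(header, sources, seen)
    return seen


# Hypothetical in-memory "files"; ENABLE_PCH is a made-up macro name.
SOURCES = {
    "writer.cxx": '#include "precompiled_sw.hxx"\n#include "doc.hxx"\n',
    "precompiled_sw.hxx": (
        "#ifdef ENABLE_PCH\n"
        '#include "a.hxx"\n'
        '#include "b.hxx"\n'
        "#endif\n"
    ),
    "a.hxx": '#include "c.hxx"\n',
    "b.hxx": "",
    "c.hxx": "",
    "doc.hxx": "",
}

deps = scan_conservatively("writer.cxx", SOURCES)
print(sorted(deps))
```

A real preprocessor run without `ENABLE_PCH` defined would see only `precompiled_sw.hxx` and `doc.hxx`; the guard-blind scan also picks up `a.hxx`, `b.hxx` and `c.hxx`.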
(0015260)
Bill Hoffman (manager)
2009-02-20 13:38

Is there a common regex that could be used to keep just some of them? It might be a quick fix for a proof of concept that CMake can build things faster. This might be a good excuse to add something that limits the depend stuff to the build and source trees.
(0015268)
Brad King (manager)
2009-02-20 14:37

Okay, I think it is the "precompiled_*" header files that all the sources include. They do conditional inclusion of many headers, and since CMake is not preprocessing anything it conservatively follows ALL the PCH references and all their dependencies for every object file. Another side effect of this is that changing any one header file will cause all sources to rebuild. Add this to the top of the project:

include_regular_expression("^([^p]|p[^r]|pr[^e]|pre[^c]|prec[^o]|preco[^m]|precom[^p])")

It tells CMake not to follow headers whose names start with 'precomp'.
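
The pattern can be checked outside CMake. The sketch below uses Python's `re` module as a stand-in for CMake's regex engine, on the assumption that this particular pattern behaves the same in both. It also assumes the pattern is applied to the include name as written in the source. All header names other than the `precompiled_` prefix are hypothetical.

```python
import re

# The pattern from the note above, unchanged.
PATTERN = re.compile(
    r"^([^p]|p[^r]|pr[^e]|pre[^c]|prec[^o]|preco[^m]|precom[^p])"
)


def is_followed(include_name):
    """True if the include name matches, i.e. the header would be followed."""
    return PATTERN.search(include_name) is not None


# Hypothetical header names, chosen to exercise each alternative.
for name in ["precompiled_sw.hxx", "precomp.h", "swtypes.hxx",
             "prelude.hxx", "precision.hxx"]:
    print(name, is_followed(name))
```

Each alternative accepts names that diverge from "precomp" at one specific position, so only names beginning with the full seven letters are rejected. One side effect is that a name that is itself a shorter prefix of "precomp" (for example a file called just `pre`) would also be rejected, which seems harmless for header names.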
(0015545)
JussiP (reporter)
2009-03-04 14:55

I tried Brad's fix. Here are my results

Full build from scratch, one process

without fix 25:19
with fix 19:03

Nothing changed, one process

without fix 2:06
with fix 19.692

Full build from scratch, 3 processes

without fix 13:11
with fix 9:59

Nothing changed, 3 processes

without fix 2:09
with fix 10.77


The build is now faster, and the nothing-changed case is roughly 6 to 12 times faster, depending on the number of processes. This makes CMake faster than the existing OOo build system when building and almost as fast when nothing has changed. (Note the usual caveats: dmake/build.pl does more work, Kendy's machine is faster than mine, etc.)
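
For reference, the speedup factors can be worked out from the timings above. This small Python sketch assumes that values with a colon are minutes:seconds and that the values without a colon (19.692 and 10.77) are plain seconds; the post does not state the units explicitly.

```python
def to_seconds(text):
    """Convert 'MM:SS' or a plain number (assumed to be seconds) to seconds."""
    if ":" in text:
        minutes, seconds = text.split(":")
        return int(minutes) * 60 + float(seconds)
    return float(text)


# (without fix, with fix), copied from the results above.
RESULTS = {
    "full build, 1 process": ("25:19", "19:03"),
    "nothing changed, 1 process": ("2:06", "19.692"),
    "full build, 3 processes": ("13:11", "9:59"),
    "nothing changed, 3 processes": ("2:09", "10.77"),
}

speedups = {
    name: to_seconds(before) / to_seconds(after)
    for name, (before, after) in RESULTS.items()
}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
```

Under those assumptions, the full builds come out about 1.3 times faster, and the nothing-changed runs about 6.4 times (one process) and 12.0 times (three processes) faster.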
(0015546)
Brad King (manager)
2009-03-04 15:04

Typically a (non-CMake) Makefile's dependency scanning rule depends on the headers that were scanned last time, so that if one of them changes, dependencies are re-scanned. However, if a header gets deleted, then Make refuses to run the re-scanning rule!

CMake does some extra checking for the missing-header case, which might be taking the extra time. What does the dmake system do if you delete a header file?
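
As a rough illustration of such a check (not CMake's actual code), the Python sketch below treats a missing header the same way as a modified one: either forces a re-scan. The function name `needs_rescan` and the temporary file names are hypothetical.

```python
import os
import tempfile


def needs_rescan(depend_file, headers):
    """True if any header is missing or newer than the recorded dependencies."""
    try:
        depend_mtime = os.stat(depend_file).st_mtime
    except FileNotFoundError:
        return True  # no dependency information yet
    for header in headers:
        try:
            if os.stat(header).st_mtime > depend_mtime:
                return True  # header changed since the last scan
        except FileNotFoundError:
            return True  # header was deleted: re-scan instead of failing
    return False


with tempfile.TemporaryDirectory() as tmp:
    header = os.path.join(tmp, "a.hxx")
    depend = os.path.join(tmp, "depend.make")
    open(header, "w").close()
    open(depend, "w").close()
    # Set explicit timestamps so the header is older than the depend file.
    os.utime(header, (1000, 1000))
    os.utime(depend, (2000, 2000))
    up_to_date = needs_rescan(depend, [header])
    os.remove(header)
    after_delete = needs_rescan(depend, [header])

print(up_to_date, after_delete)
```

Note that every header costs one `stat` call per object that lists it, so with thousands of headers per object the cost of this check grows accordingly.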
(0015548)
Jan Holesovsky (reporter)
2009-03-04 15:49

Breaks ;-) You have to delete the affected .dpcc files manually; or at least I do not know of a better way to get around that.

Is it possible to turn this extra check off for the testing purposes?
(0015552)
Brad King (manager)
2009-03-04 16:06

No, not without editing CMake's implementation and rebuilding it. The need-to-rescan dependency check has already been moved to a cmake process instead of encoded in the Makefiles and has the missing header check integrated. Removing it will totally disable re-scanning of dependencies.

This is all due to a fundamental limitation of Make tools. There is no way to say "bring that target up to date before loading the rules for this target", except by using multiple recursive make invocations. When there is nothing to do, all the timestamps actually have to be checked twice...once by CMake to determine it does not need to re-scan dependencies, and once by make to determine it does not need to compile anything. Even if the first round were also done by make, it would still have to be done twice. The second round must be done in a separate recursive make process in case the first round re-scans and therefore edits the rules the second round will load.

One idea to speed things up is to take advantage of the fact that the dependency-rescan check deletes object files for sources whose dependencies will be re-scanned to make sure they re-build. The make tool does not actually need to check the header dependencies. This is a bit tricky though because doing a "make foo/fast" builds target foo without re-checking dependencies, so in this case make does need the header dependencies.
(0015715)
Brad King (manager)
2009-03-17 13:09

The include_regular_expression fix to OOo seems to have resolved the primary issue. I'm closing this issue for now. Please re-open if you have further dependency performance problems.

 Issue History
Date Modified Username Field Change
2009-02-19 08:42 JussiP New Issue
2009-02-19 08:59 Bill Hoffman Status new => assigned
2009-02-19 08:59 Bill Hoffman Assigned To => Brad King
2009-02-19 09:06 Bill Hoffman Note Added: 0015205
2009-02-19 09:14 JussiP Note Added: 0015206
2009-02-19 09:34 Bill Hoffman Note Added: 0015207
2009-02-19 09:35 Jan Holesovsky Note Added: 0015208
2009-02-19 09:43 JussiP File Added: cmakes.zip
2009-02-19 09:51 JussiP Note Added: 0015209
2009-02-19 09:54 JussiP Note Added: 0015210
2009-02-19 10:27 Brad King Note Added: 0015213
2009-02-19 10:34 JussiP Note Added: 0015215
2009-02-19 16:37 Brad King Note Added: 0015237
2009-02-20 04:33 JussiP Note Added: 0015241
2009-02-20 08:42 Brad King Note Added: 0015242
2009-02-20 08:46 JussiP Note Added: 0015243
2009-02-20 09:11 Bill Hoffman Note Added: 0015244
2009-02-20 09:19 JussiP Note Added: 0015245
2009-02-20 09:44 Bill Hoffman Note Added: 0015246
2009-02-20 09:59 Jan Holesovsky Note Added: 0015249
2009-02-20 13:01 Brad King Note Added: 0015257
2009-02-20 13:15 Jan Holesovsky Note Added: 0015258
2009-02-20 13:29 Brad King Note Added: 0015259
2009-02-20 13:38 Bill Hoffman Note Added: 0015260
2009-02-20 14:37 Brad King Note Added: 0015268
2009-03-04 14:55 JussiP Note Added: 0015545
2009-03-04 15:04 Brad King Note Added: 0015546
2009-03-04 15:49 Jan Holesovsky Note Added: 0015548
2009-03-04 16:06 Brad King Note Added: 0015552
2009-03-17 13:09 Brad King Note Added: 0015715
2009-03-17 13:09 Brad King Status assigned => closed
2009-03-17 13:09 Brad King Resolution open => no change required

