[CMake] FW: Parallel GNU make issue

Hennigan, Gary L glhenni at sandia.gov
Thu Sep 11 16:09:07 EDT 2014


Thanks for the reply Chuck.

Unfortunately there is no such message. I’m not even sure the GNU make jobserver is the problem. It’s just that the symptom is the same as if that were happening.

Am I comparing apples to apples when I go into the build directory created via ctest and invoke gmake manually, and compare that to the way ctest invokes gmake? After I notice the ctest build has regressed to a single compile at a time, I kill ctest, go to the build directory it created, and run:

gmake clean
find . -name '*.o' -print | xargs rm
touch CMakeCache.txt
gmake -j 10

As I said, there’s a dramatic difference between the two. The manual build proceeds as I would expect and is done within an hour. The ctest build takes many times longer.

I should’ve mentioned that I’m using the 2.8.12.2 version of the tools.

Gary

From: Chuck Atkins [mailto:chuck.atkins at kitware.com]
Sent: Thursday, September 11, 2014 1:03 PM
To: Hennigan, Gary L
Cc: cmake at cmake.org
Subject: [EXTERNAL] Re: [CMake] FW: Parallel GNU make issue

Hi Gary,
Do you see either of these two warning messages show up:

"warning: -jN forced in submake: disabling jobserver mode."
or
"warning: jobserver unavailable: using -j1. Add `+' to parent make rule."

These warnings often accompany the forced serialization of a parallel make build, although usually they show up regardless of the launch method, i.e. ctest vs. manual make. They often indicate a particular problem in super-build scenarios.
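
One way to check for these two warnings is to capture the build output to a file and search it. The following is only a minimal sketch and not something taken from this thread: the function name count_jobserver_warnings and the log file name build.log are hypothetical, and it assumes the log was captured with stderr included.

```shell
#!/bin/sh
# Hypothetical helper: count GNU make jobserver warnings in a captured build log.
# Assumption: the log was saved by the user, for example with
#   gmake -j 12 2>&1 | tee build.log
count_jobserver_warnings() {
    # grep -c prints the number of matching lines. It exits with status 1
    # when nothing matches, so "|| true" keeps the helper safe under "set -e".
    grep -c -E 'disabling jobserver mode|jobserver unavailable' "$1" || true
}
```

Calling count_jobserver_warnings build.log prints the number of log lines that contain either warning; a result of 0 means neither one was logged.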

- Chuck

On Thu, Sep 11, 2014 at 2:29 PM, Hennigan, Gary L <glhenni at sandia.gov> wrote:
I have a strange, and very frustrating, problem. I have a pretty large piece of software that I build nightly as part of regression testing of my own software. All of the software uses CMake and I use a ctest script, via "ctest -S [script file]", for my nightly regression testing. As I stated, this is a pretty large collection of software, but during development it's not a huge issue because the build is quite parallelizable via GNU make's "-j N" option. On my nightly test platform, a 64-core machine, I can build the whole thing in about an hour, a nice manageable amount of time for a nightly regression test. Unfortunately, when I run the build process via ctest, something is causing the parallel make to fail and I'm lucky if the build takes under 15 hours, which is barely practical for a nightly test.

I’m not sure how to find out what’s going on. After the ctest build I can go into the build directory, do a "make clean" and then a "make -j 12", for example, and the build flies. Of course I can build the software entirely outside of ctest and it too flies. Only when the build happens as part of ctest does it seem to revert to, essentially, a "make -j 1" and slow to a crawl.

I can look at the process tree, via "ps -ef", during the ctest build and I see the root invocation of gmake and it’s fine. For example, it typically looks something like:

  PID  PPID  C STIME TTY          TIME CMD
 6141  6283 96 11:41 pts/0    00:24:40 ctest -VV -S ctest_nightly.cmake -DPROCESSORCOUNT=12
 8032  6141  0 11:42 pts/0    00:00:00 /usr/bin/gmake -i -j 12
 8035  8032  0 11:42 pts/0    00:00:00 /usr/bin/gmake -f CMakeFiles/Makefile2 all
  851  8035  0 11:52 pts/0    00:00:00 /usr/bin/gmake -f packages/ml/src/CMakeFiles/ml.dir/build.make
27797  8035  0 11:46 pts/0    00:00:00 /usr/bin/gmake -f packages/moocho/src/CMakeFiles/moocho.dir/build.make

You can see that the parent make, PID 8032, which is started via ctest (PID 6141), has the appropriate flag, "-j 12", but at this point in the build it’s compiling one file at a time. Another odd thing is that I think the build starts out fine, invoking multiple file compilations simultaneously, but after a couple of minutes it reverts to essentially the "make -j 1" behavior. It’s as if the GNU make jobserver is failing, but I’m not getting any error messages from GNU make to that effect.
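
One way to probe whether a sub-make was handed a jobserver at all is to look at the MAKEFLAGS value in the environment it was started with. On Linux that environment can be read from /proc/<pid>/environ as NUL-separated entries. The following is a minimal sketch under those assumptions: the helper name makeflags_from_environ is hypothetical, reading another process's environ file requires being the same user or root, and the exact jobserver option that GNU make places in MAKEFLAGS differs between make versions.

```shell
#!/bin/sh
# Hypothetical helper: print the MAKEFLAGS value found in an environ-style
# file, i.e. NUL-separated NAME=value entries such as /proc/<pid>/environ
# on Linux (an assumption about the platform).
makeflags_from_environ() {
    # Turn NUL separators into newlines, then print only the MAKEFLAGS value.
    tr '\0' '\n' < "$1" | sed -n 's/^MAKEFLAGS=//p'
}
```

For the process tree above, makeflags_from_environ /proc/8035/environ would print the flags the top-level gmake passed down to that sub-make. If no jobserver option appears there, that would suggest the sub-make is not sharing the parent's job slots.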

If anyone has any suggestions on how I can figure this out I’d appreciate it.

Apologies for the lengthy explanation. I’ve been pulling my hair out trying to figure out how to solve this issue.

Thanks in advance,
Gary



