MantisBT - CMake
View Issue Details
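ID: 0008701
Project: CMake
Category: CTest
View Status: public
Date Submitted: 2009-03-08 00:06
Last Update: 2016-06-10 14:30
Reporter: Roscoe A. Bartlett
Assigned To: Brad King
Priority: normal
Severity: major
Reproducibility: have not tried
Status: closed
Resolution: moved
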
0008701: CTest Dev: Coverage testing not scaling with number of subprojects

The coverage testing for Trilinos is scaling very poorly as the number
of subprojects increases.

I ran the coverage tests on 3/7/2009 to let it run all day and I had
to kill it before it ran into the next day. Consider the results
shown at:

   http://trilinos-dev.sandia.gov/cdash/index.php?project=Trilinos&date=2009-03-08&display=project#Coverage

If you sort by 'Date' in ascending order (i.e. one click on 'Date')
you get:

(Columns: site, build name, coverage percentage, lines tested, lines untested, date, label)

godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 83.32% 11211 2244 2009-03-07T04:59:51 MST Teuchos
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 90.79% 2426 246 2009-03-07T05:01:45 MST RTOp
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 83.49% 905 179 2009-03-07T05:03:12 MST Kokkos
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 73.24% 16054 5866 2009-03-07T05:04:04 MST Epetra
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 13.13% 4132 27340 2009-03-07T05:05:26 MST Zoltan
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 87.2% 1247 183 2009-03-07T05:08:56 MST Shards
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 95.73% 763 34 2009-03-07T05:12:31 MST GlobiPack
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 52.44% 688 624 2009-03-07T05:16:33 MST Triutils
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 66.06% 2484 1276 2009-03-07T05:19:55 MST Tpetra
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 68.97% 2465 1109 2009-03-07T05:25:59 MST EpetraExt
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 81.83% 590 131 2009-03-07T05:30:23 MST Stokhos
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 87.21% 2086 306 2009-03-07T05:34:45 MST Sacado
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 73.53% 7213 2597 2009-03-07T05:39:55 MST Thyra
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 95.23% 798 40 2009-03-07T05:49:51 MST OptiPack
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 30.33% 810 1861 2009-03-07T05:59:14 MST Isorropia
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 0% 0 0 2009-03-07T06:07:28 MST Pliris
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 0% 0 0 2009-03-07T06:22:04 MST Claps
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 39.8% 5194 7856 2009-03-07T06:38:20 MST AztecOO
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 31.08% 459 1018 2009-03-07T06:56:38 MST Galeri
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 41.01% 1778 2557 2009-03-07T07:16:34 MST Amesos
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 59.17% 4755 3281 2009-03-07T07:38:16 MST Ifpack
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 21.77% 199 715 2009-03-07T08:02:32 MST Komplex
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 30.11% 12725 29539 2009-03-07T08:27:34 MST ML
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 72% 5570 2166 2009-03-07T08:55:46 MST Belos
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 87.1% 2451 363 2009-03-07T09:27:03 MST Stratimikos
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 89.74% 105 12 2009-03-07T10:00:40 MST Meros
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 55.51% 16107 12908 2009-03-07T10:33:27 MST FEI
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 0% 0 0 2009-03-07T11:09:52 MST RBGen
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 79.3% 12959 3383 2009-03-07T12:11:06 MST Anasazi
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 68.51% 570 262 2009-03-07T13:29:29 MST ThreadPool
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 87.28% 4418 644 2009-03-07T14:41:42 MST Phalanx
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 62.76% 18011 10687 2009-03-07T15:59:59 MST NOX
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 0% 0 0 2009-03-07T17:30:59 MST Moertel
godel.sandia.gov Linux-Nightly-SERIAL_DEBUG_COV 50.18% 829 823 2009-03-07T20:00:00 MST TrilinosCouplings


Looking at this, you can see that the gap between the start times of
successive subproject (i.e. Trilinos package) builds grows
dramatically, and disproportionately, as more subprojects are
processed. The first package, Teuchos, is an intermediate-size package
with 11211 tested lines of code. Its total build/test time ran from
04:59:51 to 05:01:45, or about 2 minutes. Also consider Epetra, a
slightly larger package with 16054 tested lines: it is fourth in the
order and ran from 05:04:04 to 05:05:26, or about 1.5 minutes.

Now consider the much smaller package Moertel which is 33rd in the
order with 0 tested lines because it did not even build and its one
(current) test did not even run. Yet it took from 17:30:59 to
20:00:00 to run to completion. This is about 2.5 hours, and there
could not even have been a single coverage file for the Moertel package.
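
The gaps discussed above can be computed directly from the dashboard
timestamps. The following is a minimal Python sketch, assuming (as the
analysis above does) that the time between one package's timestamp and
the next approximates the time spent on that package. The helper name
`gaps_to_next` is hypothetical; the data are the label and time-of-day
columns copied from the table above.

```python
from datetime import datetime

# (label, time of day on 2009-03-07 MST), copied from the dashboard table.
TIMESTAMPS = """
Teuchos 04:59:51
RTOp 05:01:45
Kokkos 05:03:12
Epetra 05:04:04
Zoltan 05:05:26
Shards 05:08:56
GlobiPack 05:12:31
Triutils 05:16:33
Tpetra 05:19:55
EpetraExt 05:25:59
Stokhos 05:30:23
Sacado 05:34:45
Thyra 05:39:55
OptiPack 05:49:51
Isorropia 05:59:14
Pliris 06:07:28
Claps 06:22:04
AztecOO 06:38:20
Galeri 06:56:38
Amesos 07:16:34
Ifpack 07:38:16
Komplex 08:02:32
ML 08:27:34
Belos 08:55:46
Stratimikos 09:27:03
Meros 10:00:40
FEI 10:33:27
RBGen 11:09:52
Anasazi 12:11:06
ThreadPool 13:29:29
Phalanx 14:41:42
NOX 15:59:59
Moertel 17:30:59
TrilinosCouplings 20:00:00
"""


def gaps_to_next(text):
    """Return {label: seconds until the next package's timestamp}.

    The last package has no following timestamp, so it is omitted.
    """
    rows = [line.split() for line in text.strip().splitlines()]
    times = [(label, datetime.strptime(t, "%H:%M:%S")) for label, t in rows]
    return {
        label: int((nxt - cur).total_seconds())
        for (label, cur), (_, nxt) in zip(times, times[1:])
    }


if __name__ == "__main__":
    for label, seconds in gaps_to_next(TIMESTAMPS).items():
        print(f"{label:12s} {seconds / 60:7.1f} min")
```

Printing the gaps in table order makes the growth easy to see: the
early packages take a minute or two each, while the late ones take an
hour or more.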

I looked at the verbose output from ctest with the -VV option and I
see that it seems to be processing all of the coverage files in all of
the preceding packages.

For example, I see output like:

 
Running coverage for package 'Moertel' ...


 
-----------------------------------------------

SetCTestConfigurationFromCMakeVariable:CoverageCommand:CTEST_COVERAGE_COMMAND
SetCTestConfiguration:CoverageCommand:/usr/bin/gcov
 Add coverage exclude regular expressions.
SetCTestConfiguration:BuildDirectory:/home/rabartl/PROJECTS/dashboards/Trilinos.base/SERIAL_DEBUG_COV/BUILD
SetCTestConfiguration:SourceDirectory:/home/rabartl/PROJECTS/dashboards/Trilinos.base/SERIAL_DEBUG_COV/Trilinos
 label file list [/home/rabartl/PROJECTS/dashboards/Trilinos.base/SERIAL_DEBUG_COV/BUILD/CMakeFiles/LabelFiles.txt]
 loading labels from [/home/rabartl/PROJECTS/dashboards/Trilinos.base/SERIAL_DEBUG_COV/BUILD/packages/moertel/example/CMakeFiles/Moertel_TwoSquares.dir/Labels.txt]
 loading labels from [/home/rabartl/PROJECTS/dashboards/Trilinos.base/SERIAL_DEBUG_COV/BUILD/packages/amesos/src/CMakeFiles/amesos.dir/Labels.txt]
 loading labels from [/home/rabartl/PROJECTS/dashboards/Trilinos.base/SERIAL_DEBUG_COV/BUILD/packages/aztecoo/src/CMakeFiles/aztecoo.dir/Labels.txt]
 loading labels from [/home/rabartl/PROJECTS/dashboards/Trilinos.base/SERIAL_DEBUG_COV/BUILD/packages/epetra/src/CMakeFiles/epetra.dir/Labels.txt]
 loading labels from [/home/rabartl/PROJECTS/dashboards/Trilinos.base/SERIAL_DEBUG_COV/BUILD/packages/epetra/src/CMakeFiles/epetra_fortran.dir/Labels.txt]
 loading labels from [/home/rabartl/PROJECTS/dashboards/Trilinos.base/SERIAL_DEBUG_COV/BUILD/packages/epetraext/src/CMakeFiles/epetraext.dir/Labels.txt]
 loading labels from [/home/rabartl/PROJECTS/dashboards/Trilinos.base/SERIAL_DEBUG_COV/BUILD/packages/ml/src/CMakeFiles/ml.dir/Labels.txt]
 loading labels from [/home/rabartl/PROJECTS/dashboards/Trilinos.base/SERIAL_DEBUG_COV/BUILD/packages/moertel/src/CMakeFiles/moertel.dir/Labels.txt]
 loading labels from [/home/rabartl/PROJECTS/dashboards/Trilinos.base/SERIAL_DEBUG_COV/BUILD/packages/teuchos/src/CMakeFiles/teuchos.dir/Labels.txt]
 loading labels from [/home/rabartl/PROJECTS/dashboards/Trilinos.base/SERIAL_DEBUG_COV/BUILD/packages/triutils/src/CMakeFiles/triutils.dir/Labels.txt]
Performing coverage
 COVFILE environment variable not found, not running bullseye
   Processing coverage (each . represents one file):
    ."/usr/bin/gcov" -l -o "/home/rabartl/PROJECTS/dashboards/Trilinos.base/SERIAL_DEBUG_COV/BUILD/packages/optipack/test/CMakeFiles/OptiPack_VersionUnitTests.dir/__/__/teuchos/test/UnitTes\
t" "/home/rabartl/PROJECTS/dashboards/Trilinos.base/SERIAL_DEBUG_COV/BUILD/packages/optipack/test/CMakeFiles/OptiPack_VersionUnitTests.dir/__/__/teuchos/test/UnitTest/Teuchos_StandardUnitTe\
stMain.cpp.gcda"
File '/home/rabartl/PROJECTS/dashboards/Trilinos.base/SERIAL_DEBUG_COV/Trilinos/packages/teuchos/test/UnitTest/Teuchos_StandardUnitTestMain.cpp'
Lines executed:100.00% of 5
/home/rabartl/PROJECTS/dashboards/Trilinos.base/SERIAL_DEBUG_COV/Trilinos/packages/teuchos/test/UnitTest/Teuchos_StandardUnitTestMain.cpp:creating 'Teuchos_StandardUnitTestMain.cpp.gcda##Te\
uchos_StandardUnitTestMain.cpp.gcov'

...
 
-----------------------------------------------



Why are coverage files for Teuchos and all other Trilinos packages
being processed in later packages like this?

To effectively debug issues like this, it is essential that all of the
CTest functions ctest_xxx(...) (where xxx = start, configure, build,
test, coverage, memcheck, submit, etc.) print and return their
duration when ctest is run with -VV (or even -V). I could put my own
timers into our CTest script, but it makes much more sense to add
this to CTest itself since other projects will need this as well.
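
Until CTest reports durations itself, a driver can time each step
externally. The following is a minimal Python sketch of such a timer,
not CTest functionality: `step_timer` and the step names are
hypothetical, and the `time.sleep` calls are placeholders for whatever
actually launches each step.

```python
import time
from contextlib import contextmanager

durations = {}  # step name -> elapsed wall-clock seconds


@contextmanager
def step_timer(name):
    """Record and print the wall-clock time spent inside the with-block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the step raises, so failed steps are timed too.
        durations[name] = time.perf_counter() - start
        print(f"{name}: {durations[name]:.2f} s")


if __name__ == "__main__":
    with step_timer("build"):
        time.sleep(0.01)  # placeholder for the real build step
    with step_timer("coverage"):
        time.sleep(0.01)  # placeholder for the real coverage step
```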

Note that the coverage results for only the Trilinos package in
question seem to be displayed on the Dashboard. This is great but the
current setup is not scaling on the CTest server side.

What can be done to fix this?

Should our Trilinos driver script just delete all of the coverage
files in:

    ./Testing/CoverageInfo/*

before each package runs its set of coverage tests?
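
As a rough sketch of that cleanup step, the Python helper below deletes
files matching a glob pattern under a directory tree. Which files
actually need to go is an assumption: this report names
./Testing/CoverageInfo/*, while the verbose log above shows gcov being
run on .gcda files in the build tree, so the pattern is left as a
parameter. `remove_matching` is a hypothetical name, not part of CTest,
and the demo file names are made up.

```python
import tempfile
from pathlib import Path


def remove_matching(root, pattern="*.gcda"):
    """Delete every regular file under root whose name matches pattern.

    Returns the number of files removed.
    """
    removed = 0
    # Materialize the list first so deletion does not disturb the walk.
    for path in list(Path(root).rglob(pattern)):
        if path.is_file():
            path.unlink()
            removed += 1
    return removed


if __name__ == "__main__":
    # Demo on a throwaway directory tree.
    with tempfile.TemporaryDirectory() as tmp:
        sub = Path(tmp) / "packages" / "demo"
        sub.mkdir(parents=True)
        (sub / "a.cpp.gcda").touch()
        (sub / "a.cpp.gcno").touch()
        print("removed", remove_matching(tmp), "file(s)")
```

A driver would call this on the build directory before each package's
coverage run, so that only data produced by the current package's tests
remains.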

Or, can we somehow make it so that CTest and/or gcov does not even
touch coverage files that are not in the current Trilinos package
(i.e. subproject or label)?

We need to get this resolved or we just can't support coverage testing
for Trilinos. It took 16 hours to do just 33 Trilinos packages, and we
will soon need to run coverage tests on both Serial and MPI builds of
almost 50 Trilinos packages. That means getting nearly 100 package
coverage tests done in less than 24 hours.

Note that you can run all of the Trilinos coverage tests with verbose
outputting by using the CTest driver script:

   Trilinos/cmake/ctest/experimental_build_test.cmake

You can run this from *any* build directory by first creating a
CMakeCache.txt file without any Trilinos packages enabled
(i.e. no Trilinos_ENABLE_XXX flags set) and then running the coverage
tests as:

  $ env CTEST_DO_COVERAGE_TESTING=ON CTEST_COVERAGE_COMMAND=/usr/bin/gcov \
      ctest -S $TRILINOS_HOME/cmake/ctest/experimental_build_test.cmake -VV \
      &> ctest.out

That will submit to our Trilinos dashboard and you can debug things on
your side.

In summary, what I am asking for is to:

1) Add duration time output to the ctest_xxx(...) commands. That way we
can see for sure where the time is being spent.

2) Have someone from Kitware run the entire Trilinos coverage test
suite using the experimental_build_test.cmake script mentioned above on
their own computer. This way, you can see how it behaves for
yourselves.

3) Find a way to address the problem (either in CTest proper or with
some hack in our CTest driver script).

Thanks,

- Ross
No tags attached.
Issue History
2009-03-08 00:06  Roscoe A. Bartlett  New Issue
2009-03-08 05:19  Roscoe A. Bartlett  Note Added: 0015597
2009-03-09 11:07  Brad King  Status  new => assigned
2009-03-09 11:07  Brad King  Assigned To  => Brad King
2009-03-09 11:10  Brad King  Note Added: 0015604
2009-03-09 12:20  Brad King  Note Added: 0015606
2009-03-11 13:39  Brad King  Note Added: 0015652
2012-08-13 10:44  Brad King  Status  assigned => backlog
2012-08-13 10:44  Brad King  Note Added: 0030550
2016-06-10 14:27  Kitware Robot  Note Added: 0041512
2016-06-10 14:27  Kitware Robot  Status  backlog => resolved
2016-06-10 14:27  Kitware Robot  Resolution  open => moved
2016-06-10 14:30  Kitware Robot  Status  resolved => closed

Notes
(0015597)
Roscoe A. Bartlett   
2009-03-08 05:19   
Just to add a note, I ran just the coverage build and testing for the
Trilinos package MOOCHO (which builds very late and depends on lots of
Trilinos packages) and its full build and coverage testing only took
about 15 minutes. The experimental command I ran was:

   env CTEST_BUILD_FLAGS="-j4" Trilinos_PACKAGES="MOOCHO" CTEST_BUILD_NAME=cov-test-2 CTEST_DO_COVERAGE_TESTING=TRUE CTEST_COVERAGE_COMMAND=gcov BUILD_TYPE=DEBUG CTEST_DO_UPDATES=ON time ctest -S ../../../../Trilinos/cmake/ctest/experimental_build_test.cmake -VV 2>&1 | tee ctest.out

The experimental output line on the dashboard was:

  godel.sandia.gov cov-test-2 0 0 0 0 0.1 0 436 8.7 0 0 15 0.2 2009-03-07T22:22:23 MST MOOCHO

Note that just running the tests took 8.7 minutes. That means that
the coverage analysis only took about seven minutes.

This is just showing that building and running the coverage tests for
a package very late in the build order can produce a fast coverage
test.

This suggests that deleting the coverage output in between Trilinos
package builds might fix the performance problem. However, I can't
try this until the memory testing that is currently running
finishes sometime later today.
(0015604)
Brad King   
2009-03-09 11:10   
It looks like CTest is still computing and parsing coverage information globally every time. The only thing the LABELS filter feature I added does is prevent it from *reporting* everything. I'm working to improve the code which runs gcov to only consider coverage files produced for object files built in targets with the proper label.
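
The filtering idea described here can be illustrated with a short Python sketch. This is not CTest's implementation: the `select_coverage_files` helper, the in-memory `target_labels` mapping (standing in for whatever the per-target Labels.txt files contain), the label names, and the example file names are all assumptions made for illustration.

```python
from pathlib import PurePosixPath


def select_coverage_files(gcda_files, target_labels, wanted_label):
    """Keep only the coverage files under a target directory that
    carries wanted_label.

    gcda_files    -- iterable of coverage data file paths
    target_labels -- {target directory: set of labels}
    """
    wanted_dirs = [
        PurePosixPath(d)
        for d, labels in target_labels.items()
        if wanted_label in labels
    ]
    selected = []
    for f in gcda_files:
        # A file qualifies if any wanted target directory is its ancestor.
        if any(d in PurePosixPath(f).parents for d in wanted_dirs):
            selected.append(f)
    return selected


if __name__ == "__main__":
    labels = {
        "BUILD/packages/teuchos/src/CMakeFiles/teuchos.dir": {"Teuchos"},
        "BUILD/packages/moertel/src/CMakeFiles/moertel.dir": {"Moertel"},
    }
    files = [
        "BUILD/packages/teuchos/src/CMakeFiles/teuchos.dir/a.cpp.gcda",
        "BUILD/packages/moertel/src/CMakeFiles/moertel.dir/b.cpp.gcda",
    ]
    print(select_coverage_files(files, labels, "Moertel"))
```

With a filter like this applied before gcov is invoked, the work per subproject depends on that subproject's own targets rather than on everything built so far.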
(0015606)
Brad King   
2009-03-09 12:20   
Okay, now only coverage information from properly labeled targets is loaded.

ENH: Generate a central list of target directories
/cvsroot/CMake/CMake/Source/CTest/cmCTestCoverageHandler.cxx,v <-- Source/CTest/cmCTestCoverageHandler.cxx
new revision: 1.68; previous revision: 1.67
/cvsroot/CMake/CMake/Source/CTest/cmCTestCoverageHandler.h,v <-- Source/CTest/cmCTestCoverageHandler.h
new revision: 1.20; previous revision: 1.19
/cvsroot/CMake/CMake/Source/cmGlobalGenerator.cxx,v <-- Source/cmGlobalGenerator.cxx
new revision: 1.253; previous revision: 1.252
/cvsroot/CMake/CMake/Source/cmGlobalGenerator.h,v <-- Source/cmGlobalGenerator.h
new revision: 1.121; previous revision: 1.120

ENH: Efficiently filter CTest coverage by label
/cvsroot/CMake/CMake/Source/CTest/cmCTestCoverageHandler.cxx,v <-- Source/CTest/cmCTestCoverageHandler.cxx
new revision: 1.69; previous revision: 1.68
/cvsroot/CMake/CMake/Source/CTest/cmCTestCoverageHandler.h,v <-- Source/CTest/cmCTestCoverageHandler.h
new revision: 1.21; previous revision: 1.20
(0015652)
Brad King   
2009-03-11 13:39   
I've just committed two more changes which significantly reduce the time spent uploading coverage results, especially when using label filters.

BUG: Do not carry over file list between coverage
/cvsroot/CMake/CMake/Source/CTest/cmCTestCoverageHandler.cxx,v <-- Source/CTest/cmCTestCoverageHandler.cxx
new revision: 1.71; previous revision: 1.70
/cvsroot/CMake/CMake/Source/cmCTest.h,v <-- Source/cmCTest.h
new revision: 1.116; previous revision: 1.115

BUG: Do not produce empty coverage log files
/cvsroot/CMake/CMake/Source/CTest/cmCTestCoverageHandler.cxx,v <-- Source/CTest/cmCTestCoverageHandler.cxx
new revision: 1.72; previous revision: 1.71
(0030550)
Brad King   
2012-08-13 10:44   
Sending issues I'm not actively working on to the backlog to await someone with time for them.

If an issue you care about is sent to the backlog when you feel it should have been addressed in a different manner, please bring it up on the CMake mailing list for discussion. Sign up for the mailing list here, if you're not already on it:

 http://www.cmake.org/mailman/listinfo/cmake

It's easy to re-activate a bug here if you can find a CMake developer or contributor who has the bandwidth to take it on.
(0041512)
Kitware Robot   
2016-06-10 14:27   
Resolving issue as `moved`.

This issue tracker is no longer used. Further discussion of this issue may take place in the current CMake Issues page linked in the banner at the top of this page.