[CDash] Problems with 2.4.0 upgrade
glhenni at sandia.gov
Fri Oct 19 16:12:42 UTC 2018
We've been using cdash 2.2.2 for quite a while but the server running it
is going to be decommissioned so I thought I'd take the opportunity to
upgrade to 2.4.0 (although it says 2.5.0 on the bottom of the
dashboard). So I cloned the repository, switched to the
/v2.4.0-prebuilt/ tag, and pointed the webserver right at the clone. I've
got two problems I've been fighting since the install...
Problem one: last night all our regression tests completed successfully
and all submitted to cdash, but one of the expected builds is absent from
the database. The test execution platform reported the submission as
successfully completed. I can see, in the job's log, the successful
submission of the XML files to my cdash site, and I can see the submitted
files associated with the job in the /CDash/backup/ directory. But I see
nothing for that job in the database and therefore, of course, the
dashboard is showing an empty line for that particular test with
/Expected build/ in the *Start Time* column. It appears the job is being
successfully submitted but not inserted into the database.
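One way to start narrowing this down is to look at what the backed-up XML files actually declare. The following is only a minimal sketch, assuming the backup files are CTest submission files whose root <Site> element carries Name, BuildName and BuildStamp attributes. The function name summarize_backup and the directory path are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def summarize_backup(backup_dir):
    """Return (file name, site, build name, build stamp) for each XML file.

    Assumes each file is a CTest submission whose root element is <Site>
    with Name, BuildName and BuildStamp attributes. Files with a different
    root element yield None for the missing attributes, and files that
    fail to parse are reported with None fields instead of raising.
    """
    rows = []
    for path in sorted(Path(backup_dir).glob("*.xml")):
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            rows.append((path.name, None, None, None))
            continue
        rows.append((
            path.name,
            root.get("Name"),
            root.get("BuildName"),
            root.get("BuildStamp"),
        ))
    return rows


if __name__ == "__main__":
    # Hypothetical location; substitute the real backup directory.
    for row in summarize_backup("/path/to/CDash/backup"):
        print(row)
```

Comparing the site name, build name and build stamp of the missing job against those of the builds that did reach the database may show whether the missing one differs in some way, or whether its files are truncated or malformed.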
Problem two: false failure reports in email. The first time this happened I
had inadvertently enabled test timing, and since it was starting from
scratch the times were varying. Just looking at the dashboard there was
no indication of this until I turned on the /Advanced View/. Then I
could see a red column for timings and I figured out what was going on
and disabled test timing. However, that didn't completely eliminate the
problem. Again, last night all our regression tests passed and aside
from the aforementioned missing build in the database, all the columns
are green. But I got the following email:
A submission to CDash for the project Charon has failing tests.
You have been identified as one of the authors who have checked in changes that are part of this submission or you are listed in the default contact list.
Details on the submission can be found at http://cdashmachine.my.url/cdash/buildSummary.php?buildid=252
Site: CEE Build Farm
Build Name: Linux-x86_64-RHEL_6.x-SEMS_OpenMPI_1.10_Intel_17.x-DBG
Build Time: 2018-10-19T01:00:16 UTC
Tests failing: 3
-CDash on cdashmachine.my.url
I follow the link and I see:
Stage Errors Warnings
<http://verne.srn.sandia.gov/cdash/viewUpdate.php?buildid=252> * *0
<http://verne.srn.sandia.gov/cdash/viewConfigure.php?buildid=252> * *0
<http://verne.srn.sandia.gov/cdash/viewBuildError.php?buildid=252> * *0
So those 3 failing tests are showing up here. Now if I look at the
bottom of the page, and I select "View Tests Summary" I get a clue:
*107 passed, 0 failed, 3 timed out, 0 not run.*
I don't know where that /"3 timed out"/ is coming from and they don't
show up in the table associated with this test summary below that. This
test suite does have 110 tests, but 3 of them are excluded from this
particular platform/compiler combination via a "-LE debugexclude" and
tagging those three tests with a "debugexclude". I can't for the life of
me figure out why it thinks three tests have "timed out".
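To check whether the timeouts are present in what CTest submitted, the outcomes in the backed-up Test.xml for that build could be tallied directly. This is a rough sketch, assuming each executed test appears as a <Test Status="..."> element and that a timeout is recorded as a NamedMeasurement named "Exit Code" with the value "Timeout". The function name tally_tests is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tally_tests(test_xml_text):
    """Count test outcomes in the text of a CTest Test.xml file.

    Assumes each executed test is a <Test Status="..."> element, and that
    a timeout is recorded as a failed test with a NamedMeasurement named
    "Exit Code" whose <Value> is "Timeout".
    """
    root = ET.fromstring(test_xml_text)
    counts = Counter()
    for test in root.iter("Test"):
        status = test.get("Status")
        if status is None:
            # <TestList> entries also use the <Test> tag but carry no Status.
            continue
        exit_code = None
        for measurement in test.iter("NamedMeasurement"):
            if measurement.get("name") == "Exit Code":
                exit_code = (measurement.findtext("Value") or "").strip()
        if status == "failed" and exit_code == "Timeout":
            counts["timed out"] += 1
        else:
            counts[status] += 1
    return counts
```

If the file on disk shows 107 passed tests and no timeouts, that would suggest the "3 timed out" figure originates on the server side rather than in the submission itself.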
One thing that could be an issue, maybe related to both my problems, is
uniqueness. Does "BuildName" have to be unique? Typically, in our case,
only the combination of "BuildName" and "Name" is unique. Also, we use
Jenkins for some of our testing, and the platform/site that those tests
run on isn't stable. Each night the tests may run on a different machine
in the Jenkins build farm. In those tests we set the value of "Name" by
using SET(CTEST_SITE <name>). Maybe I'm just grasping at straws here.
Lastly, almost nothing has changed in how we're doing things. I simply
changed our submissions from the old 2.2.2 CDash, which was working
fine, to my new install of 2.4.0 on a new platform.
Hopefully someone can offer some suggestions on my likely overly
detailed description, or at least how to start to debug it.