[Cdash] Report of failed time status
David Cole
david.cole at kitware.com
Tue Jul 5 11:07:58 UTC 2011
On Mon, Jul 4, 2011 at 7:17 AM, Julien Jomier <julien.jomier at kitware.com> wrote:
> Hi Olivier,
>
>
> 1°) Why not consider CPU times instead of wall-clock times? Is it
>> really difficult to implement (maybe more related to CTest than CDash)?
>>
>
> This is related to CTest. I'll let Dave or Zach (in CC) comment on this.
We measure wall time because we can.
We could consider measuring the CPU time when running tests with ctest, but
what we would need to implement that successfully is code that works on all
the platforms where ctest currently works that measures that for us. I'm not
aware of any such code that works on Linux, Mac, Windows and all the Unix
flavors where ctest presently works. I'm sure it could be developed using
platform-specific techniques and ifdefs, but it's not there right now.
However, as ctest is part of an open source project, with contributors from
all around the world ... we would welcome a contribution like that, if one
should become available.
Thanks,
David C.
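
To illustrate the platform-specific nature of the problem described above, here is a minimal sketch of measuring a child process's CPU time next to its wall time. It is not ctest code. It assumes a POSIX system, because Python's `resource` module is not available on Windows, and the helper name `run_and_time` is made up for this example.

```python
import resource    # Unix-only: this import fails on Windows
import subprocess
import sys
import time


def run_and_time(cmd):
    """Run cmd and return (exit_code, wall_seconds, cpu_seconds).

    CPU time is the user + system time accumulated by waited-for children
    while cmd ran, so any other child reaped in the same window is counted too.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    completed = subprocess.run(cmd)  # blocks until the child has exited
    wall = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return completed.returncode, wall, cpu


if __name__ == "__main__":
    # A child that mostly sleeps: wall time is much larger than CPU time.
    code, wall, cpu = run_and_time(
        [sys.executable, "-c", "import time; time.sleep(0.2)"]
    )
    print(f"exit={code} wall={wall:.3f}s cpu={cpu:.3f}s")
```

A Windows equivalent would need a different API entirely, which is the portability gap mentioned above.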
>
> 2°) Suppose we have 'Test time # max failures before flag =
>> 3'. On the third day with higher test times, are the first two runs
>> with test time failures included in the average? I hope not, because
>> with a coefficient of 0.3 in the average, times from more than three days
>> ago are almost negligible. I already noticed that test failures not due to
>> time are not taken into account in the average.
>>
>
> No, previous runs with a failed time status are not considered in the average.
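
A rough sketch of the behaviour described in this answer, under stated assumptions: the thread only mentions "a coefficient of 0.3 in the average", so the exponentially weighted form below, and the choice to apply 0.3 to the newest measurement, are assumptions rather than CDash's confirmed formula. The function and parameter names are hypothetical.

```python
def update_mean(previous_mean, current_time, time_status_ok, alpha=0.3):
    """Return the new running mean of a test's execution time.

    Assumption: an exponentially weighted mean where alpha weights the
    newest measurement. Runs whose time status failed are skipped, so
    they leave the mean unchanged.
    """
    if not time_status_ok:
        return previous_mean
    return (1.0 - alpha) * previous_mean + alpha * current_time


# A slow run that passed the time check pulls the mean up ...
print(update_mean(30.0, 40.0, time_status_ok=True))
# ... while a run flagged as a time failure does not move it.
print(update_mean(30.0, 40.0, time_status_ok=False))
```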
>
>
> 3°) From documentation on vtk website:
>>
>> A test is defined as failing if it verifies the following:
>>
>> if previousSD < thresholdSD then previousSD = thresholdSD
>> if currentTime > previousMean + multiplier * previousSD
>>
>>
>> In my case, with following parameters:
>> Test time SD (coefficient): 4.0
>> Test time SD threshold : 1
>> Test time # max failure before flag : 1
>>
>> And for a given test (reported on CDash - testcase report - execution
>> time(s) line):
>> - mean:31.29
>> - std:2.72
>> - Execution time : 36.26
>>
>> => previous SD > threshold => threshold not taken into account
>> => previous mean + multiplier * previous SD = 31.29 + 4 * 2.72 = 42.17
>> > 36.26
>>
>> ==> Test time should be OK but is reported as failed and flagged on main
>> CDash page !??
>> Maybe reported mean and std are not the previous ones but current
>> ones. If I go on the previous report, it's reported mean 29.16 -
>> std:0.0. Maybe this previous 0.0 is used. But clearly on test times
>> graph, there is a standard deviation (oscillates between 29 and 41
>> during last month).
>>
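
The rule quoted above can be written as a few lines of Python and applied to both sets of numbers in the question. This is a sketch of the documented rule only, not CDash's actual code, and the function name is made up. It does not establish which mean and SD CDash really used for the flagged run.

```python
def time_status_failed(current_time, previous_mean, previous_sd,
                       multiplier, threshold_sd):
    """Apply the documented rule: clamp the SD to the threshold, then compare."""
    effective_sd = max(previous_sd, threshold_sd)
    return current_time > previous_mean + multiplier * effective_sd


# Values shown on the failing report: limit = 31.29 + 4 * 2.72 = 42.17
print(time_status_failed(36.26, 31.29, 2.72, multiplier=4.0, threshold_sd=1.0))

# Values shown on the previous report (mean 29.16, std 0.0): the SD is
# clamped to the threshold of 1, so limit = 29.16 + 4 * 1 = 33.16
print(time_status_failed(36.26, 29.16, 0.0, multiplier=4.0, threshold_sd=1.0))
```

With the first set of values the run passes; with the second it fails, which is consistent with the guess that the previous report's mean and std were the ones used.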
>
> The average and SD are not recomputed when you change the threshold. As
> question 2) shows, if earlier runs did not meet the requirement under the
> previous threshold, the average and SD were never computed from those
> historical values. This is probably why you got std:0.0. You should wait a
> couple of days and see if that helps.
>
>
> 4°) HTML link in the project configuration, testing tab: in the description
>> of 'test standard deviation' and 'test standard deviation threshold', the
>> link to the test timing description on the wiki is wrong; maybe because
>> I'm using CDash 1.6.2
>>
>
> I added this in the bug tracker. Thanks for the report.
>
> Julien
>