[Insight-developers] itkTimeStamp Test Failures Mystery

Mon Feb 16 14:29:14 EST 2009

The TimeStamp test continues to fail mysteriously on several
platforms, apparently at random.

mini3.nlm             MacOSX-gcc4.0-rel          Failed     17.00
gingee                Linux-c++                  Failed      6.16
mini3.nlm             MacOSX-gcc4.0-rel          Failed     18.00
zion.kitware          Linux-g++-4.1              Failed     10.34
BillsBasement         Linux-gcc43-release        Failed     16.24
crl.med.harvard.edu   Linux-x86_64-release-gcc   Failed      5.58
mini1.nlm             MacOSX-Xcode3.0-dbg        Failed     17.00
murron.hobbs-hancock  Linux-gcc-4.3.2-x86_64     Failed    149.23
mini2.nlm             MacOSX-Xcode3.0-dbg        Failed     17.00
Eternia.kitware       gcc_review_optreg_libxml2  Failed      6.46
mini3.nlm             MacOSX-gcc4.0-rel          Failed     18.00

So here is the mystery:

The structure of the test is that it creates one TimeStamp
instance.

It sets up a function to be called from a large number of
threads (128 on some platforms). Each call to the function
triggers an update in the TimeStamp instance.

Therefore the expectation of the test is that after the
call to SingleMethodExecute() the Modified time of the
TimeStamp will have increased by (at least) the number
of threads.

The increase may be larger than the number of threads
if any other ITK class happens to have an instance of
the TimeStamp updated in between...

The test assumes that if the difference between the modified
time before and after the call to

      multithreader->SingleMethodExecute();

is less than the number of threads, then the TimeStamp
failed to increment its counter at some point during the
round of threaded executions.
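
For reference, the check described above can be sketched roughly as
follows. This is only a minimal illustration, not the actual test
code: it uses std::thread and std::atomic instead of the ITK
MultiThreader and TimeStamp, and the names g_modifiedTime, Modified()
and RunOneRound() are hypothetical stand-ins.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-in for the global modified-time counter (not ITK code).
static std::atomic<unsigned long> g_modifiedTime{0};

// Hypothetical stand-in for a TimeStamp update: take the next global value.
unsigned long Modified()
{
  return ++g_modifiedTime;
}

// Launch numberOfThreads threads, each calling Modified() once, and
// return how far the global counter advanced over the whole round.
unsigned long RunOneRound(unsigned int numberOfThreads)
{
  const unsigned long before = g_modifiedTime.load();

  std::vector<std::thread> threads;
  for (unsigned int i = 0; i < numberOfThreads; ++i)
  {
    threads.emplace_back([] { Modified(); });
  }
  for (auto & t : threads)
  {
    t.join(); // every increment has completed once all threads are joined
  }

  return g_modifiedTime.load() - before;
}
```

With this sketch, the condition the test checks corresponds to
RunOneRound(128) >= 128. A smaller value would mean that some
increments were not visible by the time the round returned.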

The platforms above report such occurrences.

However, the test also has two arrays with a number of elements
equal to the number of threads, and in them we verify
that the threaded function has indeed been called for
each one of the threads.

When the test fails, we still observe that such counters
are incremented correctly.
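
A rough sketch of this kind of cross-check, under stated assumptions:
it keeps a single per-thread array (the real test uses two, whose
exact contents are not described here), it uses std::thread rather
than the ITK MultiThreader, and the names RoundResult and
RunRoundWithPerThreadCounters() are made up for the illustration.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical result record for one round of threaded calls.
struct RoundResult
{
  unsigned long counterDelta; // advance of the shared counter over the round
  bool everyThreadRan;        // true if each per-thread slot was hit exactly once
};

RoundResult RunRoundWithPerThreadCounters(unsigned int numberOfThreads)
{
  // Shared counter standing in for the global modified time (assumption).
  static std::atomic<unsigned long> sharedCounter{0};

  // One slot per thread; each thread writes only to its own slot.
  std::vector<int> calls(numberOfThreads, 0);

  const unsigned long before = sharedCounter.load();

  std::vector<std::thread> threads;
  for (unsigned int i = 0; i < numberOfThreads; ++i)
  {
    threads.emplace_back([&calls, i] {
      ++calls[i];      // record that thread i executed the function
      ++sharedCounter; // the shared update the test is really about
    });
  }
  for (auto & t : threads)
  {
    t.join();
  }

  bool everyThreadRan = true;
  for (int c : calls)
  {
    everyThreadRan = everyThreadRan && (c == 1);
  }

  return RoundResult{ sharedCounter.load() - before, everyThreadRan };
}
```

The puzzling failures correspond to a round where everyThreadRan is
true but counterDelta is smaller than the number of threads.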

The failure cases behave "as if" a number of the
increments to the time stamp were "delayed" and
executed after the call to SingleMethodExecute().

It looks as if the threads were interrupted in the
middle of their execution and then allowed to continue.

This is particularly suggestive because, when the test fails by
reporting a deficit in the TimeStamp increment, the next iteration
reports an excess of the same amount, i.e. as if the
TimeStamp were catching up on the missing increments in
the next iteration of the test.

Any suggestions, hints, advice will be appreciated,

      Thanks

         Luis