[Insight-developers] itkTimeStamp Test Failures Mystery

Tue Feb 17 22:47:25 EST 2009

Hi Tom,

Thanks for looking at this.

1) Yep, I haven't experimented with removing the
    optimization either, to see if this is a "feature" of the
    optimized code. Note that since it shows up on multiple
    platforms, it would have to be present in each of the
    platform-specific optimization methods.

2) I see your point regarding the timeIncrement[] array.
    It is really not testing that the tsp Modified time was
    incremented in *this* thread.

3) This still leaves us with the mystery of why some of
    the counts fail.  Notice that the counters[] array
    still indicates that all the threads have been visited.
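
To make point 2 concrete, here is a minimal sketch in standard C++
(it is not the ITK test itself). It assumes a single std::atomic
counter as a stand-in for the time stamp; RunWorkers and Result are
hypothetical names introduced only for this illustration. Each worker
records the difference between its two reads, the way
timeIncrement[threadId] does:

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical result type for this sketch (not part of ITK).
struct Result
{
  std::vector<long> timeIncrement;  // per-thread (time2 - time1)
  long total;                       // final value of the shared counter
};

// Runs numThreads workers against one shared atomic counter.
// The counter is an assumed stand-in for the time stamp.
Result RunWorkers(int numThreads)
{
  assert(numThreads > 0);
  std::atomic<long> sharedTime{0};
  std::vector<long> timeIncrement(numThreads, 0);
  std::vector<std::thread> workers;

  for (int id = 0; id < numThreads; ++id)
  {
    workers.emplace_back([&sharedTime, &timeIncrement, id]() {
      const long time1 = sharedTime.load();
      sharedTime.fetch_add(1);  // plays the role of tsp->Modified()
      const long time2 = sharedTime.load();
      // At least 1 (our own increment), but other workers may have
      // incremented the counter between the two reads as well.
      timeIncrement[id] = time2 - time1;
    });
  }
  for (std::thread& w : workers)
  {
    w.join();
  }
  return Result{timeIncrement, sharedTime.load()};
}
```

In this stand-in, an entry of timeIncrement can exceed 1 whenever
another worker slips in between the two reads, while the total always
equals numThreads. So only the total says anything about missing
increments.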

I guess I'll try disabling the optimization to see what
happens in that case...
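
In the meantime, here is a sketch of one interleaving that would fit
Tom's concern about the assignment that follows the atomic increment.
This is only a hypothesis, not a confirmed diagnosis. ToyTimeStamp,
Observed and ReplayInterleaving are hypothetical names, and the model
is a simplification rather than the ITK code. The two "threads" are
replayed sequentially so that the outcome is deterministic:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical, simplified model of an optimized time stamp (not ITK).
struct ToyTimeStamp
{
  std::atomic<long> globalTime{0};  // stands in for itkTimeStampTime
  long modifiedTime = 0;            // stands in for m_ModifiedTime

  // Step 1 of a Modified() call: atomic increment, returns the new value.
  long Increment() { return ++globalTime; }
  // Step 2 of a Modified() call: plain store of that value.
  void Store(long value) { modifiedTime = value; }
};

// Hypothetical record of what a test would observe.
struct Observed
{
  long afterRound;        // modified time after the "threaded" round
  long globalAfterRound;  // global counter after the same round
  long afterNext;         // modified time after one more Modified()
};

// Replays, in a single thread, the order: A increments, B increments,
// B stores, A stores. A's older value overwrites B's newer one.
Observed ReplayInterleaving()
{
  ToyTimeStamp ts;
  const long a = ts.Increment();  // "thread A" obtains 1
  const long b = ts.Increment();  // "thread B" obtains 2
  ts.Store(b);                    // B stores 2
  ts.Store(a);                    // A stores 1 afterwards

  Observed seen;
  seen.afterRound = ts.modifiedTime;             // 1, although 2 calls ran
  seen.globalAfterRound = ts.globalTime.load();  // 2
  ts.Store(ts.Increment());                      // next Modified() call
  seen.afterNext = ts.modifiedTime;              // 3, a jump of 2
  return seen;
}
```

In this toy model the modified time comes up one short after the
round, and the next call jumps by two, which resembles the "catching
up" seen in the failing runs. Whether the real failures follow this
pattern is still to be checked.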

     Thanks

         Luis

----------------------
Tom Vercauteren wrote:
> Hi Luis,
> 
> Even though I am the unfortunate initial author of that time stamp
> test, this bug is definitely beyond my C++ expertise.
> 
> That being said, I am not sure that the new timeIncrement array is
> used in a meaningful way. The modified_function of the test is indeed
> called "simultaneously" by many threads. It should therefore only use
> thread safe calls.
> 
> If I get it right the following line is OK because threadId is unique
> for each thread:
>   helper->counters[threadId]++;
> 
> However I don't get the following lines:
>   const signed long time1 = tsp->GetMTime();
>   tsp->Modified();
>   const signed long time2 = tsp->GetMTime();
>   helper->timeIncrement[threadId] = time2 - time1;
> 
> Since other threads might have called tsp->Modified() between the
> assignments of time1 and time2, I see no reason for
> helper->timeIncrement[threadId] to be equal to one.
> 
> 
> Anyhow, it seems that the problem is either in the timestamp
> itself or in the multithreader.
> 
> When I first saw this bug I was planning to try and use an alternative
> threading framework such as boost thread or a direct use of pthreads.
> 
> The second thing I wanted to try, was a revert to non-optimized
> TimeStamps. Since I committed the test after the optimization, I am
> not even sure that non-optimized TimeStamps don't suffer from the same
> bug. I guess that even if the non-optimized version has that bug,
> it would be far more difficult to highlight because the mutex locks
> would slow down everything.
> 
> My only fear with the optimized version is that even though the
> increment is atomic (and thus thread safe) the casting assignment
> might be more problematic:
>   m_ModifiedTime = (unsigned long)InterlockedIncrement(&itkTimeStampTime);
> 
> If the problem really comes from here, maybe adding
>   ts.Modified();
> just after
>   multithreader->SingleMethodExecute();
> might make the test pass. This won't of course be a fix for the bug
> but it might help understand what's wrong.
> 
> Unfortunately, I never took the time to actually try all that because
> the test doesn't fail on my machine.
> 
> Hope this helps a little.
> 
> Cheers,
> Tom
> 
> 
> On Mon, Feb 16, 2009 at 20:29, Luis Ibanez <luis.ibanez at kitware.com> wrote:
> 
>>The TimeStamp test continues to mysteriously fail on several
>>platforms, apparently at random.
>>
>>mini3.nlm             MacOSX-gcc4.0-rel          Failed    17.00
>>gingee                Linux-c++                  Failed     6.16
>>mini3.nlm             MacOSX-gcc4.0-rel          Failed    18.00
>>zion.kitware          Linux-g++-4.1              Failed    10.34
>>BillsBasement         Linux-gcc43-release        Failed    16.24
>>crl.med.harvard.edu   Linux-x86_64-release-gcc   Failed     5.58
>>mini1.nlm             MacOSX-Xcode3.0-dbg        Failed    17.00
>>murron.hobbs-hancock  Linux-gcc-4.3.2-x86_64     Failed   149.23
>>mini2.nlm             MacOSX-Xcode3.0-dbg        Failed    17.00
>>Eternia.kitware       gcc_review_optreg_libxml2  Failed     6.46
>>mini3.nlm             MacOSX-gcc4.0-rel          Failed    18.00
>>
>>So here is the mystery:
>>
>>The structure of the test is that it creates one TimeStamp
>>instance.
>>
>>It sets up a function to be called from a large number of
>>threads (128 on some platforms). Each call to the function
>>triggers an update in the TimeStamp instance.
>>
>>Therefore the expectation of the test is that after the
>>call to SingleMethodExecute() the Modified time of the
>>TimeStamp will have increased by (at least) the number
>>of threads.
>>
>>The increase may be larger than the number of threads
>>if any other ITK class happens to have an instance of
>>the TimeStamp updated in between...
>>
>>The test assumes that if the difference between the modified
>>time before and after the call to
>>
>>    multithreader->SingleMethodExecute();
>>
>>is less than the number of threads, then the TimeStamp
>>failed to increment its counter in between the round
>>of threaded executions.
>>
>>The platforms above report such occurrences.
>>
>>However, the test also has two arrays with a number of elements
>>equal to the number of threads, and in them we verify
>>that the threaded function has indeed been called for
>>each one of the threads.
>>
>>When the test fails, we still observe that such counters
>>are incremented correctly.
>>
>>The failure cases are behaving "as if" a number of the
>>increments to the time stamp were "delayed" to be
>>executed after the call to SingleMethodExecute().
>>
>>It looks as if the threads were interrupted in the
>>middle of their execution and then allowed to continue.
>>
>>Particularly because, when the test fails by reporting
>>a deficit in the TimeStamp increment, the next iteration
>>reports an excess by the same amount, i.e. as if the
>>TimeStamp were catching up on the missing increments in
>>the next iteration of the test.
>>
>>
>>Any suggestions, hints, advice will be appreciated,
>>
>>
>>    Thanks
>>
>>
>>       Luis
>>
>>
> 
>