[Insight-developers] itkTimeStamp Test Failures Mistery
Luis Ibanez
luis.ibanez at kitware.com
Tue Feb 17 22:47:25 EST 2009
Hi Tom,
Thanks for looking at this.
1) Yeap, I haven't experimented either with removing the
optimization, to see if this is a "feature" of the
optimized code. Note that since it shows up in multiple
platforms, it would be affected the different optimization
methods that are platform-specific.
2) I see your point regarding the timeIncrement[] array.
It is really not testing that the tsp Modified time was
incremented in *this* thread.
3) This still leaves us with the mistery of why some of
the counts fail. Notice that the counters[] array,
still indicates that all the threads have been visited.
I guess I'll try disabling to optimization to see what
happens in that case...
Thanks
Luis
----------------------
Tom Vercauteren wrote:
> Hi Luis,
>
> Even though I am the unfortunate initial author of that time stamp
> test, this bug is definitely out of my c++ expertise.
>
> That being said, I am not sure that the new timeIncrement array is
> used in a meaningful way. The modified_function of the test is indeed
> called "simultaneously" by many threads. It should therefore only use
> thread safe calls.
>
> If I get it right the following line is OK because threadId is unique
> for each thread:
> helper->counters[threadId]++;
>
> However I don't get the following lines:
> const signed long time1 = tsp->GetMTime();
> tsp->Modified();
> const signed long time2 = tsp->GetMTime();
> helper->timeIncrement[threadId] = time2 - time1;
>
> Since other threads might have called tsp->Modified() between the
> allocation of time1 and time2, I see no reason for
> helper->timeIncrement[threadId] to be equal to one.
>
>
> Anyhow, it seems that the problem is either in the the timestamp
> itself or in the multithreader.
>
> When I first saw this bug I was planning to try and use an alternative
> threading framework such as boost thread or a direct use of pthreads.
>
> The second thing I wanted to try, was a revert to non-optimized
> TimeStamps. Since I committed the test after the optimization, I am
> not even sure that non-optimized TimeStamps don't suffer from the same
> bug. I guess that even though the non-optimized version has that bug,
> it would be far more difficult to highlight because the mutexes lock
> would slow down everything.
>
> My only fear with the optimized version is that even though the
> increment is atomic (and thus thread safe) the casting assignment
> might be more problematic:
> m_ModifiedTime = (unsigned long)InterlockedIncrement(&itkTimeStampTime);
>
> If the problem really comes from here, maybe adding
> ts.Modified();
> just after
> multithreader->SingleMethodExecute();
> might make the test pass. This won't of course be a fix for the bug
> but it might help understand what's wrong.
>
> Unfortunately, I never took the time to actually try all that because
> the test doesn't fail on my machine.
>
> Hope this helps a little.
>
> Cheers,
> Tom
>
>
> On Mon, Feb 16, 2009 at 20:29, Luis Ibanez <luis.ibanez at kitware.com> wrote:
>
>>The TimeStamp test continues to mysteriously fail in several
>>platforms, apparently at random occasions.
>>
>>mini3.nlm MacOSX-gcc4.0-rel Failed 17.00
>>gingee Linux-c++ Failed 6.16
>>mini3.nlm MacOSX-gcc4.0-rel Failed 18.00
>>zion.kitware Linux-g++-4.1 Failed 10.34
>>BillsBasement Linux-gcc43-release Failed 16.24
>>crl.med.harvard.edu Linux-x86_64-release-gcc Failed 5.58
>>mini1.nlm MacOSX-Xcode3.0-dbg Failed 17.00
>>murron.hobbs-hancock Linux-gcc-4.3.2-x86_64 Failed 149.23
>>mini2.nlm MacOSX-Xcode3.0-dbg Failed 17.00
>>Eternia.kitware gcc_review_optreg_libxml2 Failed 6.46
>>mini3.nlm MacOSX-gcc4.0-rel Failed 18.00
>>
>>So here is the mystery:
>>
>>The structure of the test is that it creates one TimeStamp
>>instance.
>>
>>It sets up a function to be called from a large number of
>>threads (128 in some platforms). Each call to the function,
>>triggers an update in the TimeStamp instance.
>>
>>Therefore the expectation of the test is that after the
>>call to SingleExecuteMethod() the Modified time of the
>>TimeStamp will have increased by (at least) the number
>>of threads.
>>
>>The increase may be larger than the number of threads
>>if any other ITK class happens to have an instance of
>>the TimeStamp updated in between...
>>
>>The test assumes that if the difference between the modified
>>time before and after the call to
>>
>> multithreader->SingleMethodExecute();
>>
>>if less than the number of threads, then the TimeStamp
>>failed to increment its counter in between the round
>>of threaded executions.
>>
>>The platforms above report such occurrences.
>>
>>However,
>>The test also has two arrays with a number of elements
>>equal to the number of threads, and in them we verify
>>that the threaded function has indeed been called for
>>each one of the threads.
>>
>>When the test fails, we still observe that such counters
>>are incremented correctly.
>>
>>The failure cases are behaving "as if" a number of the
>>increments to the time stamp where "delayed" to be
>>executed after the call to SingleMethodExecute().
>>
>>It looks as if the threads were interrupted in the
>>middle of their execution and then allowed to continue.
>>
>>Particularly because, when the test fail, by reporting
>>a default in the TimeStamp increment, the next iteration
>>report and excess by the same amount. E.g. as if the
>>TimeStamp were catching up the missing increments in
>>the next iteration of the test.
>>
>>
>>Any suggestions, hints, advice will be appreciated,
>>
>>
>> Thanks
>>
>>
>> Luis
>>
>>
>>_______________________________________________
>>Powered by www.kitware.com
>>
>>Visit other Kitware open-source projects at
>>http://www.kitware.com/opensource/opensource.html
>>
>>Please keep messages on-topic and check the ITK FAQ at:
>>http://www.itk.org/Wiki/ITK_FAQ
>>
>>Follow this link to subscribe/unsubscribe:
>>http://www.itk.org/mailman/listinfo/insight-developers
>>
>
>
More information about the Insight-developers
mailing list