[Insight-developers] itk performance numbers
Rupert Brooks
rupert.brooks at gmail.com
Mon Jul 30 16:44:04 EDT 2012
Yes, that's how I remember it too.
Mind you, ITK 3.20 was not thread-unsafe. The solution used in ITK 3.20
was to treat the transforms as not thread-safe, so they were duplicated
across threads, which also duplicated the Jacobian. Oddly enough, they
are still duplicated in ITK 4. I don't know if that's an oversight or
intentional.
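
To make the duplication idea concrete, here is a minimal sketch in plain
C++ (not ITK code, and not how ITK actually implements it). It assumes a
hypothetical per-sample function that needs a scratch Jacobian, and gives
every worker thread its own buffer, allocated inside the threaded section,
so no scratch memory is shared. The names Jacobian, EvaluateSample and
ThreadedSum are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a transform Jacobian: a flat array of doubles.
using Jacobian = std::vector<double>;

// Hypothetical per-sample work: overwrite the caller's scratch Jacobian
// and return the sum of its entries.
double EvaluateSample(std::size_t sample, Jacobian & scratch)
{
  double sum = 0.0;
  for (std::size_t i = 0; i < scratch.size(); ++i)
  {
    scratch[i] = static_cast<double>(sample + i);
    sum += scratch[i];
  }
  return sum;
}

// Sum EvaluateSample over [0, numSamples) using numThreads workers.
// Each worker allocates its own Jacobian, so threads never share scratch.
double ThreadedSum(std::size_t numSamples, unsigned numThreads,
                   std::size_t jacobianSize)
{
  assert(numThreads > 0);
  std::vector<double>      partial(numThreads, 0.0);
  std::vector<std::thread> workers;
  for (unsigned t = 0; t < numThreads; ++t)
  {
    workers.emplace_back([&partial, t, numSamples, numThreads, jacobianSize]() {
      Jacobian scratch(jacobianSize); // per-thread buffer
      double   local = 0.0;
      for (std::size_t s = t; s < numSamples; s += numThreads)
      {
        local += EvaluateSample(s, scratch);
      }
      partial[t] = local; // each thread writes only its own slot
    });
  }
  for (auto & w : workers)
  {
    w.join();
  }
  double total = 0.0;
  for (double p : partial)
  {
    total += p;
  }
  return total;
}
```

The result does not depend on the thread count, e.g. ThreadedSum(100, 4, 3)
and ThreadedSum(100, 1, 3) agree, because no thread ever sees another
thread's scratch buffer.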
Rupert
--------------------------------------------------------------
Rupert Brooks
rupert.brooks at gmail.com
On Mon, Jul 30, 2012 at 1:57 PM, Michael Stauffer <mstauff at verizon.net> wrote:
> Rupert,
>
> The Jacobian in 3.20 was a base class member, but it was moved out because
> it wasn't thread-safe to share a class object between threads - assuming I
> remember correctly!
>
> -M
>
>
> On 07/28/12, Rupert Brooks <rupert.brooks at gmail.com> wrote:
>
> Brad,
>
> So the Mattes MI metric, single-threaded, was 4 times slower in 4.2, and
> after the patch became only 2 times slower. Egad, that's a big difference.
> I don't know about you, but I am quite surprised. I wasn't using this
> metric so I hadn't noticed. On Friday I briefly compared the files between
> 3.20 and 4.2, but I didn't see where the slowdown would be.
>
> It does not seem to be in the interpolators - I made an interpolator
> benchmark and it seems to run as well or better in 4.2. I split itkbench
> into multiple files, one per benchmark, to try to stay organized. I'll try
> to reproduce your result from your fork and pull it in.
>
> In any case, I think your approach to handling the Jacobian is good.
> Since the Jacobian is used by nearly all metrics, I wonder if it should be
> a member of the base class and allocated there?
>
> Cheers,
> Rupert
>
> --------------------------------------------------------------
> Rupert Brooks
> rupert.brooks at gmail.com
>
>
>
> On Fri, Jul 27, 2012 at 9:59 AM, Bradley Lowekamp <blowekamp at mail.nih.gov> wrote:
>
>> OK, here are some more numbers for the latest patch in Gerrit. I will
>> follow Rupert's format as it's the clearest.
>>
>> MeanSquares:
>> Threads  3.20    4.2     4.2+patch  4.2+patch as % of 3.20
>> 1        0.3615  0.8214  0.4071     113%
>> 2        0.3222  0.6055  0.3365     104%
>> 4        0.3249  0.4448  0.3293     101%
>> 8        0.1703  0.3093  0.1943     114%
>> 12       0.1457  0.2031  0.1322      91%
>> 24*      0.1062  0.1332  0.0949      89%
>>
>> MutualInformation:
>> Threads  3.20    4.2     4.2+patch  4.2+patch as % of 3.20
>> 1        0.1467  0.6103  0.3353     228%
>> 2        0.1036  0.3747  0.1774     171%
>> 4        0.0847  0.2175  0.1262     149%
>> 8        0.0655  0.1291  0.0681     104%
>> 12       0.0551  0.1035  0.0486      88%
>> 24*      0.0460  0.0829  0.0526     114%
>>
>> *Hyperthreading
>>
>> The observation to be made about MutualInformation is that while 4.2 is
>> still slower with one thread, there is now a significant increase in the
>> speed-up due to threads.
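
The thread speed-up mentioned above can be read off the MutualInformation
table by dividing the single-thread time by the N-thread time. A small
sketch of that arithmetic, assuming the table entries are timings in
consistent units; SpeedUp and the array names are made up here, and only
the 3.20 and 4.2+patch columns are copied in:

```cpp
#include <cassert>
#include <vector>

// MutualInformation timings from the table, for 1, 2, 4, 8, 12, 24 threads.
const std::vector<double> kMI_3_20      = { 0.1467, 0.1036, 0.0847,
                                            0.0655, 0.0551, 0.0460 };
const std::vector<double> kMI_4_2_Patch = { 0.3353, 0.1774, 0.1262,
                                            0.0681, 0.0486, 0.0526 };

// Speed-up of each entry relative to the first entry, which is taken to be
// the single-threaded time. An empty input gives an empty result.
std::vector<double> SpeedUp(const std::vector<double> & times)
{
  std::vector<double> result;
  for (double t : times)
  {
    assert(t > 0.0); // timings must be positive
    result.push_back(times.front() / t);
  }
  return result;
}
```

With these numbers, going from 1 to 12 threads gives a speed-up of roughly
2.7x for 3.20 and roughly 6.9x for 4.2+patch.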
>>
>> Brad
>>
>> On Jul 26, 2012, at 2:02 PM, Rupert Brooks wrote:
>>
>> OK, that makes way more sense. Sorry I didn't understand the first time
>> around.
>>
>> Just so I've got it right:
>> Threads  3.20      4.2+patch  Time 4.2 as percent of 3.20
>> 1        0.347567  0.383342   110.293%
>> 2        0.300869  0.335328   111.453%
>> 4        0.348677  0.315688   90.5388%
>> 8        0.182681  0.192132   105.173%
>>
>> So there's about 10% more time with ITK 4.2 in the 1- and 2-thread
>> cases. That is definitely better than what we were getting. Cool.
>>
>> Rupert
>>
>> --------------------------------------------------------------
>> Rupert Brooks
>> rupert.brooks at gmail.com
>>
>>
>>
>> On Thu, Jul 26, 2012 at 1:13 PM, Bradley Lowekamp <blowekamp at mail.nih.gov> wrote:
>>
>>> Sorry for not being clear! I got too excited by finding the solution to
>>> the performance issue with ITKv3 registration in ITKv4.
>>>
>>> The first is vanilla 3.20; the second is 4.2 plus the Gerrit patch. The
>>> third is the Gerrit patch with the pre-malloc of the Jacobian outside the
>>> threaded section! Vanilla 4.2 is ~2x 3.20 for this test on my system too.
>>>
>>> Summary for the MeanSquares metric in your test:
>>>
>>> 3.20: 1X
>>> 4.2: 2+X
>>> 4.2+Gerrit patch: 1X
>>> 4.2+Gerrit patch + single-threaded preallocation of Jacobian: 1.5X
>>>
>>
>> ========================================================
>>
>> Bradley Lowekamp
>>
>> Medical Science and Computing for
>>
>> Office of High Performance Computing and Communications
>>
>> National Library of Medicine
>>
>> blowekamp at mail.nih.gov
>>
>>
>>
>>
>> _______________________________________________
>> Powered by www.kitware.com
>>
>> Visit other Kitware open-source projects at
>> http://www.kitware.com/opensource/opensource.html
>>
>> Kitware offers ITK Training Courses, for more information visit:
>> http://kitware.com/products/protraining.php
>>
>> Please keep messages on-topic and check the ITK FAQ at:
>> http://www.itk.org/Wiki/ITK_FAQ
>>
>> Follow this link to subscribe/unsubscribe:
>> http://www.itk.org/mailman/listinfo/insight-developers
>>
>>
>