[Insight-developers] Empty FixedArray destructor: Performance hit using gcc (times 2)
Tom Vercauteren
tom.vercauteren at m4x.org
Thu Jun 5 10:29:05 EDT 2008
Hi,
I think that your current test is not actually checking the right
thing. What is really slowing things down is the iteration over the
array. I'll try to send a new test soon, but I need to finish something
first...
BTW, we were compiling in RelWithDebInfo.
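
Roughly, the kind of loop in question would look like the sketch below. It is
only an illustration under assumptions: ArrayNoDtor, ArrayEmptyDtor,
SumOverElements and TimeSeconds are made-up names, the two structs are
stand-ins and not itk::FixedArray, and std::chrono (C++11) is used in place of
the ITK time probes.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical stand-ins, NOT itk::FixedArray: same storage, differing only
// in whether an empty destructor is written out.
template <typename T, std::size_t N>
struct ArrayNoDtor {
  T data[N];
  T& operator[](std::size_t i) { return data[i]; }
};

template <typename T, std::size_t N>
struct ArrayEmptyDtor {
  T data[N];
  T& operator[](std::size_t i) { return data[i]; }
  ~ArrayEmptyDtor() {}  // explicit, empty destructor
};

// Creates one array per outer step, then writes and reads every element,
// so the measured work includes the element iteration and not only the
// construction and destruction of the array.
template <typename ArrayT, std::size_t N>
double SumOverElements(std::size_t steps) {
  assert(steps > 0);
  double sum = 0.0;
  for (std::size_t t = 0; t < steps; ++t) {
    ArrayT a;
    for (std::size_t i = 0; i < N; ++i) a[i] = static_cast<double>(t + i);
    for (std::size_t i = 0; i < N; ++i) sum += a[i];
  }
  return sum;
}

// Wall-clock seconds for one run. The sum is handed back through `result`
// so the caller can use it, which discourages the compiler from dropping
// the loop entirely (it is not a guarantee).
template <typename ArrayT, std::size_t N>
double TimeSeconds(std::size_t steps, double& result) {
  const auto start = std::chrono::steady_clock::now();
  result = SumOverElements<ArrayT, N>(steps);
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}
```

Calling TimeSeconds<ArrayNoDtor<double, 2>, 2>(1000000, r) and the
ArrayEmptyDtor counterpart under the same compile flags would give the two
numbers to compare.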
Tom
On Thu, Jun 5, 2008 at 4:24 PM, Luis Ibanez <luis.ibanez at kitware.com> wrote:
>
> Hi Gert,
>
> Thanks for the quick report !
>
> It makes sense that the -g flag will prevent the method
> from being optimized away.
>
> If you have a chance,
> could you please test what happens when no -g is
> used, and the optimization flag is set to -O3 ?
>
> Those will be the flags that we set when
> CMAKE_BUILD_TYPE is set to "Release".
>
>
> Thanks for any feedback,
>
>
> Luis
>
>
> -------------------
> Gert Wollny wrote:
>>>
>>> Here is a suggested skeleton for the test:
>>>
>>> ---------------------------------------------
>>> #include <iostream>
>>> #include "itkFixedArray.h"
>>> #include "itkTimeProbesCollectorBase.h"
>>>
>>> int main()
>>> {
>>>
>>> typedef itk::FixedArray< double, 2 > ArrayType;
>>>
>>> ArrayType foo[10];
>>> ArrayType * p = foo;
>>> ArrayType * q = foo;
>>> p++;
>>>
>>> char * cp = (char*)( p );
>>> char * cq = (char*)( q );
>>>
>>> std::cout << "Type size = " << sizeof( ArrayType ) << std::endl;
>>> std::cout << "Pointer step = " << int( cp - cq ) << std::endl;
>>>
>>> itk::TimeProbesCollectorBase chronometer;
>>>
>>> chronometer.Start("FixedArray");
>>> for(unsigned long t=0; t<1000000L; t++)
>>> {
>>> ArrayType foo;
>>> foo[0] = t;
>>> }
>>> chronometer.Stop("FixedArray");
>>>
>>> chronometer.Report(std::cout);
>>>
>>> return 0;
>>> }
>>>
>>> ---------------------------------------------
>>>
>>
>> I ran a test with g++ 4.2.3 (Ubuntu) using the compile flags "-g -O2".
>> If the destructor is implemented, it is called (of course). If there is
>> no explicit implementation of the destructor, the call is optimized away,
>> resulting in a ~30% speedup of the loop above. In both cases the
>> alignment is 16.
>>
>> Best
>>
>> Gert
>>
>>
>>
>
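
On the destructor point quoted above: the language-level difference between
"no destructor written" and "empty destructor written" can be checked at
compile time. The sketch below assumes a C++11 compiler (newer than the
g++ 4.2.3 used in the report) and uses two made-up stand-in structs, not
itk::FixedArray. It does not by itself explain the timing; it only shows that
an explicit empty destructor makes the type non-trivially destructible.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in: the destructor is left to the compiler.
template <typename T, std::size_t N>
struct ImplicitDtorArray {
  T data[N];
};

// Hypothetical stand-in: the destructor is written out but does nothing.
template <typename T, std::size_t N>
struct ExplicitDtorArray {
  T data[N];
  ~ExplicitDtorArray() {}  // user-provided, even though the body is empty
};

typedef ImplicitDtorArray<double, 2> ImplicitType;
typedef ExplicitDtorArray<double, 2> ExplicitType;

// With no user-provided destructor the type is trivially destructible,
// so there is no destructor call for the compiler to emit at all.
static_assert(std::is_trivially_destructible<ImplicitType>::value,
              "implicit destructor should be trivial");

// A user-provided destructor, even an empty one, is never trivial.
static_assert(!std::is_trivially_destructible<ExplicitType>::value,
              "user-provided destructor should be non-trivial");
```

Whether the optimizer then removes the empty call depends on the compiler and
flags, which is what the timing tests in this thread are trying to pin down.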