[Insight-developers] Empty FixedArray destructor: Performance hit using gcc (times 2)

Luis Ibanez luis.ibanez at kitware.com
Thu Jun 5 09:15:54 EDT 2008


Hi Tom,

               Thanks for reporting this.


Just being superstitious here...
It doesn't seem to be a good idea to make this change until we
understand why this is happening, and whether it is relevant to
other platforms besides Mac-gcc.


Let's make a test for these two particular behaviors:

         1) Check for byte alignment
         2) Run time performance.

We will commit this test and then we can (locally) check the effect
of removing the FixedArray destructor on different platforms.


Here is a suggested skeleton for the test:

---------------------------------------------
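#include <iostream>

#include "itkFixedArray.h"
#include "itkTimeProbesCollectorBase.h"

int main()
{
  typedef itk::FixedArray< double, 2 >   ArrayType;

  // 1) Byte layout: distance in bytes between two consecutive elements.
  ArrayType foo[10];
  ArrayType * p = foo;
  ArrayType * q = foo;
  p++;

  char * cp = reinterpret_cast< char * >( p );
  char * cq = reinterpret_cast< char * >( q );

  // Note: the step between consecutive elements is sizeof( ArrayType )
  // by definition; it does not reveal the alignment of the address itself.
  std::cout << "Type size    = " << sizeof( ArrayType ) << std::endl;
  std::cout << "Pointer step = " << int( cp - cq ) << std::endl;

  // 2) Run time: construct and destroy a FixedArray one million times.
  itk::TimeProbesCollectorBase chronometer;

  chronometer.Start("FixedArray");
  for( unsigned long t = 0; t < 1000000L; t++ )
    {
    ArrayType local;   // renamed so it does not shadow the array "foo" above
    local[0] = static_cast< double >( t );
    }
  chronometer.Stop("FixedArray");

  chronometer.Report( std::cout );

  return 0;
}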

---------------------------------------------


Please let us know whether you think this test captures both
the byte alignment and the performance behavior.
If it does not, please help us convert the test to the right
method of inspection.

I ran it on an Ubuntu Linux machine with gcc 4.1.2, and the
pointer differences were always 16. (Actually I don't see
how an array of FixedArray<double,2> could be aligned to 4 bytes
and not to 8 bytes.  I'm probably missing some basic fact here...)

Could you please elaborate on this ?
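
The pointer step between two consecutive elements always equals the type
size, so it cannot show whether the buffer itself starts on a 4-byte or an
8-byte boundary. A minimal sketch of an address-based check follows. It
assumes a hypothetical stand-in struct (ArrayLike) instead of
itk::FixedArray so that it compiles without ITK; AddressAlignment and
ReportAlignment are illustrative names, not ITK functions. One possible
explanation, not confirmed anywhere in this thread, is that a user-declared
destructor makes new[] store an element count in front of the array, which
on a 32-bit build could shift the returned address by 4 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iostream>

// Hypothetical stand-in for itk::FixedArray<double, 2>; not the ITK class.
struct ArrayLike
{
  double data[2];
  ~ArrayLike() {}   // empty, non-virtual destructor; remove it to compare
};

// Returns the largest power of two that divides the address of p.
std::size_t AddressAlignment(const void * p)
{
  const std::uintptr_t address = reinterpret_cast<std::uintptr_t>(p);
  assert(address != 0);
  // Isolate the lowest set bit of the address.
  return static_cast<std::size_t>(address & (~address + 1));
}

// Allocates a heap array, as an image buffer would be, and reports
// the alignment of the address returned by new[].
void ReportAlignment(std::ostream & os)
{
  ArrayLike * buffer = new ArrayLike[10];
  os << "Type size         = " << sizeof(ArrayLike) << std::endl;
  os << "Address alignment = " << AddressAlignment(buffer) << std::endl;
  delete[] buffer;
}
```

Calling ReportAlignment(std::cout) from main(), once with the destructor
and once without it, on a 32-bit and on a 64-bit build, would show whether
the reported alignment changes on a given platform.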


The performance results were inconclusive:

a) When the destructor exists the for-loop took about 40 milliseconds.

b) When the destructor is removed, it sometimes took 60 ms and
other times 30 ms.
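
Single runs of a loop this short are noisy, which may be part of why the
numbers above vary. A minimal sketch of a more repeatable measurement
follows: it repeats the loop several times and keeps the best time. It uses
std::chrono instead of itk::TimeProbesCollectorBase and two hypothetical
stand-in structs, so it is only an illustration of the idea, not the test
proposed for the repository. BestLoopTimeMs is an assumed name.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <limits>

// Hypothetical stand-ins for FixedArray<double, 2>, with and without
// the empty destructor.
struct WithDestructor    { double data[2]; ~WithDestructor() {} };
struct WithoutDestructor { double data[2]; };

// Runs the construct/assign loop `repetitions` times and returns the
// shortest elapsed time in milliseconds.
template <typename TArray>
double BestLoopTimeMs(unsigned long iterations, int repetitions)
{
  assert(repetitions > 0);
  double best = std::numeric_limits<double>::max();
  for (int r = 0; r < repetitions; ++r)
  {
    volatile double sink = 0.0;   // keeps the loop body from being optimized away
    const std::chrono::steady_clock::time_point start =
      std::chrono::steady_clock::now();
    for (unsigned long t = 0; t < iterations; ++t)
    {
      TArray local;
      local.data[0] = static_cast<double>(t);
      sink = local.data[0];
    }
    const std::chrono::steady_clock::time_point stop =
      std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    best = std::min(best, elapsed.count());
  }
  return best;
}
```

For example, BestLoopTimeMs<WithDestructor>(1000000UL, 20) and
BestLoopTimeMs<WithoutDestructor>(1000000UL, 20) could be compared under
the same compiler flags. Taking the minimum is one design choice among
several; a median over the repetitions would also be reasonable.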


Could we rework this simple test into something that can be committed
to the repository as a standard extra test for the FixedArray ?



BTW: Were you compiling for "Release" (-O3) ? or "Debug" (-g) ?
      or using the default settings ?


The tests reported above on Ubuntu were run with default settings
(neither Release nor Debug).


    Thanks


       Luis


----------------------
Tom Vercauteren wrote:
> Hi all,
> 
> After some code profiling, we stumbled on a mysterious performance hit
> related to the empty destructor used in itk::FixedArray. The
> explanation of it is not really clear yet but the fix is easy: Do not
> implement the destructor. I'll be happy to commit the change if nobody
> is against it.
> 
> Here is the setting. We create an image of the following type:
>    itk::Image<  itk::Vector< double, 2 >, 2 > image;
> 
> 
> On the platforms we use (gcc 4.1 for linux or gcc 4.0 for mac), we realized that
>    image->GetBufferPointer()
> was aligned on 4 bytes
> 
> 
> Now by removing the empty implementation of ~FixedArray() we saw that
>    image->GetBufferPointer()
> was now aligned on 8 bytes
> 
> 
> In the (home-made) filter we use, the execution time with the current
> implementation is twice as long as the one with an empty FixedArray
> destructor.
> 
> Note that since ~FixedArray() is empty and not virtual, removing its
> implementation should not lead to any backward compatibility issue.
> 
> Let me know if you need further information, if you have a good
> explanation for this behavior or if you know a better solution.
> 
> Best,
> Tom Vercauteren
> _______________________________________________
> Insight-developers mailing list
> Insight-developers at itk.org
> http://www.itk.org/mailman/listinfo/insight-developers
> 

