[Insight-developers] Empty FixedArray destructor: Performance hit using gcc (times 2) : __attribute__ ((aligned (8)))
Tom Vercauteren
tom.vercauteren at m4x.org
Fri Jun 6 10:23:10 EDT 2008
Hi Luis,
> Trying to understand why this alignment happens, we have reduced
> the test to a minimalistic implementation of FixedArray:
>
> a) When the destructor exists, an array of MyArray(s) is
> allocated on a 4-byte boundary
>
> b) When the destructor does not exist, an array of
> MyArray(s) is allocated on an 8-byte boundary
>
>
> Then, by Googling about it we found this GCC flag:
>
> -malign-double
>
> When compiling with this flag, your test is always aligned
> to 8 bytes, regardless of whether the destructor is present
> or not.
Thanks for this interesting observation and for the compilation flag hint.
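For reference, the reduced test described above can be sketched roughly as follows. This is a minimal sketch, not the actual test from this thread: WithDtor, WithoutDtor, OffsetFrom8, ArrayOffsetFrom8 and ReportAlignment are hypothetical names, and the only assumption is that the class holds two doubles.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical stand-ins for the reduced FixedArray test; not the ITK classes.
struct WithDtor {
    double m_Data[2];
    ~WithDtor() {}      // empty, user-provided destructor (case a)
};

struct WithoutDtor {
    double m_Data[2];   // implicit destructor only (case b)
};

// Distance of an address from the previous 8-byte boundary (0 means 8-byte aligned).
inline std::size_t OffsetFrom8(const void* p) {
    return static_cast<std::size_t>(reinterpret_cast<std::uintptr_t>(p) % 8);
}

// Allocate an array of T with new[] and report where its first element lands.
template <typename T>
std::size_t ArrayOffsetFrom8(std::size_t n) {
    T* a = new T[n];
    const std::size_t offset = OffsetFrom8(a);
    delete[] a;
    return offset;
}

// Print the offset for both variants so the two cases can be compared.
void ReportAlignment() {
    std::printf("with destructor:    address %% 8 = %zu\n",
                ArrayOffsetFrom8<WithDtor>(4));
    std::printf("without destructor: address %% 8 = %zu\n",
                ArrayOffsetFrom8<WithoutDtor>(4));
}
```

Calling ReportAlignment() from main() in a 32-bit gcc build, once with and once without -malign-double, should be enough to compare the two cases. On 64-bit targets I would expect both arrays to come out 8-byte aligned, so the difference may not show up there.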
> We still have not answered the fundamental question:
>
> Why is it that the presence of a non-virtual destructor
> changes the alignment?
I agree that this is the right question to ask.
> The Attribute:
> __attribute__ ((aligned (8)))
> also does the trick.
Thanks again!
> We can compile without -malign-double and the structure
> is still aligned to 8 bytes, despite the fact that the
> destructor is still present.
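A minimal sketch of the attribute applied to the reduced class, assuming gcc (or a compiler that accepts the same extension) and C++11 for alignof and static_assert. MyArray here is a hypothetical two-double stand-in, not the ITK class.

```cpp
// Hypothetical reduced class, not the ITK FixedArray.
struct MyArray {
    double m_Data[2];
    ~MyArray() {}                  // the empty destructor stays in place
} __attribute__((aligned(8)));     // GCC extension: require 8-byte alignment for the type

// With the attribute, the type itself asks for 8-byte alignment,
// independently of -malign-double.
static_assert(alignof(MyArray) == 8, "MyArray should require 8-byte alignment");
static_assert(sizeof(MyArray) % 8 == 0, "consecutive array elements stay 8-byte aligned");
```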
>
>
> We could create ITK macros for this attribute option,
> and define the macros at configuration time by using
> TRY_COMPILES.
>
>
> At first sight, it is much better than the global
> -malign-double option, and we can apply it only
> to structures that we know must be aligned.
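To make the macro idea concrete, here is a rough sketch. Both macro names are hypothetical: ITK_HAS_ATTRIBUTE_ALIGNED is assumed to be defined by a configuration-time compile check, and ITK_ALIGN8 is just an illustrative spelling, not an existing ITK macro.

```cpp
// Hypothetical macro names, for illustration only.
// ITK_HAS_ATTRIBUTE_ALIGNED is assumed to come from a configure-time
// compile check that tests whether the compiler accepts the attribute.
#if defined(ITK_HAS_ATTRIBUTE_ALIGNED)
#  define ITK_ALIGN8 __attribute__((aligned(8)))
#else
#  define ITK_ALIGN8 /* expands to nothing when the attribute is unsupported */
#endif

// Only structures that are known to need it carry the macro.
struct MyPixel {
    double m_Data[2];
    ~MyPixel() {}
} ITK_ALIGN8;
```

Compared with a global -malign-double, this leaves the layout of every other type untouched, at the price of one more macro to maintain.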
>
>
> One challenge here is that although we want FixedArray<double,N>
> to be 8-byte aligned, we don't always want FixedArray<T,N>
> to be aligned this way. For example: FixedArray<char,3>?
>
>
> One option could be to create your pixel type as a class derived
> from FixedArray<double,2>, and see if we can apply the attribute
> just to the derived class...
>
> In this way, this will be an application specific issue, as opposed
> to something that has to be done pervasively in ITK.
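For comparison, the derived-class option could look roughly like the sketch below. The FixedArray template here is a simplified stand-in for the ITK class, and AlignedPixel is a hypothetical application-side name. It assumes gcc's attribute syntax and C++11.

```cpp
// Simplified stand-in for itk::FixedArray, for illustration only.
template <typename T, unsigned int N>
struct FixedArray {
    T m_Data[N];
    ~FixedArray() {}
};

// Hypothetical application-side pixel type: only this class gets the attribute.
struct AlignedPixel : public FixedArray<double, 2> {
} __attribute__((aligned(8)));

// The derived class is 8-byte aligned...
static_assert(alignof(AlignedPixel) == 8, "derived pixel type should be 8-byte aligned");
// ...while other instantiations keep their natural layout (true on common ABIs).
static_assert(sizeof(FixedArray<char, 3>) == 3, "char array is not padded");
```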
Well, I would really vote for removing the empty destructor
implementation. This seems to be the cleanest way to solve the
problem. The empty destructor is useless and only interferes here. I
don't really like changing compilation flags or adding macros, and
adding a new derived class would be confusing for the user.
Tom