[Insight-developers] Empty FixedArray destructor: Performance hit using gcc (times 2)
Niels Dekker
niels-xtk at xs4all.nl
Sat Jun 7 11:33:38 EDT 2008
Brad's explanation is a major eye opener to me as well. :-) I never
realized before that there could be such a relation between the byte
alignment of the object allocated by new T[N], and the destructor
T::~T(). Thanks Brad!
Basically the relevant new-operation is in
"itkImportImageContainer.txx", right?
ImportImageContainer::AllocateElements does:
    data = new TElement[size];
So removing the user-defined destructor from FixedArray would /only/
solve the issue when TElement (the pixel type) is a FixedArray. And even
in that case, I guess it /only/ solves the issue when the element type
of FixedArray doesn't have a user-defined destructor. Right?
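To make that question concrete, here is a minimal sketch (C++11, not ITK code) that checks at compile time whether an array-like wrapper without a user-defined destructor is still trivially destructible. ArrayWithoutDestructor, ElementWithDestructor and NeedsDestructorCalls are hypothetical names invented for illustration, and the sketch assumes the compiler only needs to record an element count when the destructor is non-trivial:

```cpp
#include <type_traits>

// Hypothetical stand-in for a FixedArray that has no user-defined destructor.
template <typename TValue, unsigned int VLength>
struct ArrayWithoutDestructor
{
  TValue m_Data[VLength];
};

// Hypothetical element type that does declare its own destructor.
struct ElementWithDestructor
{
  ~ElementWithDestructor() {}
};

// True when delete[] would have to run a destructor for each element of type T.
template <typename T>
constexpr bool NeedsDestructorCalls()
{
  return !std::is_trivially_destructible<T>::value;
}

// With a plain element type the wrapper stays trivially destructible...
static_assert(!NeedsDestructorCalls<ArrayWithoutDestructor<double, 2>>(),
              "no destructor calls expected for double elements");

// ...but an element type with its own destructor makes the wrapper non-trivial again.
static_assert(NeedsDestructorCalls<ArrayWithoutDestructor<ElementWithDestructor, 2>>(),
              "destructor calls expected for elements with a destructor");
```

If both static_asserts compile, that would support the guess that removing the destructor only helps when the element type itself is trivially destructible.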
I think the issue /could/ be solved more generally, by avoiding new []
and using malloc instead. Within
ImportImageContainer::AllocateElements:
    data = static_cast<TElement *>(
        std::malloc( sizeof(TElement) * size ) );
I'm not a fan of using malloc in general, but this /might/ be a valid
reason to do so... Unfortunately std::malloc does not construct the
data elements, so it should be followed by a placement new, to do the
construction in-place:
    new ( data ) TElement [size];
If we did so, each delete [] call in itkImportImageContainer.txx
should be replaced by explicitly destructing the elements of
m_ImportPointer, followed by std::free(m_ImportPointer)...
What do you think?
Kind regards, Niels
P.S. FWIW, Boost's array implementation is somewhat similar to
FixedArray, and doesn't have a user-defined destructor.
http://www.boost.org/doc/libs/1_35_0/boost/array.hpp
David Cole wrote:
> Now, the question is: will the GCC developers use extra bytes in
> future releases to record the length of the array to avoid this sort
> of performance hit...? (Matching the alignment that the allocator is
> careful to return would be a good feature for 'new'...)
Brad King wrote:
> Sorry for the delayed response but I've been away from email all
> afternoon. Here is what is going on, using Luis's test case from
> elsewhere in this thread:
>
> class MyArray
> {
> public:
>   MyArray() {}
>   ~MyArray() {}
>   double operator[](unsigned int k) { return foo[k]; }
>   double foo[2];
> };
>
> Since double does not *have* to be aligned at 8 bytes on x86 the
> alignment of this struct is 4 bytes...WITH OR WITHOUT the destructor.
> Now consider the code
>
> void* x = new MyArray[2];
>
> It's the address to which 'x' points that changes when the destructor
> is added or removed. The malloc that is done returns an 8-byte aligned
> memory buffer in both cases. It's operator new that adjusts it.
>
> When the destructor is present GCC needs a place to record the length
> of the array so that upon deletion it knows how many times the
> destructor needs to be called. Their implementation allocates 4
> additional bytes of memory and chooses the first 4 bytes of the
> resulting buffer to store this count, so the data are placed starting
> at 4 bytes past what was returned by malloc, and that value is stored
> in x. Then all the doubles get poorly aligned and performance suffers.
>
> When the destructor is not present it does not need to do anything at
> deletion time but free the memory, so no count is needed and the
> buffer returned by malloc is used directly. The doubles end up with
> good alignment.
>
> -Brad
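
The effect Brad describes can be observed with a small experiment. The following is only a sketch, assuming C++11: RecordingAllocation, WithDestructor, WithoutDestructor and ArrayCookieOffset are hypothetical names, and the size of the offset is implementation-specific. A class-level operator new[] records the raw address it hands out, and the helper compares it with the address the new-expression finally returns:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Raw address most recently returned by the class-level operator new[] below.
static void* g_lastRawBlock = nullptr;

// Hypothetical base class that routes array allocation through std::malloc
// and remembers the raw block address.
struct RecordingAllocation
{
  static void* operator new[](std::size_t bytes)
  {
    g_lastRawBlock = std::malloc(bytes);
    if (g_lastRawBlock == nullptr)
    {
      throw std::bad_alloc();
    }
    return g_lastRawBlock;
  }
  static void operator delete[](void* block) noexcept { std::free(block); }
};

// Same layout as the MyArray test case: user-defined destructor present.
struct WithDestructor : RecordingAllocation
{
  WithDestructor() {}
  ~WithDestructor() {}
  double foo[2];
};

// Identical, except that the destructor is left to the compiler.
struct WithoutDestructor : RecordingAllocation
{
  WithoutDestructor() {}
  double foo[2];
};

// Number of bytes between the raw block and the first array element.
template <typename T>
std::ptrdiff_t ArrayCookieOffset()
{
  T* elements = new T[2];
  const std::ptrdiff_t offset =
    reinterpret_cast<char*>(elements) - static_cast<char*>(g_lastRawBlock);
  delete[] elements;
  return offset;
}
```

With GCC, ArrayCookieOffset<WithoutDestructor>() should be 0 and ArrayCookieOffset<WithDestructor>() should be nonzero. On a 64-bit target the count is typically as wide as std::size_t, so the offset may be 8 bytes there and the doubles may stay 8-byte aligned; the 4-byte shift is what one would expect on 32-bit x86.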