[ITK-dev] [ITK] About the last optimization on VariableLengthVector

Matt McCormick matt.mccormick at kitware.com
Mon Nov 16 13:44:28 EST 2015


On Mon, Nov 16, 2015 at 8:55 AM, Luc Hermitte <luc.hermitte at c-s.fr> wrote:

>>> Third Problem: Local Variables
>>> 3.1- Auxiliary variables
>>> Some algorithms (e.g. BSplines interpolations) require dynamic data
>>> structures like matrices. This implies allocation at each iteration.
>>> This particular problem isn't addressed in this patch. The solution to
>>> 3.2 will help to provide an solution.
>>>
>>> 3.2- Factorized computations
>>> [...]
>>>
>>> We need a C++98 solution.
>>> The only way I see is to rely on cached variables.
>>> This means that the ImageFunctions are no longer [[pure]] functions :
>>> they'll mutate some variables that we want to allocate once per
>>> ImageFunction and ... per thread.
>>> Indeed the ImageFunctions shall be reentrant.
>>>
>>> I was first hoping to use thread_local variables. Alas they cannot be
>>> non-static attributes. This is a problem: if I'm not wrong, a same
>>> ImageFunction could be used multiple times in a same processing chain.
>>>
>>> We still have a simple solution, but it requires a second modification
>>> to Evaluate() signature: we need to pass the current ThreadId.
>>>
>>> The new drawback: code migration requires more work. We can not write
>>> Evaluate() in terms of EvaluateNoCopy() as the latter requires the
>>> threadId that the former doesn't know. This is the reason why I expect
>>> the tests to fail.
>>> Instead we need to convert all client code (like the WarpImageFilter)
>>> and have the default implementation of EvaluateNoCopy() to call the
>>> already written and not yet converted Evalutate().
>>
>> This would be messy, and it would not work very well since the
>> ImageFunction's are not involved in orchestration of the threading.
>> Instead, it would be better to make the necessary auxiliary variables
>> private variables.  Parallel code that uses ImageFunction's should
>> create one ImageFunction instance per thread.
>
> I'm afraid that with this choice, code that currently works well in
> parallel won't work anymore. End users will need to rewrite all uses of
> image functions in order to use them in MT code. And I guess we will
> also need to rewrite several test cases.
>
> Unless we duplicate the Evaluate() code : i.e. unless we keep the
> current version, and we add the new and optimized rewrite (for VLV in
> most cases, or for every kind of variables in some other cases (e.g.
> BSpline interpolation where caching matrix allocations helps reduce
> computation time))
> Duplicating image function algorithms doesn't seem to be a good choice
> either.

Right. Yes, this will not be maintainable.

A feasible way forward seems to use a C++11 code path when available
that has enhanced performance (i.e., use auto with C++11).  The C++98
version will not be as fast, but it will work. ITK leans towards
readability, maintainability, and program-ability over performance in
cases like this.


>>> Fourth Problem: heterogeneous data types (i.e. castings)
>>> The input image type, the output image type, and the internal pixel type
>>> may all be different. That's why the code contain so many static_casts.
>>>
>>> Alas a static_cast<> on pixels will induce the creation of new pixel values.
>>> This is contrary to the purpose of this patch.
>>> Copy constructors and assignment operators already support on-the-fly
>>> implicit static_casting.
>>> We still need to be able to require casting only when it's required.
>>> That's where the new function CastInto(), MoveInto() and As() come into
>>> rescue.
>>
>> So compiler's are not intelligent to make a static_cast<> a no-op when
>> the types are the same?
>
> Unfortunately, it doesn't seem so. Check
> http://stackoverflow.com/questions/19106826/can-static-cast-to-same-type-introduce-runtime-overhead

Thanks for the link. That thread seems to indicate that POD types are
not an issue, and non-POD types may not be an issue outside of -O0.


> Moreover, in the case of converting VLV<double> to VLV<int>, with a
> static_cast, we'll force the creation of a new variable which is what I
> striving to avoid.
>
> Hence the new helper functions that'll use the converting assignment
> operators in case of VLV, or static_cast in all other cases.

Sounds reasonable.

What are the different definitions and uses for  CastInto(),
MoveInto(), and As()?


Thanks,
Matt


More information about the Insight-developers mailing list