[Insight-developers] vnl updates : Rounding Test : Performance and Correctness
Luis Ibanez
luis.ibanez at kitware.com
Wed Dec 31 09:44:00 EST 2008
Hi Tom,
Thanks for all this additional information.
This is indeed very useful.
Please see comments below.
Thanks
Luis
-----------------------
Tom Vercauteren wrote:
> Hey,
>
> First of all, I apologize for not giving enough background on this
> issue. The fact is that I had already spent a lot of time with this on
> the vxl side. See e.g.
> http://sourceforge.net/mailarchive/forum.php?thread_name=28392e8b0811190226i20ab46a5m2a72a5d9518ce557%40mail.gmail.com&forum_name=vxl-maintainers
>
Thanks for pointing out to this discussion in the VNL list.
Have the issue of SSE2 been resolved ?
>
> 1) Let me first discuss the performance side.
>
> According to my timings shown here
> http://sourceforge.net/mailarchive/forum.php?thread_name=28392e8b0811270614o7f042cf7o5828d2b31c52c88b%40mail.gmail.com&forum_name=vxl-maintainers
> in release mode, the performance improvement is rather a factor 3 or 4
> between an optimized version and the basic if version:
> time_ref / time_sse : 4.37907
> time_ref / time_asm : 3.86511
>
> This is important for image registration as this may lead to a 10%
> improvement on the entire registration.
>
An improvement of 10% in registration time sound modest
at the price of having to deal with non-portable code.
There are many other places in an image registration
problem where we could squeeze 10% out. From number of
samples, to number of iterations, to step lengths (or
its equivalent) in the optimizer...
In the context of medical imaging, the capability of
getting the same answer on every machine is a lot more
important than saving 10% on the execution of a registration.
> Note that timing this is not trivial since the compiler may optimize
> many things out if they are not used in the test program. I provided a
> basic timing test here:
> http://sourceforge.net/mailarchive/attachment.php?list_name=vxl-maintainers&message_id=28392e8b0811270614o7f042cf7o5828d2b31c52c88b%40mail.gmail.com&counter=1
>
>
We have numbers today from the ITK dashboard
http://www.cdash.org/CDash/testSummary.php?project=2&name=itkVNLRoundProfileTest1&date=2008-12-31
Which also illustrate the platform dependency
that you pointed out.
> 2) About correctness.
>
> The implementation of vnl_math_round shipped with ITK is already
> plateform dependent. Here is a snippet that contains only the double
> version:
> #if VXL_C_MATH_HAS_LROUND
> inline int vnl_math_rnd(double x) { return static_cast<int>(lround(x)); }
> #elif defined (VCL_VC) && !defined(__GCCXML__) && !defined(_WIN64)
> __inline int
> vnl_math_rnd (double flt)
> { int intgr;
> _asm
> { fld flt
> fistp intgr
> } ;
> return intgr ;
> }
> #else
> inline int vnl_math_rnd(double x) { return (x>=0.0)?(int)(x +
> 0.5):(int)(x - 0.5); }
> #endif
>
> You can see that in the #elif block, the operation for half integers
> will be round to nearest even. In the other cases, it will be round
> away from zero.
This is a problem.
We shouldn't have this platform dependency.
>
> The problem with this implementation is that on many plateforms
> including mine, lround is actually slower than the basic if
> implementation ( time_ref / time_lround : 0.657615 ). This is why I
> decided to reimplement vnl_round. Since I was focused on performance
> and vnl_math_rnd did not enforce any specific rounding scheme for
> half-integer, I simply chose to implement the fastest version.
>
You bring a good point in your discussion on the VNL list.
We should deal not only with 'round' but also with 'ceil'
and 'floor'.
> That being said, it is possible to implement a fast rounding scheme
> that rounds half-integer away from zero. The limitations are:
> - Only works within INT_MIN/2 – 1 and INT_MAX/2 + 1
> - Assumes that the FPU rounding mode has not been changed from the
> default one (same as the asm version above)
>
> It could thus be possible to have two version of vnl_math_round (one
> that rounds away from zero and one that doesn't care about
> hal-integers) or we could also modify vnl_math_round to round
> half-integers away from zero.
>
> What will not be trivial, is to have a version of vnl_math_round that
> rounds half-integers to the nearest even integer on all platforms.
>
I'm not sure that we need this anti-bias property,
at least not in the context of image registration.
I can certainly see its importance in finance... :-)
but for the purpose of mapping physical coordinates to
image grid IJK numbers, the fraction of point that will
fall directly on .5 locations will already be very small.
I found unlikely that this small set of points may afect
the outcome of a registration.
>
> 3) Which half-integer rounding mode should be preferred?
>
> That's a tough question.
> * As mentioned by Steve, for statistical analysis, round to even might
> be preferred.
> * Round to even might not be very intuitive at first sight
> * If you use nearest-neighbor interpolation, then you might need round
> up or round down (but not round away from zero) to avoid weird things
> for translations of half integers
> * In most cases, you don't really care :-)
>
We could still provide several implementations in ITK,
but each one properly named.
the rounding strategy shouldn't be hidden in the details
of the implementation, and the compiler options of the day.
It should be clearly marked in the name of the rounding
function.
We should have something very explicit, such as:
itk::Math::Round()
itk::Math::RoundNonBiased()
In this way, when writing ITK code we can select the
rounding strategy that is appropriate for the context.
>
> 4) VNL implementation or ITK implementation?
>
> * VNL seems to be the good place for this since it is already there
> and there is currently no such basic operations in ITK
> * It is easier for the user if only one version is available
> * An ITK version would however be easier to maintain for ITK developpers
> * The all set of rounding functions takes 200 lines in vnl_math.h and
> needs some cmake magic.
>
>
I agree that VNL is in principle the right place
to put these methods. However, given the different
domain emphasis of ITK and VNL, the requirements
for these methods is somehow different.
We have been bitten in the past by the rounding
issues in VNL, and it is not worth the trouble.
Note that we already have an ITK-variant of the
vnl_math_round method in the itkIndex.h file, in
lines 263-290:
http://www.itk.org/cgi-bin/viewcvs.cgi/Code/Common/itkIndex.h?root=Insight&r1=1.56&r2=1.57
That we had to add in order to get around the
platform-dependent issues of VNL.
> My 2cts,
> Tom
>
>
Very valuable 2cts, I must say.
More information about the Insight-developers
mailing list