[Insight-users] RE: [Insight-developers] Profiling of Examples/ImageLinearIteratorWithIndex.cxx

Miller, James V (Research) millerjv at crd.ge.com
Thu Jun 2 22:24:13 EDT 2005


Karl,

This is a nice experiment. We could certainly add an ImageLinearIterator
in contrast to the ImageLinearIteratorWithIndex, much like we have an 
ImageRegionIterator and an ImageRegionIteratorWithIndex.  The intent of the 
WithIndex variants was to provide direct access to the index for those 
algorithms that needed it.

The Get()/Set() vs Value() argument is one of supporting ImageAdaptors.
Algorithms that use Get()/Set() can support ImageAdaptors.  Algorithms that
use Value() cannot use ImageAdaptors.  This is a decision that must be 
made carefully.  For instance, the NeighborhoodIterators do not support
Value() for similar efficiency reasons.  Perhaps there could be another
mechanism to identify when ImageAdaptors are not being used, so that Set/Get 
can use a faster path to the data.  However, part of the speed of Value()
is that it allows you to write algorithms in a manner that avoids the creation
of temporaries, something that the Set/Get approach sometimes cannot do.
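To make the trade-off concrete, here is a minimal sketch using toy classes,
not the actual ITK iterators or ImageAdaptor. The names BufferIterator,
ScaledAdaptorIterator, SumWithGet and AddInPlace are hypothetical. It assumes
an adaptor computes its pixel value on the fly, so it has no stored object
that a Value() reference could refer to.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy iterator over a plain buffer (hypothetical, not an ITK class).
// Because the pixels really exist in memory, it can offer both styles.
class BufferIterator {
public:
  explicit BufferIterator(std::vector<int>& data) : m_Data(data), m_Pos(0) {}
  int  Get() const     { assert(!IsAtEnd()); return m_Data[m_Pos]; }  // returns a copy
  void Set(int v)      { assert(!IsAtEnd()); m_Data[m_Pos] = v; }     // copies v in
  int& Value()         { assert(!IsAtEnd()); return m_Data[m_Pos]; }  // direct reference
  void operator++()    { ++m_Pos; }
  bool IsAtEnd() const { return m_Pos >= m_Data.size(); }
private:
  std::vector<int>& m_Data;
  std::size_t       m_Pos;
};

// Toy adaptor-style iterator (hypothetical): presents each pixel scaled by
// a factor. The result is computed per call, so only Get() can be offered.
class ScaledAdaptorIterator {
public:
  ScaledAdaptorIterator(const std::vector<int>& data, int factor)
    : m_Data(data), m_Factor(factor), m_Pos(0) {}
  int  Get() const     { assert(!IsAtEnd()); return m_Data[m_Pos] * m_Factor; }
  void operator++()    { ++m_Pos; }
  bool IsAtEnd() const { return m_Pos >= m_Data.size(); }
private:
  const std::vector<int>& m_Data;
  int                     m_Factor;
  std::size_t             m_Pos;
};

// Written against Get(): works with the plain iterator and with the adaptor.
template <typename TIterator>
long SumWithGet(TIterator it) {
  long sum = 0;
  for (; !it.IsAtEnd(); ++it) { sum += it.Get(); }
  return sum;
}

// Written against Value(): modifies pixels in place without a temporary,
// but cannot be instantiated with ScaledAdaptorIterator.
inline void AddInPlace(BufferIterator it, int delta) {
  for (; !it.IsAtEnd(); ++it) { it.Value() += delta; }
}
```

With an int pixel the copy made by Get() is free; the cost only shows up for
larger pixel types such as an RGB pixel, which is the case in the experiment
quoted below.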

We ran a number of similar experiments when we first developed the 
ImageRegionIterator and the ImageRegionIteratorWithIndex iterators. 
I don't think we ever tried to time the LinearIterators.  However,
one thing we found when timing the iterators was that results would change
drastically depending on which iterator you used first.  For instance, 
take a simple test that traverses a volume and sets/gets every pixel. 
IteratorA may take timeA to traverse the volume and IteratorB may
take timeB to traverse the volume with timeA >> timeB.  When the 
order of the experiment was reversed, IteratorB took timeB2 and IteratorA
took timeA2 with timeB2 >> timeA2. This inconsistency made it difficult
to come to any real conclusions about the relative timings. Relating this
to your experiment, the magnitude of the differences may not really be
as extreme as the numbers below.  But I do believe much of the timing
differences are real.
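One generic way to reduce this kind of order dependence is to run both
candidates once untimed, then alternate which one goes first and keep the
best time for each. The sketch below is only an illustration of that idea,
not what we did at the time. The names TimeOnce, Timings and
CompareAlternating are hypothetical, and it assumes the order effect comes
from one-time costs paid by whichever test runs first, which this message
does not establish.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <limits>

// Wall-clock time of a single call to f, in seconds.
inline double TimeOnce(const std::function<void()>& f) {
  const auto start = std::chrono::steady_clock::now();
  f();
  const std::chrono::duration<double> elapsed =
    std::chrono::steady_clock::now() - start;
  return elapsed.count();
}

// Best (minimum) time observed for each candidate.
struct Timings {
  double bestA;
  double bestB;
};

// Runs a and b once untimed as a warm-up, then times them 'repetitions'
// times, swapping which one runs first on every repetition.
inline Timings CompareAlternating(const std::function<void()>& a,
                                  const std::function<void()>& b,
                                  int repetitions) {
  assert(repetitions > 0);
  a();  // warm-up, not timed
  b();  // warm-up, not timed
  const double inf = std::numeric_limits<double>::infinity();
  Timings t = { inf, inf };
  for (int i = 0; i < repetitions; ++i) {
    if (i % 2 == 0) {
      t.bestA = std::min(t.bestA, TimeOnce(a));
      t.bestB = std::min(t.bestB, TimeOnce(b));
    } else {
      t.bestB = std::min(t.bestB, TimeOnce(b));
      t.bestA = std::min(t.bestA, TimeOnce(a));
    }
  }
  return t;
}
```

If the two orderings still disagree after this, the difference is more likely
to be noise than a property of the iterators.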

Finally, while using gcc is a real world scenario, historically it has
not had the best optimizer.  I have had cases where gcc compiled code
had severe bottlenecks compared to DevStudio .Net compiled code.

Since iterators are such an integral part of ITK, we should continue
to experiment with methods to improve performance. Hopefully, there 
are opportunities to improve performance while still supporting 
ImageAdaptors (for backward compatibility).  But we should also
look for mechanisms and opportunities to improve algorithm performance
where we can reasonably identify ImageAdaptors are not being used.

Jim





-----Original Message-----
From: insight-developers-bounces+millerjv=crd.ge.com at itk.org
[mailto:insight-developers-bounces+millerjv=crd.ge.com at itk.org]On Behalf
Of Karl Krissian
Sent: Thursday, June 02, 2005 6:41 PM
To: ITK; itk users
Subject: [Insight-developers] Profiling of
Examples/ImageLinearIteratorWithIndex.cxx



Hi,

I decided to compare the processing time of a simple ITK iterator
example with its equivalent in C.

I think the result may be interesting to the ITK community.
I used an ITK version on Linux (mobile Pentium Centrino 1.7GHz)
compiled with profiling and optimization: -pg -O3 and the profiler is
gprof (GNU).

I added the following classes for the experiment:

Code/Common/itkImageLinearIteratorWithIndex2.h
Code/Common/itkImageLinearIteratorWithIndex2.txx

Code/Common/itkImageLinearConstIteratorWithIndex2.h
Code/Common/itkImageLinearConstIteratorWithIndex2.txx

and changed the example:
Examples/Iterators/ImageLinearIteratorWithIndex.cxx

The code is attached to this email.

The new ImageLinearIteratorWithIndex2 could also be called
ImageLinearIteratorWithoutIndex
because it does not update the index during the ++ and -- operations,
which speeds up the traversal.
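As a rough illustration of the difference (toy classes, not the attached
code and not the real ITK iterators), the first iterator below updates an
index on every step while the second only moves a pointer. The class names
are hypothetical; m_Jump and IncPos() are borrowed from the description in
this message, and the sketch assumes IncPos() is only meaningful when the
iteration direction is contiguous in memory (jump of 1).

```cpp
#include <cassert>
#include <cstddef>

// Toy iterator that keeps an index up to date on every step.
class LineIteratorWithIndex {
public:
  LineIteratorWithIndex(int* begin, std::ptrdiff_t jump)
    : m_Position(begin), m_Jump(jump), m_Index(0) { assert(begin != nullptr); }
  void operator++()     { ++m_Index; m_Position += m_Jump; }  // two updates
  long GetIndex() const { return m_Index; }
  int& Value()          { return *m_Position; }
private:
  int*           m_Position;
  std::ptrdiff_t m_Jump;
  long           m_Index;
};

// Toy iterator that only tracks the position in memory.
class LineIteratorWithoutIndex {
public:
  LineIteratorWithoutIndex(int* begin, std::ptrdiff_t jump)
    : m_Position(begin), m_Jump(jump) { assert(begin != nullptr); }
  void operator++() { m_Position += m_Jump; }  // one update, variable stride
  void IncPos()     { ++m_Position; }          // plain pointer increment
  int& Value()      { return *m_Position; }
private:
  int*           m_Position;
  std::ptrdiff_t m_Jump;
};
```

The index-free version does less work per step, at the price of not being
able to answer "where am I?" without recomputing it from the pointer.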

The ImageLinearIteratorWithIndex example basically flips an RGB
image in the X direction.
The idea is to compare the time of this operation using ITK with the
time of the equivalent operation using standard C programming
(directly accessing pointers to the data).

I created several procedures with slight changes to compare their
speed:

1. ProcessITK is the original code
2. ProcessITK1 replaces inputIt.Get() by inputIt.Value()
3. ProcessITK2 replaces outputIt.Set( inputIt.Value() )  by
outputIt.Value().Set(inputIt.Value().GetRed(),inputIt.Value().GetGreen(),inputIt.Value().GetBlue())
4. ProcessITK3 is like ProcessITK2 but using the new Iterator
5. ProcessITK4 is like ProcessITK3 but replaces the ++ and -- operations
by IncPos() and DecPos() which are actual ++ and -- on the pointers
6. ProcessPointer does the same operation (without ITK generality) in a
C style.
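For readers without the attachment, a pointer-based flip in X might look
roughly like the sketch below. This is not the attached ProcessPointer code;
the names RGBPixel and FlipXPointer are hypothetical, and it assumes a
row-major buffer of 8-bit RGB pixels with separate input and output buffers.
A 3D volume can be handled the same way by passing rows times slices as the
height.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 8-bit RGB pixel, stored contiguously.
struct RGBPixel {
  unsigned char r, g, b;
};

// Copies 'in' to 'out' with every row reversed (a flip in the X direction).
// Both buffers must hold width * height pixels and must not overlap.
inline void FlipXPointer(const RGBPixel* in, RGBPixel* out,
                         std::size_t width, std::size_t height) {
  assert(in != out);
  for (std::size_t y = 0; y < height; ++y) {
    const RGBPixel* src = in + y * width;           // first pixel of input row
    RGBPixel*       dst = out + y * width + width;  // one past end of output row
    for (std::size_t x = 0; x < width; ++x) {
      *--dst = *src++;  // walk input forward, output backward
    }
  }
}
```

The inner loop is just one pointer increment, one decrement and a 3-byte
copy per pixel, which is the baseline the ITK variants above are measured
against.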

The results are the following:

1.   17.51 sec
2.     9.94 sec
3.     3.54 sec
4.     1.64 sec
5.     0.81 sec
6.     0.62 sec

The details are in the file 'profile' but in summary:

1 --> 2 : we avoid creating and deleting an RGB value, which saves
approx. 6 sec (FixedArray constructor and destructor)
2 --> 3 : we avoid the operator= of FixedArray (loops over the number of
elements) and we save 6.74 sec
3 --> 4: not updating the index in the iterator decreases the time of ++
and -- operators, GoToEndOfLine() and NextLine() are also faster
4 --> 5: using ++ and -- instead of += m_Jump and -= m_Jump gains 1.1 sec
5 --> 6: there is still some overhead in the iterator, but a small
difference.

Surprisingly, the procedure GoToBegin() takes 0.05 sec although it is only
called twice,
and most of its time is spent calling
itk::ImageRegion<3u>::GetNumberOfPixels() const,
which just multiplies the different dimensions and puts the result in an
unsigned long (is it a bug of the processor or of the profiler??...).


Anyway, I think this experiment can be instructive, and it shows that
C++ can be as fast as C,
but with a lot of care.
Also some of the generality of ITK is lost (like casting from one type to
another), but for specific filters it is probably worth it.

Any comment is welcome,


Karl



-- 
Karl Krissian, PhD
Instructor in Radiology, Harvard Medical School
Laboratory of Mathematics in Imaging, Brigham and Women's Hospital
Tel:617-525-6232, Fax:617-525-6220





