[Insight-developers] Performance Improvements for DiscreteGaussianImageFilter
Bradley Lowekamp
blowekamp at mail.nih.gov
Tue Jun 1 13:33:12 EDT 2010
Hello,
http://www.itk.org/Bug/view.php?id=10785
I have noticed a few oddities in the way DiscreteGaussianImageFilter is implemented which impact its performance. Unlike the recursive version, this filter is streamable, so I find it useful in many situations.
What is unique about this filter is that internally it uses StreamingImageFilter to stream its internal composite pipeline. It sets the number of streaming regions to ImageDimension^2, without any user control. Due to the way the pipeline is implemented, this can result in extra computations in the overlap regions, and in reduced multi-threaded CPU utilization due to the excessive number of internal filter executions. This can be particularly bad when the pipeline is already streaming and the filter is also "sub-streaming". I propose the following improvements:
1) Enable a user-settable number of divisions with an IVAR:
/** \brief Set/Get number of pieces to divide the input on the
* internal composite pipeline. The upstream pipeline will not be
* affected.
*
* The default value is $ImageDimension^2$.
*
* This parameter was introduced to reduce the size of images
* internally, at the cost of performance.
*/
itkSetMacro(InternalNumberOfStreamDivisions,unsigned int);
itkGetConstReferenceMacro(InternalNumberOfStreamDivisions,unsigned int);
2) Reverse the internal pipeline. Reverse the order of the 1-D convolutions from XYZ to ZYX. Because the streamer slices in the z direction, the remaining dimensions are already expanded and will not grow in the GenerateInputRequestedRegion phase of the pipeline. Therefore, to minimize computation, the convolution along the slice direction Z should be done first. This results in no recomputation or overlapping regions from the convolution filters.
Depending on the size of the image and the amount of cache, the performance gain from this improvement is 20-60%. I have attached the enhanced filter and a test driver.
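To illustrate why the ordering matters, here is a rough sketch of a simplified model of the requested-region propagation. It is not ITK code: VoxelsComputed, the kernel radius, and the clamping logic are all assumptions made for illustration. Each 1-D filter is assumed to pad its requested region by the kernel radius along its own axis only, clipped to the image bounds.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Hypothetical helper (not ITK code): counts the output voxels computed by
// three chained 1-D convolution filters when the final output is requested
// in `pieces` slabs along z. order[0] is the axis of the first (upstream)
// filter, order[2] the axis of the last filter before the streamer.
long long VoxelsComputed(const std::array<int, 3>& size,
                         const std::array<int, 3>& order,
                         int radius, int pieces)
{
  assert(pieces > 0 && radius >= 0);
  long long total = 0;
  for (int p = 0; p < pieces; ++p)
  {
    // Region requested from the last filter: full x/y extent, one z slab.
    std::array<int, 3> lo = {0, 0,
      static_cast<int>(1LL * p * size[2] / pieces)};
    std::array<int, 3> hi = {size[0], size[1],
      static_cast<int>(1LL * (p + 1) * size[2] / pieces)};

    // Walk upstream from the last filter to the first.
    for (int f = 2; f >= 0; --f)
    {
      // This filter must compute every voxel of its requested output region.
      total += 1LL * (hi[0] - lo[0]) * (hi[1] - lo[1]) * (hi[2] - lo[2]);

      // Its input request grows by the kernel radius along its own axis,
      // clipped to the image bounds.
      const int axis = order[f];
      lo[axis] = std::max(0, lo[axis] - radius);
      hi[axis] = std::min(size[axis], hi[axis] + radius);
    }
  }
  return total;
}
```

In this model, with order {0, 1, 2} (XYZ) the z padding is requested by the last filter, so the two upstream filters recompute the overlapping slices for every piece. With order {2, 1, 0} (ZYX) the z padding is only requested from the input image, and each filter computes every voxel exactly once regardless of the number of pieces.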
creating image with size: [128, 128, 128]
randomizing image
=======================================
Executed base with variance 0.0625 in 0.0746817s
Executed base with variance 0.25 in 0.105442s
Executed base with variance 1 in 0.122377s
Executed base with variance 4 in 0.202058s
Executed base with variance 16 in 0.517367s
Executed base with variance 64 in 2.2312s
=======================================
Executed improved ordered with variance 0.0625 in 0.0739802s
speedup 0.00939382
Executed improved ordered with variance 0.25 in 0.102979s
speedup 0.0233524
Executed improved ordered with variance 1 in 0.117431s
speedup 0.0404161
Executed improved ordered with variance 4 in 0.176935s
speedup 0.124332
Executed improved ordered with variance 16 in 0.349574s
speedup 0.324321
Executed improved ordered with variance 64 in 0.969516s
speedup 0.565473
=======================================
Executed with 1 stream division with variance 0.0625 in 0.0664336s
speedup 0.110444
Executed with 1 stream division with variance 0.25 in 0.0864969s
speedup 0.179669
Executed with 1 stream division with variance 1 in 0.110848s
speedup 0.0942072
Executed with 1 stream division with variance 4 in 0.168331s
speedup 0.166916
Executed with 1 stream division with variance 16 in 0.394602s
speedup 0.237289
Executed with 1 stream division with variance 64 in 1.15303s
speedup 0.483224
=======================================
Executed recursive gaussian with variance 0.0625 in 0.0472106s
speedup 0.367842
Executed recursive gaussian with variance 0.25 in 0.0468103s
speedup 0.556054
Executed recursive gaussian with variance 1 in 0.0453837s
speedup 0.629148
Executed recursive gaussian with variance 4 in 0.0511486s
speedup 0.746861
Executed recursive gaussian with variance 16 in 0.0571482s
speedup 0.88954
Executed recursive gaussian with variance 64 in 0.0671696s
speedup 0.969895
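For reading the numbers above: the "speedup" values are consistent with the fractional reduction in run time relative to the base run at the same variance (so 0.565473 means about 57% less time). The attached test driver is not reproduced here, so the helper below is only a guess at the formula, with a hypothetical name.

```cpp
#include <cassert>

// Hypothetical helper: fraction of the base run time that was saved.
// 0 means no change; values approaching 1 mean almost all time was saved.
double Speedup(double baseSeconds, double seconds)
{
  assert(baseSeconds > 0.0);
  return (baseSeconds - seconds) / baseSeconds;
}
```

For example, the base and reordered timings at variance 64 (2.2312s and 0.969516s) give approximately the reported 0.565473.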
Clearly the SmoothingRecursiveGaussianImageFilter is the performance winner here (though the synthetic image may not be giving accurate results), and future efforts should really include making that filter streamable.
However, I still think these improvements are important to contribute. As I believe they are fully backwards compatible, I will try to commit them this week if no one sees any issues.
Thanks,
Brad
========================================================
Bradley Lowekamp
Lockheed Martin Contractor for
Office of High Performance Computing and Communications
National Library of Medicine
blowekamp at mail.nih.gov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: itkLocalDiscreteGaussianImageFilter.h
Type: application/octet-stream
Size: 9628 bytes
Desc: not available
URL: <http://www.itk.org/mailman/private/insight-developers/attachments/20100601/8e0fa1d2/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: itkLocalDiscreteGaussianImageFilter.txx
Type: application/octet-stream
Size: 11942 bytes
Desc: not available
URL: <http://www.itk.org/mailman/private/insight-developers/attachments/20100601/8e0fa1d2/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: itkGaussianStreamingPerformance.cxx
Type: application/octet-stream
Size: 4822 bytes
Desc: not available
URL: <http://www.itk.org/mailman/private/insight-developers/attachments/20100601/8e0fa1d2/attachment-0002.obj>