<div dir="ltr">Very nice, David. Thanks for your contributions. I have found SMPTools to be extremely easy to use, and the fact that it takes care of load balancing really simplifies the code and yields good results.<div>Best,</div><div>W</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Mar 10, 2016 at 8:19 AM, David Gobbi <span dir="ltr"><<a href="mailto:david.gobbi@gmail.com" target="_blank">david.gobbi@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Hi All,</div><div><br></div><div>As of today, the VTK imaging pipeline has a new implementation based</div><div>on vtkSMPTools. A big chunk of this work is from the "Shared Memory</div><div>Parallelism" Google Summer of Code project for 2015.</div><div><br></div><div>Many of you will not see much difference, because the imaging</div><div>pipeline has been multithreaded for many, many years. The advantages</div><div>of the new implementation are that it provides more performance tuning,</div><div>and it will be easier to maintain as VTK moves forward. The new SMP</div><div>code will only be used if you set VTK_SMP_IMPLEMENTATION_TYPE</div><div>to OpenMP or TBB in CMake. Otherwise, the old threading code will be</div><div>used.</div><div><br></div><div>The main difference with the new implementation is the load balancing.</div><div>Previously, the data was divided evenly (or roughly so) among the threads.</div><div>So if there were 10 slices and 8 threads, then 6 of the threads would get</div><div>one slice, and 2 of the threads would get 2 slices. Now, things follow</div><div>a different paradigm: the data is divided into a large number of pieces</div><div>that are queued for a thread pool. 
Pieces are assigned to threads based</div><div>on thread availability, and overall CPU utilization is improved because</div><div>the load balancing is done dynamically.</div><div><br></div><div><br></div><div>So, what performance tuning is available and what gains can you expect</div><div>to see? Let's look at the tuning first. The following new methods are</div><div>available for image filters derived from vtkThreadedImageAlgorithm:</div><div><br></div><div>// Enable or disable the new behavior for all filters.</div><div>static void SetGlobalDefaultEnableSMP(bool);</div><div><br></div><div>// Enable or disable the new behavior for just one filter.</div><div>void SetEnableSMP(bool);</div><div><br></div><div>// Set the size of the image pieces that will be sent to the</div><div>// threads for execution. The ideal size will depend on the</div><div>// memory use pattern of the image filter that is being used,</div><div>// but the default size of 65536 bytes works well for most.</div><div>void SetDesiredBytesPerPiece(vtkIdType bytes);</div><div><br></div><div>// Set the minimum size of piece to send to a thread. Obviously</div><div>// giving the threads one voxel at a time would be inefficient.</div><div>// A default minimum size of [16,1,1] ensures some contiguity.</div><div>void SetMinimumPieceSize(const int size[3]);</div><div><br></div><div>// Use pieces that are roughly square in shape (or cubic for 3D</div><div>// images). This provides best results for filters that operate</div><div>// on a neighborhood around each output voxel.</div><div>void SetSplitModeToBlock();</div><div><br></div><div>// Use slab-shaped pieces. This provides best results for filters</div><div>// that perform simple operations on the scalars, such as color mapping.</div><div>void SetSplitModeToSlab();</div><div><br></div><div>// Use thin rod-shaped pieces. This also provides good results for</div><div>// filters like color mapping. 
I haven't yet found any algorithms</div><div>// for which this splitting method is the best to use.</div><div>void SetSplitModeToBeam();</div><div><br></div><div><br></div><div>The performance improvements to be gained by tweaking these parameters</div><div>are modest, usually less than 20%, but sometimes much higher. As part</div><div>of this patch, I have added a new example to VTK called "ImageBenchmark"</div><div>that makes it easy to run a filter under different conditions in order to optimize</div><div>the settings. I'll create a wiki page in the future, but for now, you can run "ImageBenchmark --help" to get a comprehensive description of all the</div><div>options (assuming that you built VTK with BUILD_EXAMPLES=ON).</div><div><br></div><div>Cheers,</div><div> - David</div></div>
<br></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><div dir="ltr"><div>William J. Schroeder, PhD<br>Kitware, Inc. - Building the World's Technical Computing Software<br>28 Corporate Drive<br>Clifton Park, NY 12065<br><a href="mailto:will.schroeder@kitware.com" target="_blank">will.schroeder@kitware.com</a><br><a href="http://www.kitware.com" target="_blank">http://www.kitware.com</a><br>(518) 881-4902</div></div></div>
</div>