<div dir="ltr"><div>(re-sent for the rest of the dev list)<br></div><div>Hi Bradley,<br><br></div>It's pretty fast. The interesting numbers
are for 20, 40, 80, and 160, which correspond to 1:1, 2:1, 4:1, and 8:1
thread-to-core ratios. Starting from the already configured
ITKLinuxPOWER8 currently being built, I did a ninja clean and then "time
ninja -jN". Watching the CPU load at 20, 40, and 80 threads, though, I
see a fair amount of both process migration and unbalanced thread
distribution, e.g. for -j20 I'll often see 2 cores with 6 or 8 threads
and the rest with only 1 or 2. So in addition to the plain -jN runs, I
also ran 20, 40, and 80 threads under numactl with fixed binding to
physical CPU cores to evenly distribute the threads across cores and
prevent thread migration. See timings below in seconds:<br><br><table dir="ltr" style="table-layout:fixed;font-size:13px;font-family:arial,sans,sans-serif;border-collapse:collapse;border:1px solid rgb(204,204,204)" border="1" cellpadding="0" cellspacing="0" height="204" width="481"><colgroup><col width="103"><col width="59"><col width="66"><col width="59"><col width="100"></colgroup><tbody><tr style="height:21px"><td style="padding:2px 3px;vertical-align:bottom;text-align:right">Threads</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">Real</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">User</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">Sys</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">Total CPU Time</td></tr><tr style="height:21px"><td style="padding:2px 3px;vertical-align:bottom;text-align:right">20</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">1037.097</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">19866.685</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">429.796</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">20296.481</td></tr><tr style="height:21px"><td style="padding:2px 3px;vertical-align:bottom;text-align:right"><b>(Numa Bind) 20</b></td><td style="padding:2px 3px;vertical-align:bottom;text-align:right"><b>915.910</b></td><td style="padding:2px 3px;vertical-align:bottom;text-align:right"><b>16290.589</b></td><td style="padding:2px 3px;vertical-align:bottom;text-align:right"><b>319.017</b></td><td style="padding:2px 3px;vertical-align:bottom;text-align:right"><b>16609.606</b></td></tr><tr style="height:21px"><td style="padding:2px 3px;vertical-align:bottom;text-align:right">40</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">713.772</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">26953.663</td><td style="padding:2px 
3px;vertical-align:bottom;text-align:right">556.960</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">27510.623</td></tr><tr style="height:21px"><td style="padding:2px 3px;vertical-align:bottom;text-align:right">(Numa Bind) 40</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">641.924</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">22442.685</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">432.379</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">22875.064</td></tr><tr style="height:21px"><td style="padding:2px 3px;vertical-align:bottom;text-align:right">80</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">588.357</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">40970.439</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">822.944</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">41793.383</td></tr><tr style="height:21px"><td style="padding:2px 3px;vertical-align:bottom;text-align:right"><b>(Numa Bind) 80</b></td><td style="padding:2px 3px;vertical-align:bottom;text-align:right"><b>538.801</b></td><td style="padding:2px 3px;vertical-align:bottom;text-align:right"><b>35366.297</b></td><td style="padding:2px 3px;vertical-align:bottom;text-align:right"><b>637.922</b></td><td style="padding:2px 3px;vertical-align:bottom;text-align:right"><b>36004.219</b></td></tr><tr style="height:21px"><td style="padding:2px 3px;vertical-align:bottom;text-align:right">160</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">572.492</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">62542.901</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">1289.864</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">63832.765</td></tr><tr style="height:21px"><td style="padding:2px 3px;vertical-align:bottom;text-align:right">(Numa Bind) 
160</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">549.742</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">61864.666</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">1242.975</td><td style="padding:2px 3px;vertical-align:bottom;text-align:right">63107.641</td></tr></tbody></table><br><div><br><br>So it seems like core binding gives us an approximate 10%
reduction in wall-clock time at 20, 40, and 80 threads (roughly 12%, 10%,
and 8%), dropping to about 4% at 160 threads. And while the core-locked 4:1
run clearly gave us the best wall-clock time, the total CPU time
(user+sys) shows the core-locked 1:1 run to be the most efficient for actual cycles
used.<br><br></div><div>It's interesting to watch how the whole system
gets used up for most of the build but everything gets periodically
gated on a handful of linker processes. And of course, it's always
cool to see a screen cap of htop with a whole boatload of cores at 100%.<br></div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature"><div dir="ltr">- Chuck<br></div></div></div>
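<div><br>For anyone who wants to re-derive the percentages from the timing table, here is a minimal Python sketch. It is not part of the original measurements: the numbers are copied from the table, and the names <code>TIMINGS</code>, <code>binding_gain</code>, and <code>cpu_seconds</code> are made up for illustration.<br></div>
<pre>
```python
# Real (wall-clock), user, and sys times in seconds, copied from the table above.
# Keys are the ninja -jN job counts; each entry holds the plain run ("unbound")
# and the numactl core-bound run ("bound"). All names here are illustrative.
TIMINGS = {
    20: {"unbound": (1037.097, 19866.685, 429.796),
         "bound": (915.910, 16290.589, 319.017)},
    40: {"unbound": (713.772, 26953.663, 556.960),
         "bound": (641.924, 22442.685, 432.379)},
    80: {"unbound": (588.357, 40970.439, 822.944),
         "bound": (538.801, 35366.297, 637.922)},
    160: {"unbound": (572.492, 62542.901, 1289.864),
          "bound": (549.742, 61864.666, 1242.975)},
}


def binding_gain(jobs):
    """Percent reduction in wall-clock time from core binding at a job count."""
    real_unbound = TIMINGS[jobs]["unbound"][0]
    real_bound = TIMINGS[jobs]["bound"][0]
    return 100.0 * (real_unbound - real_bound) / real_unbound


def cpu_seconds(jobs, mode):
    """Total CPU time (user + sys) in seconds for one run."""
    _, user, sys_time = TIMINGS[jobs][mode]
    return user + sys_time


if __name__ == "__main__":
    for jobs in sorted(TIMINGS):
        print(f"-j{jobs}: binding saves {binding_gain(jobs):.1f}% wall-clock, "
              f"bound CPU time {cpu_seconds(jobs, 'bound'):.0f} s")
```
</pre>
<div>Run as a script, this prints one line per job count. The wall-clock gain from binding shrinks as the job count rises, and the bound -j20 run has the lowest total CPU time.<br></div>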
<br><div class="gmail_quote">On Thu, Apr 23, 2015 at 10:01 AM, Bradley Lowekamp <span dir="ltr"><<a href="mailto:blowekamp@mail.nih.gov" target="_blank">blowekamp@mail.nih.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Matt,<br>
<br>
I'd love to explore the build performance of this system.<br>
<br>
Any chance you could run clean builds of ITK on this system with 20, 40, 60, 80, 100, 120, 140, and 160 processes and record the timings?<br>
<br>
I am very curious how this unique system scales with multiple heavyweight processes, as its design appears to be uniquely suited to lighter-weight multi-threading.<br>
<br>
Thanks,<br>
Brad<br>
<div><div class="h5"><br>
On Apr 22, 2015, at 11:51 PM, Matt McCormick <<a href="mailto:matt.mccormick@kitware.com">matt.mccormick@kitware.com</a>> wrote:<br>
<br>
> Hi folks,<br>
><br>
> With thanks to Chuck Atkins and FSF France, we have a new build on the<br>
> dashboard [1] for the IBM POWER8 [2] system. This is a PowerPC64<br>
> system with 20 cores and 8 threads per core -- a great system where we<br>
> can test and improve ITK parallel computing performance!<br>
><br>
><br>
> To generate a test build on Gerrit, add<br>
><br>
> request build: power8<br>
><br>
> in a review's comments.<br>
><br>
><br>
> There are currently some build warnings and test failures that should<br>
> be addressed before we will be able to use the system effectively. Any<br>
> help here is appreciated.<br>
><br>
> Thanks,<br>
> Matt<br>
><br>
><br>
> [1] <a href="https://open.cdash.org/index.php?project=Insight&date=2015-04-22&filtercount=1&showfilters=1&field1=site/string&compare1=63&value1=gcc112" target="_blank">https://open.cdash.org/index.php?project=Insight&date=2015-04-22&filtercount=1&showfilters=1&field1=site/string&compare1=63&value1=gcc112</a><br>
><br>
> [2] <a href="https://en.wikipedia.org/wiki/POWER8" target="_blank">https://en.wikipedia.org/wiki/POWER8</a><br>
</div></div>
<br>
</blockquote></div><br></div></div>