Re: [Insight-developers] Multi-threading strategies
Tomáš Kazmar
Tomash.Kazmar at seznam.cz
Mon Sep 10 16:48:25 EDT 2007
Have you considered using Boost.Thread library with threadpool?
http://www.boost.org/doc/html/thread.html
http://threadpool.sourceforge.net/
It seems like a good alternative in case zthreads does not prove usable.
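
To make the thread pool idea concrete, here is a minimal sketch. It assumes plain C++11 standard threads rather than Boost.Thread, threadpool, or zthreads, and the class SimplePool with its Execute() and Wait() methods is hypothetical, named only for illustration (loosely mirroring the execute/wait calls in Dan's quoted code).

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical fixed-size pool: workers are created once and reused,
// so repeated submissions do not pay for thread creation and joining.
class SimplePool {
public:
  explicit SimplePool(std::size_t numThreads) {
    assert(numThreads > 0);
    for (std::size_t i = 0; i < numThreads; ++i) {
      workers_.emplace_back([this] { this->WorkerLoop(); });
    }
  }

  // Lets queued jobs drain, then joins every worker.
  ~SimplePool() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    wake_.notify_all();
    for (std::thread& t : workers_) t.join();
  }

  SimplePool(const SimplePool&) = delete;
  SimplePool& operator=(const SimplePool&) = delete;

  // Queue one job; an idle worker picks it up.
  void Execute(std::function<void()> job) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      jobs_.push(std::move(job));
      ++pending_;
    }
    wake_.notify_one();
  }

  // Block until every job queued so far has finished.
  void Wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    idle_.wait(lock, [this] { return pending_ == 0; });
  }

private:
  void WorkerLoop() {
    for (;;) {
      std::function<void()> job;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        wake_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
        if (jobs_.empty()) return;  // stopping and nothing left to run
        job = std::move(jobs_.front());
        jobs_.pop();
      }
      job();  // run outside the lock so workers proceed in parallel
      {
        std::lock_guard<std::mutex> lock(mutex_);
        --pending_;
        if (pending_ == 0) idle_.notify_all();
      }
    }
  }

  std::mutex mutex_;
  std::condition_variable wake_;  // signals workers: job queued or stopping
  std::condition_variable idle_;  // signals Wait(): all jobs finished
  std::queue<std::function<void()>> jobs_;
  std::size_t pending_ = 0;       // jobs queued or currently running
  bool stopping_ = false;
  std::vector<std::thread> workers_;
};
```

A filter would call Execute() once per chunk and then Wait(), reusing the same pool across many metric evaluations instead of creating and joining threads each time. Whether that actually beats plain thread creation probably depends on the platform, as Torsten points out in the quoted thread.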
Tomas
# ------------ Original message ------------
# From: Bill Lorensen <bill.lorensen at gmail.com>
# Subject: Re: [Insight-developers] Multi-threading strategies
# Date: 10.9.2007 22:06:59
# ----------------------------------------
# Stephen,
#
# That is a good point. Maybe it's inactive because they mostly solved the
# problem and have a STABLE API.
#
# Bill
#
# On 9/10/07, Stephen R. Aylward <Stephen.Aylward at kitware.com> wrote:
# >
# > More background...
# >
# > We have been wanting to evaluate methods for using thread pools in ITK.
# > In particular, for registration metric computation, thread creation
# > and destruction seem to induce an overhead - particularly for 32+
# > processor systems and during optimization with millions of calls to the
# > metric. Our hope was that thread pools would reduce that overhead.
# >
# > Dan independently found zthreads and began evaluating it. For the sake
# > of evaluation of the technology, and if the technology proves useful, it
# > is probably better for us to begin with an inactive project (if it does
# > what we want) rather than reinventing it ourselves.
# >
# > Of course, I have much to learn about zthreads' particularities - it may
# > end up being a bad starting point...I think it is still a good direction
# > to pursue...we are waiting for our funding to start and then we'll jump
# > back into the foray...
# >
# > Stephen
# >
# > Bill Lorensen wrote:
# > > But, looking at the sourceforge site, the zthreads project appears to be
# > > inactive. And questions in the forums about porting seem to go
# > > unanswered.
# > >
# > > Bill
# > >
# > >
# > > On 9/10/07, Stephen R. Aylward <Stephen.Aylward at kitware.com> wrote:
# > >
# > > Hi Dan,
# > >
# > > I agree - I wouldn't expect the thread pool to pay off when processing a
# > > single filter - the concept of a pool pays off when processing a
# > > sequence of filters that would otherwise involve multiple thread
# > > creations and destructions.
# > >
# > > Even if zthreads doesn't pay off much for the main ITK pipeline (the
# > > improvement may only be minor for the 2-3 filter pipelines that are
# > > commonly used in ITK programs), I still think we should strongly
# > > consider it since specialized (i.e., tailored, within-filter)
# > > multi-threading is needed for deformable registration, DTI fiber
# > > tracking, registration metric computation, etc.
# > >
# > > Stephen
# > >
# > >
# > >
# > > Blezek, Daniel J (GE, Research) wrote:
# > > > Hi Gaëtan,
# > > >
# > > > I used an ITK Time Probe Collector, which I think reports in seconds.
# > > > I'm a little surprised at the 16 chunk results, and don't trust them.
# > > > I'll try the empty filter, but I think it will be very hard to time;
# > > > perhaps a profiled run would be more helpful. I'll also post
# > > > results of the bigger radius (didn't occur to me at the time).
# > > >
# > > > To answer Bill's question: I don't think we can conclusively say that
# > > > ZThreads are slower. They seem to be on par with the ITK version, but
# > > > I created the thread pool inside the filter, rather than a global
# > > > pool. I'll refactor the code to create the thread pool outside the
# > > > filter and run this all again.
# > > >
# > > > -dan
# > > >
# > > > -----Original Message-----
# > > > From: Gaëtan Lehmann [mailto:gaetan.lehmann at jouy.inra.fr]
# > > > Sent: Friday, September 07, 2007 3:13 PM
# > > > To: Blezek, Daniel J (GE, Research)
# > > > Subject: Re: [Insight-developers] Multi-threading strategies
# > > >
# > > >
# > > > Hi Dan,
# > > >
# > > > The execution times are in seconds? If yes, can you tell us how you
# > > > have measured the execution times of the median filters? The result
# > > > with 16 chunks is really surprising, and, from my (small) experience
# > > > in measuring execution times with ITK, can't be explained only by the
# > > > overhead of the thread management.
# > > >
# > > > It would also be interesting to have the execution time of a filter
# > > > which does nothing other than create the threads (by implementing an
# > > > empty ThreadedGenerateData() for example).
# > > >
# > > > To get longer execution times, you can simply run the median with a
# > > > bigger radius - the execution times should increase dramatically :-)
# > > >
# > > > Gaëtan
# > > >
# > > >
# > > >
# > > > On 7 Sept. 07 at 20:47, Blezek, Daniel J (GE, Research) wrote:
# > > >
# > > >> Hi all,
# > > >>
# > > >> I've done some looking around and found the ZThread library
# > > >> (http://zthread.sourceforge.net/index.html &
# > > >> http://sourceforge.net/projects/zthread). It's cross-platform and
# > > >> purports to compile on Linux and Windows, but I only tried Linux.
# > > >> The library has many constructs for threading including a thread
# > > >> pool execution model where you state how many threads you'd like
# > > >> and then feed it jobs. I replaced the GenerateData in the Median
# > > >> filter with ZThread library calls and ran some tests on 2 CPU and
# > > >> 8 CPU Linux boxes, running RedHat. I also varied the number of
# > > >> chunks each filter was divided into. ITK uses the number of
# > > >> threads to split the work.
# > > >>
# > > >> The reports below compare the ZThread (MedianZ) with the regular
# > > >> ITK thread model (Median).
# > > >>
# > > >> 8 CPU, 8 chunks
# > > >> Probe Tag    Starts    Stops    Time
# > > >> Median       1         1        0.373023109044879674911499023438
# > > >> MedianZ      1         1        0.410052934079430997371673583984
# > > >>
# > > >> 2 CPU, 2 chunks
# > > >> Probe Tag    Starts    Stops    Time
# > > >> Median       1         1        2.50991311680991202592849731445
# > > >> MedianZ      1         1        2.42412604950368404388427734375
# > > >>
# > > >> 8 CPU, 16 chunks
# > > >> Probe Tag    Starts    Stops    Time
# > > >> Median       1         1        0.412385921692475676536560058594
# > > >> MedianZ      1         1        2.42693911609239876270294189453
# > > >>
# > > >> 2 CPU, 4 chunks
# > > >> Probe Tag    Starts    Stops    Time
# > > >> Median       1         1        3.93622599844820797443389892578
# > > >> MedianZ      1         1        4.21256111224647611379623413086
# > > >>
# > > >>
# > > >> I think the 8 CPU, 16 chunks is a bit skewed, as the jobs are short
# > > >> enough that thread synchronization really slows everything down a
# > > >> bit. I imagine 8 way overhead is a bit higher than 2 way. On the 2
# > > >> CPU machine, the overhead was minimal.
# > > >>
# > > >> The Median image filter is a bad example as it runs so quickly:
# > > >> suggestions for a better test are welcome.
# > > >>
# > > >> Here's the relevant code from my testing; I can include all of it
# > > >> for interested parties. There is very little change from
# > > >> itkImageSource's implementation. In this case, I create the
# > > >> threads inside the filter, so thread creation is part of the
# > > >> overhead. In practice they would be in a globally accessible pool to
# > > >> be used by all executing filters.
# > > >>
# > > >> Comments welcome, -dan
# > > >>
# > > >>
# > > >>
# > >
# > > >> //----------------------------------------------------------------------------
# > > >> template< class TInputImage, class TOutputImage >
# > > >> void
# > > >> MedianZThreadImageFilter<TInputImage, TOutputImage>
# > > >> ::GenerateData()
# > > >> {
# > > >>   // Call a method that can be overridden by a subclass to allocate
# > > >>   // memory for the filter's outputs
# > > >>   this->AllocateOutputs();
# > > >>
# > > >>   // Call a method that can be overridden by a subclass to perform
# > > >>   // some calculations prior to splitting the main computations into
# > > >>   // separate threads
# > > >>   this->BeforeThreadedGenerateData();
# > > >>
# > > >>   // Do this with ZThread's
# > > >>   ZThread::PoolExecutor executor(this->GetMultiThreader()->GetNumberOfThreads());
# > > >>   typename TOutputImage::RegionType splitRegion;
# > > >>   int NumberOfPieces = 2 * this->GetMultiThreader()->GetNumberOfThreads();
# > > >>   try
# > > >>     {
# > > >>     for ( int i = 0; i < NumberOfPieces; i++ )
# > > >>       {
# > > >>       ZThreadStruct* s = new ZThreadStruct();
# > > >>       s->threadId = i;
# > > >>       s->Filter = this;
# > > >>       this->SplitRequestedRegion(s->threadId, NumberOfPieces, splitRegion);
# > > >>       s->region = splitRegion;
# > > >>       executor.execute ( s );
# > > >>       }
# > > >>     // Let it all finish
# > > >>     executor.wait();
# > > >>     }
# > > >>   catch ( ZThread::Synchronization_Exception &e )
# > > >>     {
# > > >>     itkGenericExceptionMacro ( << "Error adding runnable to executor: " << e.what() );
# > > >>     }
# > > >>
# > > >>   // Call a method that can be overridden by a subclass to perform
# > > >>   // some calculations after all the threads have completed
# > > >>   this->AfterThreadedGenerateData();
# > > >>
# > > >> }
# > > >>
# > > >>
# > > >>
# > > >> -----Original Message-----
# > > >> From: insight-developers-bounces+blezek=crd.ge.com at itk.org
# > > >> [mailto:insight-developers-bounces+blezek=crd.ge.com at itk.org]
# > > >> On Behalf Of Torsten Rohlfing
# > > >> Sent: Saturday, July 28, 2007 12:32 PM
# > > >> To: insight-developers at itk.org
# > > >> Subject: [Insight-developers] Multi-threading strategies
# > > >>
# > > >> Hi --
# > > >>
# > > >> I think you need to consider also that there's a cost to suspending
# > > >> and re-activating a thread. Do you know how you're going to do it?
# > > >> I assume a condition variable or something?
# > > >>
# > > >> From my personal experience, I can say that I considered this
# > > >> option once over creating new threads, and I tried it to some
# > > >> extent, but it did not lead to any tangible benefit using pthreads
# > > >> on Linux. Basically, the cost of using the condition variable with
# > > >> the added complexity of the implementation completely eliminated
# > > >> any benefit from avoiding thread creation and joining. There may of
# > > >> course be differences depending on your platform and the efficiency
# > > >> of its threads implementation.
# > > >>
# > > >> Which certainly still leaves the one advantage that by keeping
# > > >> threads around you avoid those incredibly annoying thread
# > > >> creation/annihilation messages in gdb ;)
# > > >>
# > > >> Cheers! Torsten
# > > >>
# > > >>> That is definitely the preferred method...go for it! :)
# > > >>>
# > > >>> Stephen
# > > >>>
# > > >>> Blezek, Daniel J (GE, Research) wrote:
# > > >>>> Hi all,
# > > >>>>
# > > >>>> I was debugging a multi-threaded registration metric today, and gdb
# > > >>>> nicely prints thread creation/destruction messages. In our current
# > > >>>> MultiThreader, pthreads are created/joined in the scatter/gather
# > > >>>> pattern. For a general filter, this isn't likely to be a problem,
# > > >>>> 'cause it executes only once (in general). For optimization metrics, it
# > > >>>> may be called thousands of times, leading to a huge number of pthreads
# > > >>>> created/joined. Is this efficient? Would it be worth while to
# > > >>>> investigate keeping threads around, rather than joining them? They
# > > >>>> could simply sit idle until they have something to do... This would
# > > >>>> reduce overhead, but may add complexity, but we only need to get it
# > > >>>> right once...
# > > >>>>
# > > >>>> Stephen Aylward: any comments?
# > > >> --
# > > >> Torsten Rohlfing, PhD          SRI International, Neuroscience Program
# > > >> Research Scientist             333 Ravenswood Ave, Menlo Park, CA 94025
# > > >> Phone: ++1 (650) 859-3379      Fax: ++1 (650) 859-2743
# > > >> torsten at synapse.sri.com    http://www.stanford.edu/~rohlfing/
# > > >>
# > > >> "Though this be madness, yet there is a method in't"
# > > >>
# > > >> _______________________________________________
# > > >> Insight-developers mailing list
# > > >> Insight-developers at itk.org
# > > >> http://www.itk.org/mailman/listinfo/insight-developers
# > > >
# > > > --
# > > > Gaëtan Lehmann
# > > > Biologie du Développement et de la Reproduction
# > > > INRA de Jouy-en-Josas (France)
# > > > tel: +33 1 34 65 29 66 fax: 01 34 65 29 09
# > > > http://voxel.jouy.inra.fr
# > > >
# > > >
# > > >
# > > >
# > >
# > > --
# > > =============================================================
# > > Stephen R. Aylward, Ph.D.
# > > Chief Medical Scientist
# > > Kitware, Inc. - Chapel Hill Office
# > > http://www.kitware.com
# > > Phone: (518)371-3971 x300
# > >
# > >
# >
# > --
# > =============================================================
# > Stephen R. Aylward, Ph.D.
# > Chief Medical Scientist
# > Kitware, Inc. - Chapel Hill Office
# > http://www.kitware.com
# > Phone: (518)371-3971 x300
# >
#
#
#
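
For comparison, the scatter/gather pattern discussed in the quoted thread (split the work, create threads, join them) can be sketched as follows. This again assumes C++11 standard threads, not ITK's MultiThreader; SplitRange() and ScatterGatherSum() are hypothetical names, and a 1-D sum stands in for processing an image region.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: half-open index range [first, second) of piece i
// when `total` items are divided into `pieces` nearly equal parts.
std::pair<std::size_t, std::size_t> SplitRange(std::size_t total,
                                               std::size_t pieces,
                                               std::size_t i) {
  assert(pieces > 0 && i < pieces);
  const std::size_t base = total / pieces;
  const std::size_t extra = total % pieces;  // first `extra` pieces get one more
  const std::size_t begin = i * base + (i < extra ? i : extra);
  const std::size_t size = base + (i < extra ? 1 : 0);
  return {begin, begin + size};
}

// Scatter/gather: create one thread per piece, join them all, then combine.
// Every call pays for creating and joining `pieces` threads.
long long ScatterGatherSum(const std::vector<int>& data, std::size_t pieces) {
  std::vector<long long> partial(pieces, 0);
  std::vector<std::thread> threads;
  threads.reserve(pieces);
  for (std::size_t i = 0; i < pieces; ++i) {
    threads.emplace_back([&data, &partial, pieces, i] {
      const auto range = SplitRange(data.size(), pieces, i);
      // Each thread writes only its own slot, so no locking is needed.
      partial[i] = std::accumulate(data.data() + range.first,
                                   data.data() + range.second, 0LL);
    });
  }
  for (std::thread& t : threads) t.join();  // gather
  return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Calling ScatterGatherSum() thousands of times, as an optimizer would call a metric, repeats the thread creation and joining on every call; that repeated cost is what a reusable pool is meant to avoid.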