[Insight-developers] Multi-threading strategies
Gaëtan Lehmann
gaetan.lehmann at jouy.inra.fr
Fri Sep 7 15:13:20 EDT 2007
Hi Dan,
Are the execution times in seconds?
If so, can you tell us how you measured the execution times of
the median filters? The result with 16 chunks is really surprising
and, from my (limited) experience of measuring execution times with
ITK, can't be explained by thread management overhead alone.
It would also be interesting to have the execution time of a filter
which does nothing other than create the threads (by implementing an
empty ThreadedGenerateData(), for example).
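
A rough version of that baseline can also be measured outside ITK. The following is only a minimal sketch, assuming C++11 std::thread instead of ITK's MultiThreader; MeanSpawnJoinSeconds is a hypothetical helper, not an ITK function. It times rounds of creating and joining threads whose body is empty, which approximates a filter with an empty ThreadedGenerateData().

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not part of ITK): spawn `numThreads` threads that do
// nothing, join them, and repeat this `repetitions` times. Returns the mean
// wall-clock time in seconds of one spawn/join round.
double MeanSpawnJoinSeconds(std::size_t numThreads, std::size_t repetitions)
{
  assert(repetitions > 0);
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t r = 0; r < repetitions; ++r)
  {
    std::vector<std::thread> threads;
    threads.reserve(numThreads);
    for (std::size_t i = 0; i < numThreads; ++i)
    {
      // Empty body: stands in for an empty ThreadedGenerateData().
      threads.emplace_back([] {});
    }
    for (auto& t : threads)
    {
      t.join();
    }
  }
  const std::chrono::duration<double> elapsed =
    std::chrono::steady_clock::now() - start;
  return elapsed.count() / static_cast<double>(repetitions);
}
```

Something like MeanSpawnJoinSeconds(8, 1000) gives a per-execution cost that can be compared with the filter timings. The value depends on the platform and its threads implementation, so it is only an estimate of the overhead inside ITK.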
To get longer execution times, you can simply run the median with a
bigger radius - the execution times should increase dramatically :-)
Gaëtan
On 7 Sept. 07, at 20:47, Blezek, Daniel J (GE, Research) wrote:
> Hi all,
>
> I've done some looking around and found the ZThread library
> (http://zthread.sourceforge.net/index.html and
> http://sourceforge.net/projects/zthread). It's cross-platform and
> purports to compile on Linux and Windows, but I only tried Linux.
> The library has many constructs for threading, including a thread
> pool execution model where you state how many threads you'd like
> and then feed it jobs. I replaced the GenerateData in the Median
> filter with ZThread library calls and ran some tests on 2 CPU and
> 8 CPU Linux boxes running RedHat. I also varied the number of
> chunks each filter was divided into. ITK uses the number of threads
> to split the work.
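
For readers unfamiliar with the pool execution model described here (a fixed number of threads, fed a larger number of jobs), a minimal sketch follows. It uses only the C++ standard library, not ZThread, and makes no claim about how ZThread::PoolExecutor is implemented; SimplePool and SumWithPool are hypothetical names chosen for illustration.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical fixed-size pool: the workers are created once and then pull
// jobs from a shared queue until the pool is destroyed.
class SimplePool
{
public:
  // numThreads must be > 0, otherwise Wait() would never return.
  explicit SimplePool(std::size_t numThreads)
  {
    for (std::size_t i = 0; i < numThreads; ++i)
    {
      m_Workers.emplace_back([this] { this->WorkerLoop(); });
    }
  }

  ~SimplePool()
  {
    {
      std::lock_guard<std::mutex> lock(m_Mutex);
      m_Stopping = true;
    }
    m_WorkAvailable.notify_all();
    for (auto& w : m_Workers)
    {
      w.join();
    }
  }

  // Queue one job; any idle worker may pick it up.
  void Execute(std::function<void()> job)
  {
    {
      std::lock_guard<std::mutex> lock(m_Mutex);
      m_Jobs.push(std::move(job));
      ++m_Pending;
    }
    m_WorkAvailable.notify_one();
  }

  // Block until every queued job has finished.
  void Wait()
  {
    std::unique_lock<std::mutex> lock(m_Mutex);
    m_AllDone.wait(lock, [this] { return m_Pending == 0; });
  }

private:
  void WorkerLoop()
  {
    for (;;)
    {
      std::function<void()> job;
      {
        std::unique_lock<std::mutex> lock(m_Mutex);
        m_WorkAvailable.wait(lock, [this] { return m_Stopping || !m_Jobs.empty(); });
        if (m_Jobs.empty())
        {
          return; // stopping and nothing left to run
        }
        job = std::move(m_Jobs.front());
        m_Jobs.pop();
      }
      job(); // run outside the lock so other workers can proceed
      {
        std::lock_guard<std::mutex> lock(m_Mutex);
        --m_Pending;
      }
      m_AllDone.notify_all();
    }
  }

  std::mutex m_Mutex;
  std::condition_variable m_WorkAvailable;
  std::condition_variable m_AllDone;
  std::queue<std::function<void()>> m_Jobs;
  std::size_t m_Pending = 0;
  bool m_Stopping = false;
  std::vector<std::thread> m_Workers; // declared last: started after the rest exists
};

// Usage example: numJobs "chunks" on numThreads workers, each adding its index.
int SumWithPool(int numThreads, int numJobs)
{
  std::atomic<int> sum{0};
  SimplePool pool(static_cast<std::size_t>(numThreads));
  for (int i = 1; i <= numJobs; ++i)
  {
    pool.Execute([&sum, i] { sum += i; });
  }
  pool.Wait();
  return sum.load();
}
```

The number of chunks (jobs) is independent of the number of threads, which is what the chunk counts in the reports vary.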
>
> The reports below compare the ZThread version (MedianZ) with the
> regular ITK thread model (Median).
>
> 8 CPU, 8 chunks
> Probe Tag   Starts   Stops   Time
> Median      1        1       0.373023109044879674911499023438
> MedianZ     1        1       0.410052934079430997371673583984
>
> 2 CPU, 2 chunks
> Probe Tag   Starts   Stops   Time
> Median      1        1       2.50991311680991202592849731445
> MedianZ     1        1       2.42412604950368404388427734375
>
> 8 CPU, 16 chunks
> Probe Tag   Starts   Stops   Time
> Median      1        1       0.412385921692475676536560058594
> MedianZ     1        1       2.42693911609239876270294189453
>
> 2 CPU, 4 chunks
> Probe Tag   Starts   Stops   Time
> Median      1        1       3.93622599844820797443389892578
> MedianZ     1        1       4.21256111224647611379623413086
>
>
> I think the 8 CPU, 16 chunks result is a bit skewed, as the jobs
> are short enough that thread synchronization really slows
> everything down. I imagine 8-way overhead is a bit higher than
> 2-way. On the 2 CPU machine, the overhead was minimal.
>
> The Median image filter is a bad example as it runs so quickly:
> suggestions for a better test are welcome.
>
> Here's the relevant code from my testing; I can send all of it
> to interested parties. There is very little change from
> itkImageSource's implementation. In this case, I create the
> threads inside the filter, so thread creation is part of the
> overhead. In practice they would live in a globally accessible
> pool used by all executing filters.
>
> Comments welcome,
> -dan
>
>
> //----------------------------------------------------------------------------
> template< class TInputImage, class TOutputImage >
> void
> MedianZThreadImageFilter<TInputImage, TOutputImage>
> ::GenerateData()
> {
>   // Call a method that can be overridden by a subclass to allocate
>   // memory for the filter's outputs
>   this->AllocateOutputs();
>
>   // Call a method that can be overridden by a subclass to perform
>   // some calculations prior to splitting the main computations into
>   // separate threads
>   this->BeforeThreadedGenerateData();
>
>   // Do this with ZThreads: one pool thread per ITK thread,
>   // and twice as many pieces as threads
>   ZThread::PoolExecutor executor(this->GetMultiThreader()->GetNumberOfThreads());
>   typename TOutputImage::RegionType splitRegion;
>   int NumberOfPieces = 2 * this->GetMultiThreader()->GetNumberOfThreads();
>   try
>     {
>     for ( int i = 0; i < NumberOfPieces; i++ )
>       {
>       // Each piece of the requested region becomes one job for the pool
>       ZThreadStruct* s = new ZThreadStruct();
>       s->threadId = i;
>       s->Filter = this;
>       this->SplitRequestedRegion(s->threadId, NumberOfPieces, splitRegion);
>       s->region = splitRegion;
>       executor.execute ( s );
>       }
>     // Let it all finish
>     executor.wait();
>     }
>   catch ( ZThread::Synchronization_Exception &e )
>     {
>     itkGenericExceptionMacro ( << "Error adding runnable to executor: " << e.what() );
>     }
>
>   // Call a method that can be overridden by a subclass to perform
>   // some calculations after all the threads have completed
>   this->AfterThreadedGenerateData();
> }
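
The loop above asks for a fixed number of pieces, but a region cannot always be cut into that many. The sketch below illustrates the arithmetic on a one-dimensional extent (for example the number of slices or rows). It is an assumption-laden illustration, not ITK's SplitRequestedRegion: Piece and SplitExtent are hypothetical names, and the ceiling-division scheme is just one common way to split. As far as I recall, ITK's SplitRequestedRegion returns the number of pieces it could actually produce, which is worth checking when asking for more pieces than threads.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one contiguous piece of a 1-D extent.
struct Piece
{
  std::size_t start;
  std::size_t size;
};

// Split [0, extent) into at most `requested` contiguous pieces using
// ceiling division. Fewer pieces than requested may come back when the
// extent does not divide evenly or is smaller than `requested`.
std::vector<Piece> SplitExtent(std::size_t extent, std::size_t requested)
{
  assert(requested > 0);
  std::vector<Piece> pieces;
  if (extent == 0)
  {
    return pieces;
  }
  const std::size_t perPiece = (extent + requested - 1) / requested; // ceil
  for (std::size_t start = 0; start < extent; start += perPiece)
  {
    const std::size_t remaining = extent - start;
    const std::size_t size = (remaining < perPiece) ? remaining : perPiece;
    pieces.push_back({ start, size });
  }
  return pieces;
}
```

With an extent of 100 and 16 requested pieces, this scheme yields 15 pieces of 7 elements, the last one holding only 2. A caller that assumes all 16 jobs are valid would then submit one job with no meaningful region.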
>
>
>
> -----Original Message-----
> From: insight-developers-bounces+blezek=crd.ge.com at itk.org
> [mailto:insight-developers-bounces+blezek=crd.ge.com at itk.org] On
> Behalf Of Torsten Rohlfing
>
> Sent: Saturday, July 28, 2007 12:32 PM
> To: insight-developers at itk.org
> Subject: [Insight-developers] Multi-threading strategies
>
> Hi --
>
> I think you need to consider also that there's a cost to suspending
> and re-activating a thread. Do you know how you're going to do it?
> I assume a condition variable or something?
>
> From my personal experience, I can say that I considered this
> option once over creating new threads, and I tried it to some
> extent, but it did not lead to any tangible benefit using pthreads
> on Linux. Basically, the cost of using the condition variable with
> the added complexity of the implementation completely eliminated
> any benefit from avoiding thread creation and joining. There may of
> course be differences depending on your platform and the efficiency
> of its threads implementation.
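
The two strategies compared in this paragraph can be sketched side by side. This is a minimal illustration using the C++ standard library rather than raw pthreads, and all names (ParkedWorker, CountWithParkedWorker, CountWithSpawnJoin) are hypothetical. It shows the mechanics only and makes no claim about which strategy is faster on a given platform.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical worker that is created once and then parked on a condition
// variable between jobs. Intended for a single calling thread.
class ParkedWorker
{
public:
  ParkedWorker() : m_Thread([this] { this->Loop(); }) {}

  ~ParkedWorker()
  {
    {
      std::lock_guard<std::mutex> lock(m_Mutex);
      m_Quit = true;
    }
    m_Wake.notify_one();
    m_Thread.join();
  }

  // Wake the parked thread with one job and block until the job has run.
  void RunAndWait(std::function<void()> job)
  {
    std::unique_lock<std::mutex> lock(m_Mutex);
    m_Job = std::move(job);
    m_HasJob = true;
    m_Wake.notify_one();
    m_Done.wait(lock, [this] { return !m_HasJob; });
  }

private:
  void Loop()
  {
    std::unique_lock<std::mutex> lock(m_Mutex);
    for (;;)
    {
      // Parked here until there is a job or shutdown is requested.
      m_Wake.wait(lock, [this] { return m_Quit || m_HasJob; });
      if (!m_HasJob)
      {
        return; // shutdown requested and no job pending
      }
      std::function<void()> job = std::move(m_Job);
      lock.unlock();
      job();
      lock.lock();
      m_HasJob = false;
      m_Done.notify_one();
    }
  }

  std::mutex m_Mutex;
  std::condition_variable m_Wake;
  std::condition_variable m_Done;
  std::function<void()> m_Job;
  bool m_HasJob = false;
  bool m_Quit = false;
  std::thread m_Thread; // declared last: started after the rest exists
};

// Strategy 1: keep one thread around and wake it for each call.
int CountWithParkedWorker(int calls)
{
  int n = 0;
  ParkedWorker worker;
  for (int i = 0; i < calls; ++i)
  {
    worker.RunAndWait([&n] { ++n; });
  }
  return n;
}

// Strategy 2: create and join a fresh thread for each call.
int CountWithSpawnJoin(int calls)
{
  int n = 0;
  for (int i = 0; i < calls; ++i)
  {
    std::thread t([&n] { ++n; });
    t.join();
  }
  return n;
}
```

Timing the two functions with the same number of calls gives a direct comparison for a given platform. Each wake-up in the parked version costs two condition-variable handshakes, which is the overhead this paragraph refers to.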
>
> Which certainly still leaves the one advantage that by keeping
> threads around you avoid those incredibly annoying thread creation/
> annihilation messages in gdb ;)
>
> Cheers!
> Torsten
>
> > That is definitely the preferred method...go for it! :)
> >
> > Stephen
> >
> > Blezek, Daniel J (GE, Research) wrote:
> > > Hi all,
> > >
> > > I was debugging a multi-threaded registration metric today, and
> > > gdb nicely prints thread creation/destruction messages. In our
> > > current MultiThreader, pthreads are created/joined in the
> > > scatter/gather pattern. For a general filter, this isn't likely
> > > to be a problem, 'cause it executes only once (in general). For
> > > optimization metrics, it may be called thousands of times,
> > > leading to a huge number of pthreads created/joined. Is this
> > > efficient? Would it be worth while to investigate keeping
> > > threads around, rather than joining them? They could simply sit
> > > idle until they have something to do... This would reduce
> > > overhead, but may add complexity, but we only need to get it
> > > right once...
> > >
> > > Stephen Aylward: any comments?
>
> --
> Torsten Rohlfing, PhD, Research Scientist
> SRI International, Neuroscience Program
> 333 Ravenswood Ave, Menlo Park, CA 94025
> Phone: ++1 (650) 859-3379   Fax: ++1 (650) 859-2743
> torsten at synapse.sri.com   http://www.stanford.edu/~rohlfing/
>
> "Though this be madness, yet there is a method in't"
>
--
Gaëtan Lehmann
Biologie du Développement et de la Reproduction
INRA de Jouy-en-Josas (France)
tel: +33 1 34 65 29 66 fax: 01 34 65 29 09
http://voxel.jouy.inra.fr