[vtkusers] Running VTK filters thread-parallel using OpenMP => can that work at all? -- Is vtkMultiThreader the way to go?
Michael Bachmann
Michael.Bachmann at synopsys.com
Fri Feb 18 12:21:54 EST 2011
Hello Gib,
Having a CPU with four cores and getting a performance improvement of a factor of about 3.5 is not bad, right? Actually the implementation scales quite well.
In my experience, OpenMP is not disappointing in general when it comes to speeding up computations. There are use cases where it doesn't scale well, but they are the exception rather than the rule.
So: Any hints? Anyone?
Kind regards,
Michael
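
For reference, the figures quoted above (a speedup of about 3.5 on four cores) can be turned into a parallel efficiency and, under Amdahl's law, into an implied parallel fraction. The following is only a minimal sketch: the function names are made up for illustration, and Amdahl's law is an idealized model that ignores cache and memory-bandwidth effects.

```cpp
#include <cassert>

// Parallel efficiency: measured speedup divided by the number of cores.
double parallel_efficiency(double speedup, int cores)
{
    assert(cores > 0);
    return speedup / cores;
}

// Parallel fraction p implied by Amdahl's law, S = 1 / ((1 - p) + p / n),
// solved for p: p = (1 - 1/S) / (1 - 1/n).
double amdahl_parallel_fraction(double speedup, int cores)
{
    assert(cores > 1 && speedup > 0.0);
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores);
}
```

With a speedup of 3.5 on 4 cores this gives an efficiency of 0.875 and an implied parallel fraction of roughly 0.95.
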
Hi Michael,
Without making particular reference to VTK, I'd just like to make the
comment that getting OpenMP to work isn't the same as getting
worthwhile performance improvements from OpenMP. Caching issues,
memory bus bandwidth constraints etc. often make the whole experience
quite disappointing.
cheers
Gib
Quoting Michael Bachmann <Michael.Bachmann at synopsys.com>:
> Hi all!
>
> I implemented a functionality similar to the one sketched in the
> code snippet below this mail.
>
> Having read the information in the Wiki FAQ section
> (http://www.vtk.org/Wiki/VTK/FAQ#Is_VTK_thread-safe_.3F) I thought
> it should be possible to run different instances of the same
> pipeline in parallel threads, because of the following comment
> (although the FAQ section as a whole is somewhat contradictory):
>
> "However, if you are not interested in developing multithreaded
> filters but want to process some data in parallel using the same (or
> similar) pipeline, your job is much easier. To do this, create a
> different copy of the pipeline on each thread and execute them in
> parallel on a different piece of the data. This is best accomplished
> by using vtkThreadedController (instead of vtkMultiThreader). See
> the documentation of vtkMultiProcessController and
> vtkThreadedController and the examples in the parallel directory for
> details on how this can be done."
>
> Well, using OpenMP for threading turned out not to be the way to go,
> since data races occur at low levels in VTK, involving classes like
> vtkInformation, vtkInformationVector and vtkGarbageCollector. The
> result is unpredictable crashes and missing chunks of result data.
>
> From the example/test code that comes with vtkMultiThreader I
> can't see what this class does that the omp code below doesn't
> also do: the omp critical section introduces a mutex which prevents
> concurrent access to the container "ResultDataChunks". With
> vtkMultiThreader it is still necessary to work with vtkMutexLock
> instances and to avoid race conditions manually.
>
> What about the vtkInformation and vtkInformationVector data races?
> Does vtkMultiThreader do something internally that makes it safe to
> execute code thread-parallel?
>
> It took some time to get the current state of the code running and I
> do not want to invest too much time in a futile attempt to get it
> running thread-parallel. Still, this would be the cherry on top and
> I'd like to have that one :)
>
> vtkThreadedController is mentioned in the FAQ section referred to
> above, but it is not a part of VTK 5.x.x anymore. Somebody wrote in a
> mailing list that porting it was not so hard to do.
> But since it has been removed I think there was a solid reason to do so.
> In the code of some examples and tests vtkThreadedController is
> still referred to as "the default" which is used in case there is no
> vtkMultiProcessController available, but as already stated it is
> not a part of VTK anymore.
>
> There is also a vtkThreadedStreamingPipeline, but this
> implementation seems to have a different goal, since this class is
> intended to drive filters which are capable of handling data
> processed by different threads.
>
> But I want to run classes like vtkImageMarchingCubes, vtkThreshold,
> vtkClipPolyData, vtkCleanPolyData, ... in parallel, which are not
> aware of different threads.
>
> Any ideas?
>
> Or do I have to register some kind of prototype object somewhere
> which makes the whole stuff thread-safe, just like when registering
> a vtkThreadedStreamingPipeline via
> "vtkAlgorithm::SetDefaultExecutivePrototype(vtkThreadedStreamingPipeline::New());"?
> That would be nice :)
>
> Is multi-threading (the way I want to do it) possible or just a
> futile attempt with the current source state of vtk?
>
> Best regards,
>
> Michael
>
>
> vtkSmartPointer<vtkPolyData>
> Compute_Workpackages_Thread_Parallel(vtkImageData* InputData)
> {
>    std::vector< vtkSmartPointer<vtkPolyData> > ResultDataChunks;
>
>    // Divide the complete work into work-packages which can be
>    // handled by different threads
>    std::vector< Workpackage > Workpackages( Define_Workpackages(InputData) );
>    long lWholeSize(Workpackages.size());
>
> #pragma omp parallel
>    {
>       // NOTE: every thread operates on its own private instance of
>       // ResultDataChunksPerThread when inside the "omp parallel" block
>       std::vector< vtkSmartPointer<vtkPolyData> > ResultDataChunksPerThread;
>
> #pragma omp for schedule(dynamic)
>       for (long i = 0; i < lWholeSize; ++i)
>       {
>          // the code inside Compute_Workpackage_Result operates
>          // with a pipeline consisting of filter-classes like:
>          // - vtkImageMarchingCubes
>          // - vtkThreshold
>          // - vtkClipPolyData
>          // - vtkCleanPolyData
>          // - vtkTriangleFilter
>          // - vtkStripper
>          // - ...
>          // just to name a few.
>          ResultDataChunksPerThread.push_back(
>             Compute_Workpackage_Result(Workpackages[i], InputData) );
>       }
>
> #pragma omp critical
>       {
>          // collect the thread results
>          ResultDataChunks.insert(ResultDataChunks.end(),
>                                  ResultDataChunksPerThread.begin(),
>                                  ResultDataChunksPerThread.end());
>       }
>
>    } // omp parallel
>
>    // join the collected thread-results into one vtkPolyData object:
>    return Join_PolyData_Chunks(ResultDataChunks);
> }
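
The collection pattern used in the snippet above (a per-thread result vector that is merged into a shared vector inside an omp critical section) can be tried in isolation from VTK. The following is a minimal sketch under stated assumptions: `Workpackage`, `Chunk`, `compute_chunk` and `compute_all` are hypothetical stand-ins, and plain integers replace the vtkPolyData results. It only illustrates the merge logic and says nothing about whether VTK filters themselves can safely be called from several threads.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not VTK types.
struct Workpackage { long id; };
struct Chunk { long package_id; long value; };

// Hypothetical per-package computation; it touches no shared state.
Chunk compute_chunk(const Workpackage& wp)
{
    return Chunk{wp.id, wp.id * wp.id};
}

std::vector<Chunk> compute_all(const std::vector<Workpackage>& packages)
{
    std::vector<Chunk> results;
    const long n = static_cast<long>(packages.size());

#pragma omp parallel
    {
        // Declared inside the parallel region, so each thread has its own copy.
        std::vector<Chunk> local;

#pragma omp for schedule(dynamic)
        for (long i = 0; i < n; ++i)
        {
            local.push_back(compute_chunk(packages[static_cast<std::size_t>(i)]));
        }

#pragma omp critical
        {
            // Only one thread at a time appends to the shared vector.
            results.insert(results.end(), local.begin(), local.end());
        }
    }

    // Threads reach the critical section in arbitrary order, so the order
    // of the chunks in the result is not deterministic.
    assert(results.size() == packages.size());
    return results;
}
```

When compiled without OpenMP support the pragmas are ignored and the function runs serially, which gives a baseline to compare a parallel run against. Because the chunk order depends on thread timing, any later step that joins the chunks should not rely on it.
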