[Paraview] Programmable filter in parallel

Thu Aug 11 11:11:29 EDT 2011

David,

Thanks for your response. It's much clearer how it all works, but I'm still unsure how it fits together.

I don't actually need to know the interprocess links -- I have a list of blocks to read and that list needs to be split over the processors. So each processor needs to know its own ID and the total number of processors, but that's all. I can definitely do that with mpi4py; I was unaware that it would work inside the filter, and I didn't know that paraview.vtk.parallel existed.
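
For the record, the split I have in mind is a plain round-robin over the block list. This is only a minimal sketch: my_blocks is a name I made up for illustration, and the rank and process count are assumed to come from mpi4py (or the global controller) when run inside the filter.

```python
def my_blocks(block_ids, rank, n_procs):
    """Return the blocks this process should read (round-robin split)."""
    if n_procs < 1 or not 0 <= rank < n_procs:
        raise ValueError("rank must satisfy 0 <= rank < n_procs")
    # Process r takes blocks r, r + n_procs, r + 2*n_procs, ...
    return list(block_ids)[rank::n_procs]

# Inside the filter the two numbers would come from, e.g.:
#   from mpi4py import MPI
#   rank, n_procs = MPI.COMM_WORLD.Get_rank(), MPI.COMM_WORLD.Get_size()
print(my_blocks(range(8), 1, 3))  # -> [1, 4, 7]
```

With more processes than blocks, the extra processes simply get an empty list and would have nothing to read.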

I'm not actually splitting the structured data; I'm splitting the vtkMultiBlockDataSet. So each processor is responsible for populating a portion of the dataset. For instance, in serial, when the file (say, with 8 blocks) is read, we end up with one vtkMultiBlockDataSet with 8 vtkStructuredData objects inside it. If I have a parallel reader (with 8 processes), I have a hunch I'll end up with 8 vtkMultiBlockDataSets, each with one vtkStructuredData object under it. Is this correct? Will this cause problems for other filters downstream? If, for fun, I wanted to merge them such that each processor still retains only its own block, but they all share a common parent vtkMultiBlockDataSet, is that possible?
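
To make that last question concrete, here is the layout I am picturing, with a plain Python list standing in for the vtkMultiBlockDataSet. It is an illustration only, not VTK calls, and local_view is a made-up name. Every process would hold a parent with all of the slots, but only its own slots filled:

```python
def local_view(n_blocks, rank, n_procs):
    """Stand-in for one process's multiblock dataset: slots owned by this
    rank hold a label, all other slots stay None (the block lives elsewhere)."""
    return ["block %d" % i if i % n_procs == rank else None
            for i in range(n_blocks)]

# Show what each of 4 processes would hold for an 8-block file.
for rank in range(4):
    print(rank, local_view(8, rank, 4))
```

Whether downstream filters are happy with empty slots like this is exactly what I don't know.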

I appreciate your help with this. Maybe these are stupid questions answered somewhere else, but I can't seem to find them!

Tim

----- Original Message -----
From: "David E DeMarle" <dave.demarle at kitware.com>
To: gtg085x at mail.gatech.edu
Cc: "ParaView list" <paraview at paraview.org>
Sent: Thursday, August 11, 2011 9:54:24 AM
Subject: Re: [Paraview] Programmable filter in parallel

ParaView tries to do no aggregation other than rendering onto the same
screen. Each processor is told what portion it is responsible for via
the UPDATE_EXTENT or UPDATE_PIECE_NUMBER/UPDATE_NUMBER_OF_PIECES keys
and is supposed to produce only what it is asked for. (See
http://paraview.org/Wiki/Writing_ParaView_Readers for more of the
story.)
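
As a rough sketch of how a piece request could be turned into work inside a
programmable filter: read the two piece keys off the output information, then
map the piece to a range of blocks. block_range and requested_piece are
hypothetical helper names, the contiguous split is just one possible choice,
and the exact import path for vtk may differ between ParaView versions.

```python
def block_range(piece, n_pieces, n_blocks):
    """Half-open range [start, stop) of block indices for one piece."""
    start = (piece * n_blocks) // n_pieces
    stop = ((piece + 1) * n_blocks) // n_pieces
    return start, stop

def requested_piece(programmable_filter):
    """Read the piece request. Only meaningful inside ParaView, where
    `programmable_filter` is the `self` of the filter script."""
    from paraview import vtk  # lazy import so block_range runs anywhere
    sddp = vtk.vtkStreamingDemandDrivenPipeline
    info = programmable_filter.GetExecutive().GetOutputInformation(0)
    return (info.Get(sddp.UPDATE_PIECE_NUMBER()),
            info.Get(sddp.UPDATE_NUMBER_OF_PIECES()))

# 8 blocks spread over 3 pieces
print([block_range(p, 3, 8) for p in range(3)])  # -> [(0, 2), (2, 5), (5, 8)]
```

Inside the filter script, something like
piece, n_pieces = requested_piece(self) would feed block_range.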

Filters that need cross communication to work properly (beyond what
they can get from ghost cells) do so by accessing the
vtkMultiProcessController that connects all of the nodes in the server
(or sometimes via MPI directly but that isn't recommended).

Try the following for two means of getting hold of the interprocess links.
# 1) Through the controller that connects the ParaView server nodes
import paraview.vtk.parallel
#print(dir(paraview.vtk.parallel))
#print(dir(paraview.vtk.parallel.vtkMultiProcessController))
controller = paraview.vtk.parallel.vtkMultiProcessController.GetGlobalController()
print(controller.GetLocalProcessId())    # this process's ID
print(controller.GetNumberOfProcesses()) # total number of processes

# 2) Through MPI directly (not recommended, see above)
from mpi4py import MPI
#print(dir(MPI))
#print(help(MPI))
print(MPI.COMM_WORLD.Get_rank())
print(MPI.COMM_WORLD.Get_size())

Note also that there is a "feature" in the python programmable filter
that comes into play with structured data. That feature says that
structured data is not split at all by default. If you want structured
data to actually be parallel you need to put this code in your python
programmable filter.

from paraview import util
self.GetExecutive().SetExtentTranslator(
    self.GetExecutive().GetOutputInformation(0),
    vtk.vtkExtentTranslator())
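
To give a feel for what "splitting" a structured extent means, here is a small
pure-Python sketch that cuts a whole extent into slabs along its longest axis.
It is an illustration only and is not claimed to be the algorithm
vtkExtentTranslator uses; split_extent is a made-up name.

```python
def split_extent(whole_extent, piece, n_pieces):
    """Return the sub-extent (x0, x1, y0, y1, z0, z1) for one piece.

    Extents are inclusive point indices, so neighbouring pieces share
    the plane of points on their common boundary.
    """
    ext = list(whole_extent)
    # Number of cells along each axis
    cells = [ext[2 * a + 1] - ext[2 * a] for a in range(3)]
    axis = cells.index(max(cells))  # cut along the longest axis
    n = cells[axis]
    base = ext[2 * axis]
    ext[2 * axis] = base + (piece * n) // n_pieces
    ext[2 * axis + 1] = base + ((piece + 1) * n) // n_pieces
    return tuple(ext)

for p in range(2):
    print(split_extent((0, 10, 0, 4, 0, 4), p, 2))
# -> (0, 5, 0, 4, 0, 4)
# -> (5, 10, 0, 4, 0, 4)
```

If there are more pieces than cells along the chosen axis, some pieces come
back with zero cells (lower and upper index equal) and should be treated as
empty.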

David E DeMarle
Kitware, Inc.
R&D Engineer
28 Corporate Drive
Clifton Park, NY 12065-8662
Phone: 518-371-3971 x109

On Wed, Aug 3, 2011 at 11:09 AM, Tim Gallagher <tim.gallagher at gatech.edu> wrote:
> I guess I sort of answered my own question -- the entire script runs on each processor, so I ended up with 8 copies of my data in memory (or I would have, had I not filled the 12 GB of RAM and 20 GB of swap space and my system crashed).
>
> So is there some way to query the processor information? Probably something in the RequestInformation script -- find out how many processors there are and then the prog. filter determines based on processor ID and number of processors what section of the data to load.
>
> In that case, how does the aggregation of the data work? The exact pipeline is:
>
> DataObjectGenerator("MB{}")
> ProgrammableFilter
>
> In serial, the PF appends blocks to the input and passes that through to the output. In parallel, that same pipeline would create a MB{} on each CPU that gets filled with that CPU's data, but at the end of this step I would want a single MB{} object, not NCPU MB{}'s.
>
> Hopefully that makes sense... I've never used PV in parallel, so I'm not sure how it all works.
>
> Tim
>
> ----- Original Message -----
> From: "Tim Gallagher" <tim.gallagher at gatech.edu>
> To: "ParaView list" <paraview at paraview.org>
> Sent: Wednesday, August 3, 2011 9:24:25 AM
> Subject: [Paraview] Programmable filter in parallel
>
> Hi,
>
> I know many of the built-in readers/filters already work in parallel, but how does one write a parallel programmable filter?
>
> Our data files are XDMF and split into blocks of data. We have a single XDMF file that references all the blocks; reading it generates a vtkMultiBlockDataSet (this works with the built-in XDMF reader).
>
> However, each block has some ghost cells around it that are needed to do the CellDataToPointData interpolation. For large numbers of blocks, this creates far too many grid points for our machines to load. So, I've written a programmable filter that does:
>
> start with empty vtkMultiBlockDataset
> for each block in restart file
>   read block file with XDMFReader
>   CellDataToPointData
>   strip off the extra layers of cells
>   append to output vtkMultiBlockDataset
>
> If I run this in parallel, what exactly is parallel? Is the reading and CD2PD done in parallel on each block? Is none of it parallel? Ideally, I would have the loop over blocks done in parallel, but I don't know how to indicate that in the programmable filter (if it's possible).
>
> Any advice would be great,
>
> Tim
> _______________________________________________
> Powered by www.kitware.com
>
> Visit other Kitware open-source projects at http://www.kitware.com/opensource/opensource.html
>
> Please keep messages on-topic and check the ParaView Wiki at: http://paraview.org/Wiki/ParaView
>
> Follow this link to subscribe/unsubscribe:
> http://www.paraview.org/mailman/listinfo/paraview