[Paraview] Parallel Data Redistribution

burlen burlen.loring at gmail.com
Wed Dec 16 13:20:49 EST 2009


Hey John,

> Also: for dynamic load balancing, I'd like to instruct several readers to read the same piece - since the algorithm controls (for example) the particles, it can internally communicate information about what to do amongst its processes, but it can't talk upstream to the readers and fudge them.
>
> I am wondering if there is any way of supporting this kind of thing using the current information keys, and my instinct says no.
I guess you can kind of do this with the current "request update" stuff, 
but thanks to the flexibility of the pipeline information keys and values 
you can also roll your own very easily.

I recently implemented dynamic load balancing in a new stream line 
tracer. To keep the work load balanced it's crucial that each process 
have on-demand access to the entire dataset. I accomplished this with 
information keys and by using a "meta-reader" in place of the 
traditional ParaView reader. The meta-reader does two things: it 
populates the new keys, and it gives PV a dummy dataset of one cell per 
process whose bounds, shape, and array names match the real dataset, 
which is not read during the meta-reader's execution. When the stream 
tracer executes downstream of the meta-reader it picks the keys out of 
the pipeline information. The important key,value is an out-of-core 
(ooc) reader. The ooc reader is a vtkDataObject so that it can be 
passed through the information. Once the stream tracer has it, it can 
make repeated IO requests as particles move through the dataset. My 
interface accepts a point and returns a chunk of data. The ooc reader 
internally handles caching and memory management. In this way you can 
keep all processes busy all the time while tracing stream lines.

The approach worked out well and was very simple to implement, with no 
modification to the executive. The filter also has control of caching 
and can free all the memory at the end of its execution, which 
significantly reduces the memory footprint compared to the traditional 
PV reader. And I need not worry whether PV or some upstream filter uses 
MPI communication in between my IO requests. There is a little more to 
our scheduling algorithm which I won't discuss now, but so far, for 
making Poincare maps, we have scaled well up to 2e7 stream lines per 
frame on 96 processes while minimizing the memory footprint, which is 
important to us.

Berk and Ken have already given you basically all the options you need, 
but I add this because it shows how flexible and powerful the pipeline 
information really is.

Burlen

Biddiscombe, John A. wrote:
> Berk,
>
> We had a discussion back in 2008, which resides here http://www.cmake.org/pipermail/paraview/2008-May/008170.html
>
> Continuing from this, my question of the other day, touches on the same problem.
>
> I'd like to manipulate the piece number read by each reader. As mentioned before, UPDATE_PIECE is not passed into RequestInformation at first (since nobody knows how many pieces there are yet!), so I can't (directly) generate information in the reader which is 'piece dependent'. And I can't be sure that someone doing streaming won't interfere with piece numbers when using the code differently.
>
> For the particle tracer (for example), I'd like to tell the upstream pipeline to read no pieces when certain processes are empty of particles (currently they update and generate{=read} data when they don't need to). I may be able to suppress the forward upstream somehow, but I don't know of an easy way for the algorithm to say "stop" to the executive to prevent it updating when the timestep changes but the algorithm has determined that no processing is required (the ForwardUpstream of requests continues unabated). I'd like to set the UpdatePiece to -1 to tell the executive to stop operating.
>
> Also: for dynamic load balancing, I'd like to instruct several readers to read the same piece - since the algorithm controls (for example) the particles, it can internally communicate information about what to do amongst its processes, but it can't talk upstream to the readers and fudge them.
>
> I am wondering if there is any way of supporting this kind of thing using the current information keys, and my instinct says no. It seems like the update piece and numpieces were really intended for streaming, and we need two kinds of 'pieces': one for streaming, another for splitting in _parallel_, because they aren't quite the same. (Please note that I haven't actually tried changing piece requests in the algorithms yet, so I'm only guessing that it won't work properly.)
>
> <cough>
> UPDATE_STREAM_PIECE
> UPDATE_PARALLEL_PIECE 
> </cough>
>
> Comments?
>
> JB
>
>
>   
>> I would have the reader (most parallel readers do this) generate empty
>> data on all processes of id >= N. Then your filter can redistribute
>> from those N processes to all M processes. I am pretty sure
>> RedistributePolyData can do this for polydata as long as you set the
>> weight to 1 on all processes. Ditto for D3.
>>
>> -berk
>>
>> On Fri, Dec 11, 2009 at 4:13 PM, Biddiscombe, John A. <biddisco at cscs.ch>
>> wrote:
>>> Berk
>>>
>>>> It sounds like M is equal to the number of processors (pipelines) and
>>>> M >> N. Is that correct?
>>>
>>> Yes, that's the idea. N blocks, broken (in place) into M new blocks,
>>> then fanned out to the M processes downstream where they can be
>>> processed separately. If it were on a single node, then each block
>>> could be a separate 'connection' to a downstream filter, but
>>> distributed, an explicit send is needed.
>>>
>>> JB
>>>
>>>> -berk
>>>>
>>>> On Fri, Dec 11, 2009 at 10:40 AM, Biddiscombe, John A. <biddisco at cscs.ch>
>>>> wrote:
>>>>> Berk
>>>>>
>>>>> The data will be UnstructuredGrid for now. Multiblock, but actually,
>>>>> I don't really care what each block is, only that I accept one block
>>>>> on each of N processes, split it into more pieces, and the next
>>>>> filter accepts one (or more if the numbers don't match up nicely)
>>>>> blocks and processes them. The redistribution shouldn't care about
>>>>> the data types, only how many blocks go in and out.
>>>>>
>>>>> Looking at RedistributePolyData makes me realize my initial idea is
>>>>> no good. In my mind I had a pipeline where multiblock datasets are
>>>>> passed down the pipeline and simply the number of pieces is
>>>>> manipulated to achieve what I wanted - but I see now that if I have
>>>>> M pieces downstream mapped upstream to N pieces, what will happen is
>>>>> the readers will be effectively duplicated and M/N readers will read
>>>>> the same pieces. I don't want this to happen as IO will be a big
>>>>> problem if readers read the same blocks M/N times.
>>>>>
>>>>> I was hoping there was a way of simply instructing the pipeline to
>>>>> manage the pieces, but I see now that this won't work, as there
>>>>> needs to be a specific Send from each N to their M/N receivers
>>>>> (because the data is physically in another process, so the pipeline
>>>>> can't see it). This is very annoying as there must be a class which
>>>>> already does this (block redistribution, rather than polygon level
>>>>> redistribution), and I would like it to be more 'pipeline
>>>>> integrated' so that the user doesn't have to explicitly send each
>>>>> time an algorithm needs it.
>>>>>
>>>>> I'll go through RedistributePolyData in depth and see what I can
>>>>> pull out of it - please feel free to steer me towards another
>>>>> possibility :)
>>>>>
>>>>> JB
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Berk Geveci [mailto:berk.geveci at kitware.com]
>>>>>> Sent: 11 December 2009 16:09
>>>>>> To: Biddiscombe, John A.
>>>>>> Cc: paraview at paraview.org
>>>>>> Subject: Re: [Paraview] Parallel Data Redistribution
>>>>>>
>>>>>> What is the data type? vtkRedistributePolyData and its subclasses do
>>>>>> this for polydata. It can do load balancing (where you can specify a
>>>>>> weight for each processor) as well.
>>>>>>
>>>>>> -berk
>>>>>>
>>>>>> On Fri, Dec 11, 2009 at 9:59 AM, Biddiscombe, John A. <biddisco at cscs.ch>
>>>>>> wrote:
>>>>>>> I have a filter pipeline which reads N blocks from disk; this works
>>>>>>> fine on N processors.
>>>>>>>
>>>>>>> I now wish to subdivide those N blocks (using a custom filter) to
>>>>>>> produce new data which will consist of M blocks - where M >> N.
>>>>>>>
>>>>>>> I wish to run the algorithm on M processors and have the piece
>>>>>>> information transformed between the two filters (reader ->
>>>>>>> splitter), so that blocks are distributed correctly. The reader
>>>>>>> will read N blocks (leaving M-N processes unoccupied), but the
>>>>>>> filter which splits them up needs to output a different number of
>>>>>>> pieces and have the full M processes receiving data.
>>>>>>>
>>>>>>> I have a reasonably good idea of how to implement this, but I'm
>>>>>>> wondering if any filters already do something similar. I will of
>>>>>>> course take apart the D3 filter for ideas, but I don't need to do a
>>>>>>> parallel spatial decomposition since my blocks are already discrete
>>>>>>> - I just want to redistribute the blocks around and, more
>>>>>>> importantly, change the number of them between filters.
>>>>>>>
>>>>>>> If anyone can suggest examples which do this already, please do.
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> JB
>>>>>>>
>>>>>>> --
>>>>>>> John Biddiscombe,                            email:biddisco @ cscs.ch
>>>>>>> http://www.cscs.ch/
>>>>>>> CSCS, Swiss National Supercomputing Centre  | Tel:  +41 (91) 610.82.07
>>>>>>> Via Cantonale, 6928 Manno, Switzerland      | Fax:  +41 (91) 610.82.82
>>>>>>> _______________________________________________
>>>>>>> Powered by www.kitware.com
>>>>>>>
>>>>>>> Visit other Kitware open-source projects at
>>>>>>> http://www.kitware.com/opensource/opensource.html
>>>>>>>
>>>>>>> Please keep messages on-topic and check the ParaView Wiki at:
>>>>>>> http://paraview.org/Wiki/ParaView
>>>>>>>
>>>>>>> Follow this link to subscribe/unsubscribe:
>>>>>>> http://www.paraview.org/mailman/listinfo/paraview
>>>>>>>


