[Paraview] Parallel Data Redistribution
burlen
burlen.loring at gmail.com
Wed Dec 16 16:23:35 EST 2009
oops typo: The ooc reader is a vtkObject.
burlen wrote:
> Hey John,
>
>> Also: for dynamic load balancing, I'd like to instruct several
>> readers to read the same piece - since the algorithm controls (for
>> example) the particles, it can internally communicate
>> information about what to do amongst its processes, but it can't talk
>> upstream to the readers and fudge them.
>>
>> I am wondering if there is any way of supporting this kind of thing
>> using the current information keys, and my instinct says no.
> I guess you can kind of do this with the current "request update"
> stuff, but thanks to the flexibility of the pipeline information
> key,values you can also roll your own very easily.
>
> I recently implemented dynamic load balancing in a new stream line
> tracer. To get the work load balanced, it's crucial that each process
> has on-demand access to the entire data set. I accomplished this with
> information keys and by using a "meta-reader" in place of the
> traditional ParaView reader. The meta-reader does two things: it
> populates the new keys, and it gives PV a dummy dataset of one cell
> per process whose bounds, shape, and array names match the real
> dataset, which is not read during the meta-reader's execution. When
> the stream tracer executes downstream of the meta-reader, it picks
> the keys out of the pipeline information. The important key,value is
> an out-of-core (ooc) reader, which is a vtkObject so that it can be
> passed through the information. Once the stream tracer has it, it can
> make repeated IO requests as particles move through the dataset as
> needed. My interface accepts a point and returns a chunk of data. The
> ooc reader internally handles caching and memory management. In this
> way you can keep all processes busy all the time when tracing stream
> lines. The approach worked out well and was very simple to implement,
> with no modification to the executive. Also the filter has control of
> caching, and can free all the memory at the end of its execution,
> which significantly reduces the memory footprint compared to the
> traditional PV reader. And I need not worry about PV or some upstream
> filter using MPI communications in between my IO requests. There is a
> little more to our scheduling algorithm which I won't discuss now,
> but so far for making Poincare maps we have scaled well up to 2E7
> stream lines per frame on 96 processes, and we minimize the memory
> footprint, which is important to us.
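>
> For what it's worth, here is a bare-bones sketch of the key plumbing.
> The class and method names below (vtkMetaReader, vtkOOCReader,
> ReadNeighborhood, seedPoint) are made up for illustration, not the
> actual code, but the key mechanism is the standard VTK one:
>
>   // vtkMetaReader.h
>   static vtkInformationObjectBaseKey* OOC_READER();
>
>   // vtkMetaReader.cxx -- define the key (needs vtkInformationObjectBaseKey.h)
>   vtkInformationKeyMacro(vtkMetaReader, OOC_READER, ObjectBase);
>
>   int vtkMetaReader::RequestInformation(vtkInformation*,
>     vtkInformationVector**, vtkInformationVector* outputVector)
>   {
>     vtkInformation* outInfo = outputVector->GetInformationObject(0);
>     // hand the out-of-core reader (a vtkObject) downstream in the information
>     outInfo->Set(vtkMetaReader::OOC_READER(), this->OOCReader);
>     return 1;
>   }
>
>   // in the stream tracer, pull it back out of the input information
>   // (any filters sitting in between have to copy the key along)
>   vtkInformation* inInfo = inputVector[0]->GetInformationObject(0);
>   vtkOOCReader* ooc = vtkOOCReader::SafeDownCast(
>     inInfo->Get(vtkMetaReader::OOC_READER()));
>   // hypothetical interface: give it a point, get the chunk of data around it
>   vtkDataSet* chunk = ooc->ReadNeighborhood(seedPoint);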
>
> Berk and Ken already basically gave you all the options you need but I
> add this because it shows how flexible and powerful the pipeline
> information really is.
>
> Burlen
>
> Biddiscombe, John A. wrote:
>> Berk,
>>
>> We had a discussion back in 2008, which resides here
>> http://www.cmake.org/pipermail/paraview/2008-May/008170.html
>>
>> Continuing from this, my question of the other day, touches on the
>> same problem.
>>
>> I'd like to manipulate the piece number read by each reader. As
>> mentioned before, UPDATE_PIECE is not passed into RequestInformation
>> at first (since nobody knows how many pieces there are yet!), so I
>> can't (directly) generate information in the reader which is 'piece
>> dependent'. And I can't be sure that someone doing streaming won't
>> interfere with piece numbers when using the code differently.
>>
>> For the particle tracer (for example), I'd like to tell the upstream
>> pipeline to read no pieces when certain processes are empty of
>> particles (currently they update and generate{=read} data when they
>> don't need to). I may be able to suppress the forward upstream
>> somehow, but I don't know of an easy way for the algorithm to say
>> "Stop" to the executive, to prevent it updating when the timestep
>> changes but the algorithm has determined that no processing is
>> required (ForwardUpstream of Requests continues unabated). I'd like
>> to set the UpdatePiece to -1 to tell the executive to stop operating.
>>
>> Also: for dynamic load balancing, I'd like to instruct several
>> readers to read the same piece - since the algorithm controls (for
>> example) the particles, it can internally communicate
>> information about what to do amongst its processes, but it can't talk
>> upstream to the readers and fudge them.
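>>
>> The nearest I can get with the existing keys - untested, and the
>> executive may well stomp on it when it propagates the update request -
>> is to force the piece request from the filter's RequestUpdateExtent
>> (vtkMyTracer is just a placeholder name):
>>
>>   int vtkMyTracer::RequestUpdateExtent(vtkInformation*,
>>     vtkInformationVector** inputVector, vtkInformationVector*)
>>   {
>>     vtkInformation* inInfo = inputVector[0]->GetInformationObject(0);
>>     // e.g. every process asks the upstream reader for the same piece
>>     inInfo->Set(
>>       vtkStreamingDemandDrivenPipeline::UPDATE_PIECE_NUMBER(), 0);
>>     inInfo->Set(
>>       vtkStreamingDemandDrivenPipeline::UPDATE_NUMBER_OF_PIECES(), 1);
>>     return 1;
>>   }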
>>
>> I am wondering if there is any way of supporting this kind of thing
>> using the current information keys, and my instinct says no. It seems
>> like the update piece and num pieces were really intended for
>> streaming, and we need two kinds of 'pieces': one for streaming,
>> another for splitting in _parallel_, because they aren't quite the
>> same. (Please note that I haven't actually tried changing piece
>> requests in the algorithms yet, so I'm only guessing that it won't
>> work properly.)
>>
>> <cough>
>> UPDATE_STREAM_PIECE
>> UPDATE_PARALLEL_PIECE </cough>
>>
>> Comments?
>>
>> JB
>>
>>
>>
>>> I would have the reader (most parallel readers do this) generate empty
>>> data on all processes of id >= N. Then your filter can redistribute
>>> from those N processes to all M processes. I am pretty sure
>>> RedistributePolyData can do this for polydata as long as you set the
>>> weight to 1 on all processes. Ditto for D3.
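>>>
>>> Something along these lines in the reader's RequestData (sketch only;
>>> vtkMyReader, NumberOfBlocksOnDisk and ReadBlock are placeholders):
>>>
>>>   int vtkMyReader::RequestData(vtkInformation*,
>>>     vtkInformationVector**, vtkInformationVector* outputVector)
>>>   {
>>>     vtkInformation* outInfo = outputVector->GetInformationObject(0);
>>>     int piece = outInfo->Get(
>>>       vtkStreamingDemandDrivenPipeline::UPDATE_PIECE_NUMBER());
>>>     vtkPolyData* output = vtkPolyData::SafeDownCast(
>>>       outInfo->Get(vtkDataObject::DATA_OBJECT()));
>>>     if (piece >= this->NumberOfBlocksOnDisk)
>>>       {
>>>       // extra processes produce empty output; a downstream
>>>       // RedistributePolyData/D3 can then balance across all of them
>>>       return 1;
>>>       }
>>>     this->ReadBlock(piece, output);
>>>     return 1;
>>>   }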
>>>
>>> -berk
>>>
>>> On Fri, Dec 11, 2009 at 4:13 PM, Biddiscombe, John A.
>>> <biddisco at cscs.ch>
>>> wrote:
>>>
>>>> Berk
>>>>
>>>>
>>>>> It sounds like M is equal to the number of processors (pipelines) and
>>>>> M >> N. Is that correct?
>>>>>
>>>> Yes, That's the idea. N blocks, broken (in place) into M new
>>>> blocks, then fanned out to the M processes downstream where they can
>>>> be processed separately. If it were on a single node, then each block
>>>> could be a separate 'connection' to a downstream filter, but
>>>> distributed, an explicit send is needed.
>>>>
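>>>> i.e. inside the splitter, something like the following with the
>>>> global controller (src/dst stand for whichever ranks own and should
>>>> receive a given block - rough sketch only):
>>>>
>>>>   vtkMultiProcessController* ctrl =
>>>>     vtkMultiProcessController::GetGlobalController();
>>>>   int rank = ctrl->GetLocalProcessId();
>>>>   const int tag = 123456; // arbitrary message tag
>>>>   if (rank == src)
>>>>     {
>>>>     ctrl->Send(block, dst, tag);       // block is a vtkDataObject*
>>>>     }
>>>>   else if (rank == dst)
>>>>     {
>>>>     vtkUnstructuredGrid* incoming = vtkUnstructuredGrid::New();
>>>>     ctrl->Receive(incoming, src, tag); // marshalled over MPI
>>>>     // ... append 'incoming' to this process's output ...
>>>>     incoming->Delete();
>>>>     }
>>>>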
>>>> JB
>>>>
>>>>
>>>>> -berk
>>>>>
>>>>> On Fri, Dec 11, 2009 at 10:40 AM, Biddiscombe, John A.
>>>>> <biddisco at cscs.ch>
>>>>> wrote:
>>>>>
>>>>>> Berk
>>>>>>
>>>>>> The data will be UnstructuredGrid for now. Multiblock, but
>>>>>> actually, I don't really care what each block is, only that I
>>>>>> accept one block on each of N processes, split it into more
>>>>>> pieces, and the next filter accepts one (or more if the numbers
>>>>>> don't match up nicely) blocks and processes them. The
>>>>>> redistribution shouldn't care what data types, only how many
>>>>>> blocks in and out.
>>>>>
>>>>>> Looking at RedistributePolyData makes me realize my initial idea
>>>>>> is no good. In my mind I had a pipeline where multiblock datasets
>>>>>> are passed down the pipeline and simply the number of pieces is
>>>>>> manipulated to achieve what I wanted - but I see now that if I
>>>>>> have M pieces downstream mapped upstream to N pieces, what will
>>>>>> happen is the readers will be effectively duplicated and M/N
>>>>>> readers will read the same pieces. I don't want this to happen as
>>>>>> IO will be a big problem if readers read the same blocks M/N times.
>>>>>>
>>>>>> I was hoping there was a way of simply instructing the pipeline
>>>>>> to manage the pieces, but I see now that this won't work, as there
>>>>>> needs to be a specific Send from each N to their M/N receivers
>>>>>> (because the data is physically in another process, so the
>>>>>> pipeline can't see it). This is very annoying as there must be a
>>>>>> class which already does this (block redistribution, rather than
>>>>>> polygon level redistribution), and I would like it to be more
>>>>>> 'pipeline integrated' so that the user doesn't have to explicitly
>>>>>> send each time an algorithm needs it.
>>>>>>
>>>>>> I'll go through RedistributePolyData in depth and see what I can
>>>>>> pull out of it - please feel free to steer me towards another
>>>>>> possibility :)
>>>>>>
>>>>>> JB
>>>>>>
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Berk Geveci [mailto:berk.geveci at kitware.com]
>>>>>>> Sent: 11 December 2009 16:09
>>>>>>> To: Biddiscombe, John A.
>>>>>>> Cc: paraview at paraview.org
>>>>>>> Subject: Re: [Paraview] Parallel Data Redistribution
>>>>>>>
>>>>>>> What is the data type? vtkRedistributePolyData and its
>>>>>>> subclasses do
>>>>>>> this for polydata. It can do load balancing (where you can
>>>>>>> specify a
>>>>>>> weight for each processor) as well.
>>>>>>>
>>>>>>> -berk
>>>>>>>
>>>>>>> On Fri, Dec 11, 2009 at 9:59 AM, Biddiscombe, John A.
>>>>>>> <biddisco at cscs.ch> wrote:
>>>>>>>
>>>>>>>> I have a filter pipeline which reads N blocks from disk, this
>>>>>>>> works fine on N processors.
>>>>>>>>
>>>>>>>> I now wish to subdivide those N blocks (using a custom filter)
>>>>>>>> to produce new data which will consist of M blocks - where M >> N.
>>>>>>>>
>>>>>>>> I wish to run the algorithm on M processors and have the piece
>>>>>>>> information transformed between the two filters (reader ->
>>>>>>>> splitter), so that blocks are distributed correctly. The reader
>>>>>>>> will Read N blocks (leaving M-N processes unoccupied), but the
>>>>>>>> filter which splits them up needs to output a different number
>>>>>>>> of pieces and have the full M processes receiving data.
>>>>>>>>
>>>>>>>> I have a reasonably good idea of how to implement this, but I'm
>>>>>>>> wondering if any filters already do something similar. I will of
>>>>>>>> course take apart the D3 filter for ideas, but I don't need to do
>>>>>>>> a parallel spatial decomposition since my blocks are already
>>>>>>>> discrete - I just want to redistribute the blocks around and more
>>>>>>>> importantly change the numbers of them between filters.
>>>>>>>>
>>>>>>>> If anyone can suggest examples which do this already, please do
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> JB
>>>>>>>>
>>>>>>>> --
>>>>>>>> John Biddiscombe, email: biddisco @ cscs.ch
>>>>>>>> http://www.cscs.ch/
>>>>>>>> CSCS, Swiss National Supercomputing Centre | Tel: +41 (91) 610.82.07
>>>>>>>> Via Cantonale, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82
>>
>