[Paraview] Parallel Data Redistribution

Berk Geveci berk.geveci at kitware.com
Tue Dec 15 15:33:16 EST 2009


You can do this without making any changes to the executives. Set
UPDATE_NO_DATA in RequestUpdateExtent() and also add it to the request's
KEYS_TO_COPY entry. The executive will then copy it up the pipeline
all the way to the reader. One possible issue: if you want to make a
new pass without UPDATE_NO_DATA, you have to manually modify the reader,
since NeedToExecuteData() would not take UPDATE_NO_DATA into account. I
would like to have a general mechanism in the pipeline that causes it
to re-execute if the user requests a different value than a previous
one in such cases. It would be pretty straightforward but I never had
the time.
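A rough sketch of the key-copying idea, modeled in plain C++ rather than real
VTK code: UPDATE_NO_DATA is the key proposed in this thread (not an existing
VTK key), and the Request/Stage types below are simplified stand-ins for
vtkInformation and the executive's upstream request propagation.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// A request is modeled as a simple key/value map (stand-in for vtkInformation).
using Request = std::map<std::string, int>;

struct Stage {
  std::string name;
  Request request;                   // the request this stage last received
  std::set<std::string> keysToCopy;  // keys this stage forwards upstream
};

// Walk a request from the consumer end of the pipeline toward the reader
// (pipeline[0]), copying only the keys each stage registered in keysToCopy.
// This mimics how an executive would carry a key like UPDATE_NO_DATA from
// the filter that set it all the way up to the reader.
void PropagateUpstream(std::vector<Stage>& pipeline, const Request& req) {
  Request current = req;
  for (auto it = pipeline.rbegin(); it != pipeline.rend(); ++it) {
    it->request = current;
    Request upstream;
    for (const auto& key : it->keysToCopy)
      if (current.count(key)) upstream[key] = current.at(key);
    current = upstream;
  }
}
```

With a two-stage pipeline {reader, filter} where only the filter lists
"UPDATE_NO_DATA" in keysToCopy, propagating a request containing that key
leaves it visible in the reader's received request; a reader that honors the
key would then skip its read, and one that doesn't would simply ignore it.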

-berk

On Tue, Dec 15, 2009 at 3:00 PM, Moreland, Kenneth <kmorel at sandia.gov> wrote:
> In either case, I’m thinking this is not a great idea if the filter John is
> writing is intended to be used with anything other than his special reader.
>  Otherwise, the reader is going to get the request and do something
> unpredictable.
>
> Maybe a safer approach is to make a new key like UPDATE_NO_DATA.  The filter
> could then request some arbitrary piece and number of pieces and also
> add UPDATE_NO_DATA to its input.  If the reader understands UPDATE_NO_DATA,
> it will not actually read the data requested.  If it does not understand it,
> it will simply read a piece that is ignored: no big deal.  That said, is this
> adding unnecessary complications to the pipeline mechanisms (which are
> already difficult for reader/filter designers)?
>
> -Ken
>
>
> On 12/15/09 12:48 PM, "Berk Geveci" <berk.geveci at kitware.com> wrote:
>
> In the past we had the pipeline execution stop if update number of
> pieces was 0. We can easily add that back in. It would be a one line
> change in vtkStreamingDemandDrivenPipeline::NeedToExecuteData().
>
> Alternatively, the reader could check for UPDATE_NUMBER_OF_PIECES and
> do nothing if it is 0. I think that's better than the requested piece
> being -1, which is ambiguous.
>
> -berk
>
> On Tue, Dec 15, 2009 at 2:41 PM, Moreland, Kenneth <kmorel at sandia.gov>
> wrote:
>> I think the part that John wants that is missing is the ability for a
>> filter
>> to say I want no data (or simply, I don’t care).  The idea being that
>> during
>> the RequestUpdateExtent phase you could send a flag to tell the pipeline
>> to
>> not propagate the request upstream.  Instead, simply call RequestData and
>> propagate back down.  However, like Berk said, this is a bad idea because
>> any upstream objects that do parallel communication will lock up.  I would
>> hope that your reader itself does parallel communication.  You’re going to
>> get contention on your disks otherwise as different processes request the
>> same pieces.
>>
>> I suppose the filter could also simply request piece –1 and the reader
>> could
>> take that as a flag to read nothing.  Assuming neither the executive nor
>> another filter tries to do anything with the piece number you should be
>> OK,
>> but I don’t know if that will just work.
>>
>> -Ken
>>
>>
>> On 12/15/09 9:21 AM, "Berk Geveci" <berk.geveci at kitware.com> wrote:
>>
>> Hmmm. I don't exactly see the difference between the two (I mean
>> streaming and distributing). In your case, I would think that a filter
>> can arbitrarily change the pipeline request to either request nothing
>> or ask for the same piece on two processes. It would look like this:
>>
>> reader0 - request piece 0 of 1 - filter - request piece 0 of 1 -
>> update suppressor
>> reader1 - request piece 0 of 1 - filter - request piece 1 of 1 -
>> update suppressor
>>
>> This should work fine as long as the reader does not have another
>> consumer (for example a representation if the visibility is on) that
>> is asking for a different piece. The following should also work
>>
>> reader0 - request piece 0 of 1 - filter - request piece 0 of 1 -
>> update suppressor
>> reader1 - request piece 0 of 0 - filter - request piece 1 of 1 -
>> update suppressor
>>
>> Of course, if you have any filters that do parallel communication in
>> between, you are SOL.
>>
>> Does that make sense?
>>
>> Now if you have structured data and the frickin' extent translator
>> gets in between, things are not as straightforward - which is why I hate
>> the extent translator. If I had some time/funding, I would write a new
>> executive that does not use structured extents.
>>
>> -berk
>>
>> On Tue, Dec 15, 2009 at 11:07 AM, Biddiscombe, John A. <biddisco at cscs.ch>
>> wrote:
>>> Berk,
>>>
>>> We had a discussion back in 2008, which resides here
>>> http://www.cmake.org/pipermail/paraview/2008-May/008170.html
>>>
>>> Continuing from this, my question of the other day, touches on the same
>>> problem.
>>>
>>> I'd like to manipulate the piece number read by each reader. As mentioned
>>> before, UPDATE_PIECE is not passed into RequestInformation at first
>>> (since
>>> nobody knows how many pieces there are yet!), so I can't (directly)
>>> generate
>>> information in the reader which is 'piece dependent'. And I can't be sure
>>> that someone doing streaming won't interfere with piece numbers when
>>> using
>>> the code differently.
>>>
>>> For the particle tracer (for example), I'd like to tell the upstream
>>> pipeline to read no pieces when certain processes are empty of particles
>>> (currently they update and generate{=read} data when they don't need to).
>>> I
>>> may be able to suppress the forward upstream somehow, but I don't know of
>>> an
>>> easy way for the algorithm to say "Stop" to the executive to prevent it
>>> updating if the timestep changes, but the algorithm has determined that
>>> no
>>> processing is required (ForwardUpstream of Requests continues unabated).
>>> I'd
>>> like to set the UpdatePiece to -1 to tell the executive to stop
>>> operating.
>>>
>>> Also : for dynamic load balancing, I'd like to instruct several readers to
>>> read the same piece - since the algorithm controls (for example) the
>>> particles the algorithm can internally communicate information about what
>>> to
>>> do amongst its processes, but it can't talk upstream to the readers and
>>> fudge them.
>>>
>>> I am wondering if there is any way of supporting this kind of thing using
>>> the current information keys and my instinct says no. It seems like the
>>> update piece and numpieces were really intended for streaming and we need
>>> two
>>> kinds of 'pieces', one for streaming, another for splitting in _parallel_
>>> because they aren't quite the same. (Please note that I haven't actually
>>> tried changing piece requests in the algorithms yet, so I'm only guessing
>>> that it won't work properly)
>>>
>>> <cough>
>>> UPDATE_STREAM_PIECE
>>> UPDATE_PARALLEL_PIECE
>>> </cough>
>>>
>>> Comments?
>>>
>>> JB
>>>
>>>
>>>>
>>>> I would have the reader (most parallel readers do this) generate empty
>>>> data on all processes of id >= N. Then your filter can redistribute
>>>> from those N processes to all M processes. I am pretty sure
>>>> RedistributePolyData can do this for polydata as long as you set the
>>>> weight to 1 on all processes. Ditto for D3.
>>>>
>>>> -berk
>>>>
>>>> On Fri, Dec 11, 2009 at 4:13 PM, Biddiscombe, John A. <biddisco at cscs.ch>
>>>> wrote:
>>>> > Berk
>>>> >
>>>> >> It sounds like M is equal to the number of processors (pipelines) and
>>>> >> M >> N. Is that correct?
>>>> >
>>>> > Yes, That's the idea. N blocks, broken (in place) into M new blocks,
>>>> > then
>>>> fanned out to the M processes downstream where they can be processed
>>>> separately . If it were on a single node, then each block could be a
>>>> separate 'connection' to a downstream filter, but distributed, an
>>>> explicit
>>>> send is needed.
>>>> >
>>>> > JB
>>>> >
>>>> >>
>>>> >> -berk
>>>> >>
>>>> >> On Fri, Dec 11, 2009 at 10:40 AM, Biddiscombe, John A.
>>>> >> <biddisco at cscs.ch>
>>>> >> wrote:
>>>> >> > Berk
>>>> >> >
>>>> >> > The data will be UnstructuredGrid for now. Multiblock, but
>>>> >> > actually,
>>>> >> > I
>>>> >> don't really care what each block is, only that I accept one block on
>>>> each
>>>> >> of N processes, split it into more pieces, and the next filter
>>>> >> accepts
>>>> one
>>>> >> (or more if the numbers don't match up nicely) blocks and process
>>>> >> them.
>>>> The
>>>> >> redistribution shouldn't care what data types, only how many blocks
>>>> >> in
>>>> and
>>>> >> out.
>>>> >> >
>>>> >> > Looking at RedistributePolyData makes me realize my initial idea is
>>>> >> > no
>>>> >> good. In my mind I had a pipeline where multiblock datasets are
>>>> >> passed
>>>> down
>>>> >> the pipeline and simply the number of pieces is manipulated to
>>>> >> achieve
>>>> what
>>>> >> I wanted - but I see now that if I have M pieces downstream mapped
>>>> upstream
>>>> >> to N pieces, what will happen is the readers will be effectively
>>>> duplicated
>>>> >> and M/N readers will read the same pieces. I don't want this to
>>>> >> happen
>>>> >> as
>>>> IO
>>>> >> will be a big problem if readers read the same blocks M/N times.
>>>> >> > I was hoping there was a way of simply instructing the pipeline to
>>>> manage
>>>> >> the pieces, but I see now that this won't work, as there needs to be
>>>> >> a
>>>> >> specific Send from each N to their M/N receivers (because the data is
>>>> >> physically in another process, so the pipeline can't see it). This is
>>>> very
>>>> >> annoying as there must be a class which already does this (block
>>>> >> redistribution, rather than polygon level redistribution), and I
>>>> >> would
>>>> like
>>>> >> it to be more 'pipeline integrated' so that the user doesn't have to
>>>> >> explicitly send each time an algorithm needs it.
>>>> >> >
>>>> >> > I'll go through RedistributePolyData in depth and see what I can
>>>> >> > pull
>>>> out
>>>> >> of it - please feel free to steer me towards another possibility :)
>>>> >> >
>>>> >> > JB
>>>> >> >
>>>> >> >
>>>> >> >> -----Original Message-----
>>>> >> >> From: Berk Geveci [mailto:berk.geveci at kitware.com]
>>>> >> >> Sent: 11 December 2009 16:09
>>>> >> >> To: Biddiscombe, John A.
>>>> >> >> Cc: paraview at paraview.org
>>>> >> >> Subject: Re: [Paraview] Parallel Data Redistribution
>>>> >> >>
>>>> >> >> What is the data type? vtkRedistributePolyData and its subclasses
>>>> >> >> do
>>>> >> >> this for polydata. It can do load balancing (where you can specify
>>>> >> >> a
>>>> >> >> weight for each processor) as well.
>>>> >> >>
>>>> >> >> -berk
>>>> >> >>
>>>> >> >> On Fri, Dec 11, 2009 at 9:59 AM, Biddiscombe, John A.
>>>> <biddisco at cscs.ch>
>>>> >> >> wrote:
>>>> >> >> > I have a filter pipeline which reads N blocks from disk, this
>>>> >> >> > works
>>>> >> fine
>>>> >> >> on N processors.
>>>> >> >> >
>>>> >> >> > I now wish to subdivide those N blocks (using a custom filter)
>>>> >> >> > to
>>>> >> produce
>>>> >> >> new data which will consist of M blocks - where M >> N.
>>>> >> >> >
>>>> >> >> > I wish to run the algorithm on M processors and have the piece
>>>> >> information
>>>> >> >> transformed between the two filters (reader -> splitter), so that
>>>> blocks
>>>> >> are
>>>> >> >> distributed correctly. The reader will Read N blocks (leaving M-N
>>>> >> processes
>>>> >> >> unoccupied), but the filter which splits them up needs to output a
>>>> >> different
>>>> >> >> number of pieces and have the full M processes receiving data.
>>>> >> >> >
>>>> >> >> > I have a reasonably good idea of how to implement this, but I'm
>>>> >> wondering
>>>> >> >> if any filters already do something similar. I will of course take
>>>> apart
>>>> >> the
>>>> >> >> D3 filter for ideas, but I don't need to do a parallel spatial
>>>> >> decomposition
>>>> >> >> since my blocks are already discrete - I just want to redistribute
>>>> >> >> the
>>>> >> >> blocks around and more importantly change the numbers of them
>>>> >> >> between
>>>> >> >> filters.
>>>> >> >> >
>>>> >> >> > If anyone can suggest examples which do this already, please do
>>>> >> >> >
>>>> >> >> > Thanks
>>>> >> >> >
>>>> >> >> > JB
>>>> >> >> >
>>>> >> >> > --
>>>> >> >> > John Biddiscombe,                            email:biddisco @
>>>> cscs.ch
>>>> >> >> > http://www.cscs.ch/
>>>> >> >> > CSCS, Swiss National Supercomputing Centre  | Tel:  +41 (91)
>>>> 610.82.07
>>>> >> >> > Via Cantonale, 6928 Manno, Switzerland      | Fax:  +41 (91)
>>>> 610.82.82
>>>> >> >> >
>>>> >> >> >
>>>> >> >> > _______________________________________________
>>>> >> >> > Powered by www.kitware.com
>>>> >> >> >
>>>> >> >> > Visit other Kitware open-source projects at
>>>> >> >> http://www.kitware.com/opensource/opensource.html
>>>> >> >> >
>>>> >> >> > Please keep messages on-topic and check the ParaView Wiki at:
>>>> >> >> http://paraview.org/Wiki/ParaView
>>>> >> >> >
>>>> >> >> > Follow this link to subscribe/unsubscribe:
>>>> >> >> > http://www.paraview.org/mailman/listinfo/paraview
>>>> >> >> >
>>>> >> >
>>>> >
>>>
>>
>>
>>
>>
>>    ****      Kenneth Moreland
>>     ***      Sandia National Laboratories
>> ***********
>> *** *** ***  email: kmorel at sandia.gov
>> **  ***  **  phone: (505) 844-8919
>>     ***      web:   http://www.cs.unm.edu/~kmorel
>>
>>
>
>
>
>
>

