[Paraview] Parallel Data Redistribution

Moreland, Kenneth kmorel at sandia.gov
Tue Dec 15 15:00:13 EST 2009


In either case, I'm thinking this is not a great idea if the filter John is writing is intended to be used with anything other than his special reader.  Any other reader is going to get the request and do something unpredictable.

Maybe a safer approach is to make a new key like UPDATE_NO_DATA.  The filter could then request some arbitrary piece and number of pieces and also add UPDATE_NO_DATA to its input.  If the reader understands UPDATE_NO_DATA, it will not actually read the data requested.  If it does not, it will simply read a piece that is ignored: no big deal.  That said, is this adding unnecessary complication to the pipeline mechanisms (which are already difficult for reader/filter designers)?
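The graceful degradation described above can be sketched with stand-in classes. This is plain Python modeling the idea, not the real VTK API; the UPDATE_NO_DATA key is the hypothetical one proposed here, and both reader classes are illustrative:

```python
# Stand-in sketch of the proposed UPDATE_NO_DATA behaviour. None of these
# names are real VTK classes or keys; requests are modelled as plain
# dictionaries purely for illustration.

UPDATE_NO_DATA = "UPDATE_NO_DATA"            # the proposed (hypothetical) key
UPDATE_PIECE = "UPDATE_PIECE_NUMBER"
UPDATE_NUM_PIECES = "UPDATE_NUMBER_OF_PIECES"

class AwareReader:
    """A reader that understands UPDATE_NO_DATA and skips the read."""
    def __init__(self):
        self.reads = 0
    def request_data(self, request):
        if request.get(UPDATE_NO_DATA):
            return None                      # honour the flag: no I/O at all
        self.reads += 1
        return "piece %d of %d" % (request[UPDATE_PIECE],
                                   request[UPDATE_NUM_PIECES])

class LegacyReader:
    """A reader that ignores unknown keys: it reads a piece that the
    downstream filter simply throws away -- wasteful but harmless."""
    def __init__(self):
        self.reads = 0
    def request_data(self, request):
        self.reads += 1
        return "piece %d of %d" % (request[UPDATE_PIECE],
                                   request[UPDATE_NUM_PIECES])

# The filter requests an arbitrary piece but flags that it wants no data.
request = {UPDATE_PIECE: 0, UPDATE_NUM_PIECES: 1, UPDATE_NO_DATA: 1}

aware, legacy = AwareReader(), LegacyReader()
aware.request_data(request)    # skipped: no read performed
legacy.request_data(request)   # performed: one ignored read, no failure
```

Either way the pipeline keeps running; the only cost of a reader that does not understand the key is one wasted read.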

-Ken


On 12/15/09 12:48 PM, "Berk Geveci" <berk.geveci at kitware.com> wrote:

In the past we had the pipeline execution stop if the update number of
pieces was 0. We can easily add that back in. It would be a one-line
change in vtkStreamingDemandDrivenPipeline::NeedToExecuteData().

Alternatively, the reader could check for UPDATE_NUMBER_OF_PIECES and
do nothing if it is 0. I think that's better than the requested piece
being -1, which is ambiguous.
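That "0 pieces means produce nothing" convention can be sketched in miniature. Plain Python stand-ins follow, not the actual vtkStreamingDemandDrivenPipeline keys or classes:

```python
# Miniature model of the "0 pieces means do nothing" convention.
# Dictionaries stand in for vtkInformation; nothing here is real VTK API.

UPDATE_PIECE = "UPDATE_PIECE_NUMBER"
UPDATE_NUM_PIECES = "UPDATE_NUMBER_OF_PIECES"

class Reader:
    def __init__(self):
        self.reads = 0
    def request_data(self, request):
        # An update request for 0 pieces unambiguously means "produce
        # nothing"; a piece number of -1 could equally mean "unset".
        if request.get(UPDATE_NUM_PIECES, 1) == 0:
            return []                        # empty output, no I/O
        self.reads += 1
        return ["data for piece %d" % request[UPDATE_PIECE]]

reader = Reader()
empty = reader.request_data({UPDATE_PIECE: 0, UPDATE_NUM_PIECES: 0})
full = reader.request_data({UPDATE_PIECE: 0, UPDATE_NUM_PIECES: 1})
```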

-berk

On Tue, Dec 15, 2009 at 2:41 PM, Moreland, Kenneth <kmorel at sandia.gov> wrote:
> I think the part that John wants that is missing is the ability for a filter
> to say I want no data (or simply, I don't care).  The idea being that during
> the RequestUpdateExtent phase you could send a flag to tell the pipeline to
> not propagate the request upstream.  Instead, simply call RequestData and
> propagate back down.  However, like Berk said, this is a bad idea because
> any upstream objects that do parallel communication will lock up.  I would
> hope that your reader itself does parallel communication.  You're going to
> get contention on your disks otherwise as different processes request the
> same pieces.
>
> I suppose the filter could also simply request piece -1 and the reader could
> take that as a flag to read nothing.  Assuming neither the executive nor
> another filter tries to do anything with the piece number, you should be OK,
> but I don't know whether that will just work.
>
> -Ken
>
>
> On 12/15/09 9:21 AM, "Berk Geveci" <berk.geveci at kitware.com> wrote:
>
> Hmmm. I don't exactly see the difference between the two (I mean
> streaming and distributing). In your case, I would think that a filter
> can arbitrarily change the pipeline request to either request nothing
> or ask for the same piece on two processes. It would look like this:
>
> reader0 - request piece 0 of 1 - filter - request piece 0 of 1 - update suppressor
> reader1 - request piece 0 of 1 - filter - request piece 1 of 1 - update suppressor
>
> This should work fine as long as the reader does not have another
> consumer (for example a representation, if its visibility is on) that
> is asking for a different piece. The following should also work:
>
> reader0 - request piece 0 of 1 - filter - request piece 0 of 1 - update suppressor
> reader1 - request piece 0 of 0 - filter - request piece 1 of 1 - update suppressor
>
> Of course, if you have any filters that do parallel communication in
> between, you are SOL.
>
> Does that make sense?
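The per-rank request rewriting in the diagrams above can be modelled in a few lines. This is an illustrative Python sketch, not VTK code; `rank` and the dictionary keys are invented names:

```python
# Model of a filter that rewrites the piece request it forwards
# upstream, per process rank. Names are illustrative; this is not VTK API.

def forward_upstream(rank, duplicate_reads):
    """The request this rank's filter sends to its reader, regardless of
    what the downstream consumer asked the filter for."""
    if duplicate_reads:
        # First diagram: both readers get the same piece request.
        return {"piece": 0, "num_pieces": 1}
    # Second diagram: only rank 0 reads; other ranks request "0 of 0".
    if rank == 0:
        return {"piece": 0, "num_pieces": 1}
    return {"piece": 0, "num_pieces": 0}

# Both ranks' readers receive "piece 0 of 1":
same = [forward_upstream(r, duplicate_reads=True) for r in (0, 1)]
# Rank 1's reader receives "piece 0 of 0", i.e. nothing:
split = [forward_upstream(r, duplicate_reads=False) for r in (0, 1)]
```

As noted above, this only stays safe while no filter between reader and consumer performs collective parallel communication.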
>
> Now if you have structured data and the frickin' extent translator
> gets in between, things are not as straightforward - which is why I hate
> the extent translator. If I had some time/funding, I would write a new
> executive that does not use structured extents.
>
> -berk
>
> On Tue, Dec 15, 2009 at 11:07 AM, Biddiscombe, John A. <biddisco at cscs.ch>
> wrote:
>> Berk,
>>
>> We had a discussion back in 2008, which resides here
>> http://www.cmake.org/pipermail/paraview/2008-May/008170.html
>>
>> Continuing from this, my question of the other day touches on the same
>> problem.
>>
>> I'd like to manipulate the piece number read by each reader. As mentioned
>> before, UPDATE_PIECE is not passed into RequestInformation at first (since
>> nobody knows how many pieces there are yet!), so I can't (directly) generate
>> information in the reader which is 'piece dependent'. And I can't be sure
>> that someone doing streaming won't interfere with piece numbers when using
>> the code differently.
>>
>> For the particle tracer (for example), I'd like to tell the upstream
>> pipeline to read no pieces when certain processes are empty of particles
>> (currently they update and generate{=read} data when they don't need to). I
>> may be able to suppress the forward upstream somehow, but I don't know of an
>> easy way for the algorithm to say "Stop" to the executive to prevent it
>> updating when the timestep changes but the algorithm has determined that no
>> processing is required (the ForwardUpstream of requests continues unabated).
>> I'd like to set the UpdatePiece to -1 to tell the executive to stop operating.
>>
>> Also: for dynamic load balancing, I'd like to instruct several readers to
>> read the same piece - since the algorithm controls (for example) the
>> particles the algorithm can internally communicate information about what to
>> do amongst its processes, but it can't talk upstream to the readers and
>> fudge them.
>>
>> I am wondering if there is any way of supporting this kind of thing using
>> the current information keys and my instinct says no. It seems like the
>> update piece and numpieces were really intended for streaming and we need two
>> kinds of 'pieces', one for streaming, another for splitting in _parallel_
>> because they aren't quite the same. (Please note that I haven't actually
>> tried changing piece requests in the algorithms yet, so I'm only guessing
>> that it won't work properly)
>>
>> <cough>
>> UPDATE_STREAM_PIECE
>> UPDATE_PARALLEL_PIECE
>> </cough>
>>
>> Comments?
>>
>> JB
>>
>>
>>>
>>> I would have the reader (most parallel readers do this) generate empty
>>> data on all processes of id >= N. Then your filter can redistribute
>>> from those N processes to all M processes. I am pretty sure
>>> RedistributePolyData can do this for polydata as long as you set the
>>> weight to 1 on all processes. Ditto for D3.
>>>
>>> -berk
>>>
>>> On Fri, Dec 11, 2009 at 4:13 PM, Biddiscombe, John A. <biddisco at cscs.ch>
>>> wrote:
>>> > Berk
>>> >
>>> >> It sounds like M is equal to the number of processors (pipelines) and
>>> >> M >> N. Is that correct?
>>> >
>>> > Yes, that's the idea. N blocks, broken (in place) into M new blocks, then
>>> > fanned out to the M processes downstream where they can be processed
>>> > separately. If it were on a single node, then each block could be a
>>> > separate 'connection' to a downstream filter, but distributed, an explicit
>>> > send is needed.
>>> >
>>> > JB
>>> >
>>> >>
>>> >> -berk
>>> >>
>>> >> On Fri, Dec 11, 2009 at 10:40 AM, Biddiscombe, John A.
>>> >> <biddisco at cscs.ch>
>>> >> wrote:
>>> >> > Berk
>>> >> >
>>> >> > The data will be UnstructuredGrid for now. Multiblock, but actually, I
>>> >> > don't really care what each block is, only that I accept one block on
>>> >> > each of N processes, split it into more pieces, and the next filter
>>> >> > accepts one (or more, if the numbers don't match up nicely) blocks and
>>> >> > processes them. The redistribution shouldn't care what the data types
>>> >> > are, only how many blocks go in and out.
>>> >> >
>>> >> > Looking at RedistributePolyData makes me realize my initial idea is no
>>> >> > good. In my mind I had a pipeline where multiblock datasets are passed
>>> >> > down the pipeline and simply the number of pieces is manipulated to
>>> >> > achieve what I wanted - but I see now that if I have M pieces downstream
>>> >> > mapped upstream to N pieces, what will happen is the readers will be
>>> >> > effectively duplicated and M/N readers will read the same pieces. I
>>> >> > don't want this to happen, as IO will be a big problem if readers read
>>> >> > the same blocks M/N times.
>>> >> > I was hoping there was a way of simply instructing the pipeline to
>>> >> > manage the pieces, but I see now that this won't work, as there needs to
>>> >> > be a specific Send from each N to their M/N receivers (because the data
>>> >> > is physically in another process, so the pipeline can't see it). This is
>>> >> > very annoying, as there must be a class which already does this (block
>>> >> > redistribution, rather than polygon-level redistribution), and I would
>>> >> > like it to be more 'pipeline integrated' so that the user doesn't have
>>> >> > to explicitly send each time an algorithm needs it.
>>> >> >
>>> >> > I'll go through RedistributePolyData in depth and see what I can pull
>>> >> > out of it - please feel free to steer me towards another possibility :)
>>> >> >
>>> >> > JB
>>> >> >
>>> >> >
>>> >> >> -----Original Message-----
>>> >> >> From: Berk Geveci [mailto:berk.geveci at kitware.com]
>>> >> >> Sent: 11 December 2009 16:09
>>> >> >> To: Biddiscombe, John A.
>>> >> >> Cc: paraview at paraview.org
>>> >> >> Subject: Re: [Paraview] Parallel Data Redistribution
>>> >> >>
>>> >> >> What is the data type? vtkRedistributePolyData and its subclasses do
>>> >> >> this for polydata. It can do load balancing (where you can specify a
>>> >> >> weight for each processor) as well.
>>> >> >>
>>> >> >> -berk
>>> >> >>
>>> >> >> On Fri, Dec 11, 2009 at 9:59 AM, Biddiscombe, John A.
>>> >> >> <biddisco at cscs.ch> wrote:
>>> >> >> > I have a filter pipeline which reads N blocks from disk; this works
>>> >> >> > fine on N processors.
>>> >> >> >
>>> >> >> > I now wish to subdivide those N blocks (using a custom filter) to
>>> >> >> > produce new data which will consist of M blocks - where M >> N.
>>> >> >> >
>>> >> >> > I wish to run the algorithm on M processors and have the piece
>>> >> >> > information transformed between the two filters (reader -> splitter),
>>> >> >> > so that blocks are distributed correctly. The reader will read N
>>> >> >> > blocks (leaving M-N processes unoccupied), but the filter which
>>> >> >> > splits them up needs to output a different number of pieces and have
>>> >> >> > the full M processes receiving data.
>>> >> >> >
>>> >> >> > I have a reasonably good idea of how to implement this, but I'm
>>> >> >> > wondering if any filters already do something similar. I will of
>>> >> >> > course take apart the D3 filter for ideas, but I don't need to do a
>>> >> >> > parallel spatial decomposition since my blocks are already discrete -
>>> >> >> > I just want to redistribute the blocks around and, more importantly,
>>> >> >> > change the numbers of them between filters.
>>> >> >> >
>>> >> >> > If anyone can suggest examples which do this already, please do
>>> >> >> >
>>> >> >> > Thanks
>>> >> >> >
>>> >> >> > JB
>>> >> >> >
>>> >> >> > --
>>> >> >> > John Biddiscombe,                            email:biddisco @ cscs.ch
>>> >> >> > http://www.cscs.ch/
>>> >> >> > CSCS, Swiss National Supercomputing Centre  | Tel:  +41 (91) 610.82.07
>>> >> >> > Via Cantonale, 6928 Manno, Switzerland      | Fax:  +41 (91) 610.82.82
>>> >> >> >
>>> >> >> >
>>> >> >> > _______________________________________________
>>> >> >> > Powered by www.kitware.com
>>> >> >> >
>>> >> >> > Visit other Kitware open-source projects at
>>> >> >> http://www.kitware.com/opensource/opensource.html
>>> >> >> >
>>> >> >> > Please keep messages on-topic and check the ParaView Wiki at:
>>> >> >> http://paraview.org/Wiki/ParaView
>>> >> >> >
>>> >> >> > Follow this link to subscribe/unsubscribe:
>>> >> >> > http://www.paraview.org/mailman/listinfo/paraview
>>> >> >> >
>>> >> >
>>> >
>>




   ****      Kenneth Moreland
    ***      Sandia National Laboratories
***********
*** *** ***  email: kmorel at sandia.gov
**  ***  **  phone: (505) 844-8919
    ***      web:   http://www.cs.unm.edu/~kmorel


