[vtkusers] Trying to understand data flow in the VTK pipeline

Thu Mar 4 14:03:47 EST 2010

VTK is not as smart as you are hoping it is. In general, it cannot
figure out without assistance which parts of the input are needed for
a given request within very tight bounds.

VTK does support arbitrary streaming, however, which is the basis for
data parallelism (as exemplified by ParaView). Here the downstream
end tells the rest of the pipeline what fraction of the data it
needs, and only that fraction is processed. So the solution to big
data is to add more cluster nodes (and thus more RAM) and divide the
data more finely until no one node exceeds memory. But note that even
in this case the entire dataset gets processed somewhere.
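
To make the piece idea concrete, here is a toy sketch in plain Python.
It is not the VTK API: ToyReader, piece_range and count_above are
made-up names, and a list stands in for a file on disk. It only shows
the pattern of a downstream consumer asking for piece i of N, so that
at most one piece is resident at a time while every piece still gets
processed eventually.

```python
# Toy model of piece-based streaming; none of these names are VTK API.

def piece_range(n_items, piece, num_pieces):
    """Return the half-open index range [start, stop) owned by one piece."""
    start = (n_items * piece) // num_pieces
    stop = (n_items * (piece + 1)) // num_pieces
    return start, stop


class ToyReader:
    """Stands in for a file reader; tracks the largest chunk it handed out."""

    def __init__(self, values):
        self._values = values      # pretend this lives on disk
        self.max_resident = 0      # largest number of values held at once
        self.total_read = 0        # values read over the whole run

    def read_piece(self, piece, num_pieces):
        start, stop = piece_range(len(self._values), piece, num_pieces)
        chunk = self._values[start:stop]
        self.max_resident = max(self.max_resident, len(chunk))
        self.total_read += len(chunk)
        return chunk


def count_above(reader, threshold, num_pieces):
    """Downstream consumer: requests one piece at a time and reduces it."""
    total = 0
    for piece in range(num_pieces):
        chunk = reader.read_piece(piece, num_pieces)
        total += sum(1 for v in chunk if v > threshold)
    return total


reader = ToyReader(list(range(1000)))
result = count_above(reader, threshold=899, num_pieces=8)
print(result, reader.max_resident, reader.total_read)
```

In this sketch the peak memory shrinks as num_pieces grows, but
total_read always equals the full dataset size, which is the caveat
made above.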

Kitware and LANL have been working on streaming improvements that go
beyond that. Some filters can quickly determine whether a particular
piece is needed at all (for example, a piece outside the view
frustum is not) and avoid processing pieces that do not matter (and
thus avoid the IO for their input). If the number of pieces is high,
you can get pretty tight bounds and avoid lots of IO and processing in
many cases. See TestPriorityStreaming for a simple demonstration, or
the Streaming and Adaptive ParaView experimental applications that
exercise it.
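
As a rough illustration of why more pieces give tighter bounds, here
is another toy sketch in plain Python. Again, this is not the VTK or
ParaView code: the function names are hypothetical, the domain is a
2D unit square, and an axis-aligned box stands in for the view
frustum. Each piece is rejected or kept by looking only at its
bounds, before any of its data would be read.

```python
# Toy model of culling pieces by their bounds; not the VTK API.

def boxes_overlap(a, b):
    """a and b are (xmin, xmax, ymin, ymax) axis-aligned boxes."""
    return a[0] <= b[1] and b[0] <= a[1] and a[2] <= b[3] and b[2] <= a[3]


def make_pieces(n):
    """Split the unit square into an n x n grid of piece bounds."""
    step = 1.0 / n
    return [(i * step, (i + 1) * step, j * step, (j + 1) * step)
            for i in range(n) for j in range(n)]


def pieces_to_process(pieces, view_box):
    """Keep only pieces whose bounds touch the region being viewed."""
    return [p for p in pieces if boxes_overlap(p, view_box)]


view_box = (0.1, 0.2, 0.1, 0.2)   # hypothetical region of interest
for n in (2, 4, 16):
    pieces = make_pieces(n)
    kept = pieces_to_process(pieces, view_box)
    # Equal-sized pieces, so kept/total is the fraction of data touched.
    print(len(pieces), len(kept), len(kept) / len(pieces))
```

With equal-sized pieces, the fraction of the data that has to be read
drops as the grid gets finer, because the kept pieces hug the viewed
region more closely.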

David E DeMarle
Kitware, Inc.
R&D Engineer
28 Corporate Drive
Clifton Park, NY 12065-8662
Phone: 518-371-3971 x109

On Tue, Feb 23, 2010 at 7:42 PM, Ben Medina <ben.medina at gmail.com> wrote:
> Hello all,
>
> I'm trying to understand how data flows in the VTK pipeline, in
> particular with regard to file readers. My concern is that the
> pipeline seems to force the reader to load more data from the file
> than is necessary to create the plot.
>
> Here's my reference pipeline for this discussion:
>
> Reader -> vtkContourFilter -> vtkPolyDataMapper -> vtkActor
>
> Where "Reader" is some type of file reader (vtkDataReader,
> vtkFLUENTReader, etc.).
>
> In my mind, this is how I hoped VTK would work:
> 1. An Update() request is sent upstream from the actor.
> 2. vtkContourFilter knows that it needs a certain amount of
> information from the Reader (e.g. points and a scalar), so it requests
> that information from the Reader.
> 3. The Reader loads just that data from the file. There may be much
> more scalar and vector data in the file, but since no one has
> requested it, it does not get loaded into memory.
> 4. vtkContourFilter completes, and the mapper and actor do their part
> to get the plot rendered.
>
> However, this does not seem to be the case. It seems that there is no
> way to communicate to the Reader that only particular data is required
> (and most readers seem to ignore the first parameter to RequestData
> that I hoped could specify this). And the Readers seem to have no way
> to "publish" what data is available, other than loading data into
> memory in a vtkDataObject.
>
> If this is the case, then what is the solution when VTK is used on
> systems that can't load an entire data file into memory (but could
> load enough of a subset of scalar data to create a visualization)?
>
> Thanks,
> Ben