VTK/Composite Data Redesign
Composite dataset re-architecture
Current design
Issues with the current design
- Most functionality is based on vtkMultiGroupDataSet instead of vtkCompositeDataSet. For example, most algorithms (and the executives) use vtkMultiGroupDataSet API to iterate. This makes it impossible to add new sub-classes of vtkCompositeDataSet without writing new executives.
- The concept of sub-block is confusing. vtkMultiGroupDataSet stores a vector of vectors of datasets. When this concept is mapped to the multi-block (or temporal) datasets, each block ends up having multiple sub-blocks. Furthermore, the convention that these sub-block ids map to the process ids is very confusing.
- Algorithms that want to pass blanking have to downcast to vtkHierarchicalBoxDataSet and copy blanking explicitly.
- vtkCompositeDataPipeline is a mess: it grew organically to support many use cases and is now hard to follow.
Suggested design
- Get rid of vtkMultiGroupDataSet. Any code shared between subclasses of vtkCompositeDataSet can be shared using helper implementation objects.
- Improve the iterators so that it is not necessary to use vtkMultiGroupDataSet API to iterate over blocks.
- Add a vtkMultiPieceDataSet class that can be used to group multiple pieces together. Example: when loading a dataset with multiple partitions on 1 processor, vtkMultiPieceDataSet can be used instead of appending datasets together. vtkMultiPieceDataSet would have additional meta-data about things like whole extent for structured datasets.
- Clean up vtkCompositeDataPipeline.
- Improve ghost level support for composite datasets.
Iterators
In the current architecture, the most common thing to do is the following:
    unsigned int numGroups = mbInput->GetNumberOfGroups();
    output->SetNumberOfGroups(numGroups);
    for (unsigned int groupId = 0; groupId < numGroups; groupId++)
    {
      unsigned int numBlocks = mbInput->GetNumberOfDataSets(groupId);
      output->SetNumberOfDataSets(groupId, numBlocks);
      for (unsigned int blockId = 0; blockId < numBlocks; blockId++)
      {
        vtkDataObject* block = mbInput->GetDataSet(groupId, blockId);
        // do something with block to get an outBlock
        output->SetDataSet(groupId, blockId, outBlock);
      }
    }
As mentioned above, the problem with this approach is that it assumes that the composite dataset is a vtkMultiGroupDataSet. With the appropriate changes to the composite data iterators and composite datasets, the code above can be rewritten as
    output->CopyStructure(mbInput);
    vtkCompositeDataIterator* iter = mbInput->NewIterator();
    iter->GoToFirstItem();
    while (!iter->IsDoneWithTraversal())
    {
      // Note that the iterator will only visit the leaf nodes by default.
      vtkDataObject* block = iter->GetCurrentDataObject();
      // do something with block to get outBlock
      // copy the meta-data
      outBlock->CopyInformation(block);
      output->SetDataSet(iter, outBlock);
      iter->GoToNextItem();
    }
    iter->Delete();
The implementation above requires two additional methods: CopyStructure() and SetDataSet(iter, dataObject). The task of CopyStructure() is to create a tree structure on the output composite data object identical to that of the input. In the case of hierarchical datasets, this means the same number of levels and the same number of datasets on all levels. In the case of multi-block datasets, this means an identical tree.
After CopyStructure(), the output will have the same hierarchy as the input, except that all leaf nodes (vtkPolyData, for example) will be replaced by null pointers. CopyStructure() should also copy things like refinement ratios, as well as the meta-data (information) of all non-leaf nodes. We are likely to use things like names for groups when dealing with multi-block datasets.
Note on vtkHierarchicalBoxDataSet: Currently, a vtkHierarchicalBoxDataSet is converted to a vtkMultiGroupDataSet when it is processed by a simple algorithm or a vtkMultiGroupDataAlgorithm. We should think about this. Maybe when a vtkHierarchicalBoxDataSet is processed by a vtkDataSetAlgorithm, the output should be vtkHierarchicalBoxDataSet too?
The task of SetDataSet(iter, dataObject) is to add a leaf dataset at the exact same position that the iterator is pointing at on the input. This will require changing iterators such that they are keeping track of their position in a composite dataset by some sort of index. The easiest way of doing this is to use two integers for hierarchical datasets (level, index) and a vector of integers of length equal to the current tree level for the multi-block datasets.
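The index-based bookkeeping described above can be illustrated with a small standalone sketch. It does not use VTK: `Node`, `Path`, `CollectLeafPaths`, `NodeAt`, `ProcessLeaves` and the example tree are hypothetical names invented for illustration, and the toy `CopyStructure` only mimics the proposed method. The sketch assumes the multi-block case, where a position is a vector of child indices whose length equals the depth of the node.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a composite tree (not the VTK classes): a node is either
// a leaf with an optional payload or a composite holding child nodes.
struct Node {
  bool composite = false;
  std::shared_ptr<std::string> data;            // leaf payload, null if unset
  std::vector<std::unique_ptr<Node>> children;  // used by composite nodes
};

// Position of a node: the child index taken at each tree level.
using Path = std::vector<std::size_t>;

// Copy the tree shape only; every leaf payload in the result stays null.
std::unique_ptr<Node> CopyStructure(const Node& in) {
  auto out = std::make_unique<Node>();
  out->composite = in.composite;
  for (const auto& child : in.children) {
    out->children.push_back(CopyStructure(*child));
  }
  return out;
}

// Collect the index path of every leaf in depth-first order.
void CollectLeafPaths(const Node& n, Path& current, std::vector<Path>& out) {
  if (!n.composite) {
    out.push_back(current);
    return;
  }
  for (std::size_t i = 0; i < n.children.size(); ++i) {
    current.push_back(i);
    CollectLeafPaths(*n.children[i], current, out);
    current.pop_back();
  }
}

// Follow an index path from the root to the node it addresses.
Node& NodeAt(Node& root, const Path& path) {
  Node* n = &root;
  for (std::size_t i : path) {
    assert(i < n->children.size());
    n = n->children[i].get();
  }
  return *n;
}

// Run a per-leaf operation (here: append a prime mark) and store each result
// at the same position in the output, as SetDataSet(iter, dataObject) would.
std::unique_ptr<Node> ProcessLeaves(Node& input) {
  auto output = CopyStructure(input);
  std::vector<Path> paths;
  Path scratch;
  CollectLeafPaths(input, scratch, paths);
  for (const Path& p : paths) {
    const Node& in = NodeAt(input, p);
    if (!in.data) continue;  // an empty input leaf stays empty in the output
    NodeAt(*output, p).data = std::make_shared<std::string>(*in.data + "'");
  }
  return output;
}

std::unique_ptr<Node> MakeLeaf(const std::string& s) {
  auto n = std::make_unique<Node>();
  n->data = std::make_shared<std::string>(s);
  return n;
}

// Example input: root -> { leaf "a", composite -> { leaf "b", leaf "c" } }.
std::unique_ptr<Node> MakeExampleTree() {
  auto root = std::make_unique<Node>();
  root->composite = true;
  root->children.push_back(MakeLeaf("a"));
  auto group = std::make_unique<Node>();
  group->composite = true;
  group->children.push_back(MakeLeaf("b"));
  group->children.push_back(MakeLeaf("c"));
  root->children.push_back(std::move(group));
  return root;
}
```

Because the position is stored as plain indices rather than as a pointer into the input tree, the same path can address the matching node in any tree with the same structure, which is what lets an iterator over the input be used to set data on the output.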
vtkMultiPieceDataSet
A multi-piece dataset groups multiple data pieces together. For example, say that a simulation broke a volume into 16 pieces so that each piece can be processed by one process in parallel. We want to load this volume on a visualization cluster of 4 nodes. Each node will get 4 pieces, which do not necessarily form a whole rectangular piece, so it may not be possible to append the 4 pieces together into a single vtkImageData. Instead, the 4 pieces can be collected together using a vtkMultiPieceDataSet. Although it is possible to use a vtkMultiBlockDataSet for this purpose, a vtkMultiPieceDataSet makes it clear that these are pieces of one whole dataset that are collected together. Given this information, applications like ParaView can treat these pieces in a special way. For example, meta-data about the whole extent of the dataset can be displayed, neighborhood information can be obtained, ghost levels can be generated, and so on.
Note: The use of vtkMultiPieceDataSet is not yet very clear to me but I think it will be necessary.
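The 16-piece example can be made concrete with a small standalone sketch. Nothing here is VTK API: `Extent`, `Piece`, `BoundingExtent`, `CellCount` and `PiecesFormBox` are hypothetical helpers, and the 4x4 layout of 10-cell pieces is an assumed decomposition. The sketch also assumes that pieces share only boundary points and never overlap in cells, which is what makes the cell-count comparison valid.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

// Structured extent {xmin, xmax, ymin, ymax, zmin, zmax}, inclusive point
// indices.
using Extent = std::array<int, 6>;

// Assumed decomposition: a 4x4 grid of pieces in x-y, each 10x10x10 cells.
Extent Piece(int i, int j) {
  return Extent{{10 * i, 10 * (i + 1), 10 * j, 10 * (j + 1), 0, 10}};
}

// Smallest extent enclosing all pieces (candidate whole-extent meta-data for
// the group of pieces held by one node).
Extent BoundingExtent(const std::vector<Extent>& pieces) {
  assert(!pieces.empty());
  Extent whole = pieces.front();
  for (const Extent& p : pieces) {
    for (int axis = 0; axis < 3; ++axis) {
      whole[2 * axis] = std::min(whole[2 * axis], p[2 * axis]);
      whole[2 * axis + 1] = std::max(whole[2 * axis + 1], p[2 * axis + 1]);
    }
  }
  return whole;
}

// Number of cells covered by an extent.
long long CellCount(const Extent& e) {
  long long n = 1;
  for (int axis = 0; axis < 3; ++axis) {
    n *= (e[2 * axis + 1] - e[2 * axis]);
  }
  return n;
}

// True if the non-overlapping pieces exactly fill their bounding box, i.e.
// they could be appended into one structured dataset.
bool PiecesFormBox(const std::vector<Extent>& pieces) {
  long long total = 0;
  for (const Extent& p : pieces) {
    total += CellCount(p);
  }
  return total == CellCount(BoundingExtent(pieces));
}
```

A node holding `Piece(0,0)`, `Piece(1,0)`, `Piece(0,1)` and `Piece(1,1)` owns a full box and could append them, whereas a node holding the four diagonal pieces covers only a quarter of its bounding box. The second case is where grouping the pieces, rather than appending them, is the only option.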
vtkCompositeDataPipeline cleanup
There will be a list of changes to vtkCompositeDataPipeline here. The executive is a mess right now due to all the use cases it supports and because it grew organically. We need to take a step back and clean it up, possibly rewriting portions of it.
Ghost level support
Currently, ghost level requests are passed up the pipeline but they are pretty much ignored by the pipeline. This will not do, especially when we improve D3 to support multi-block datasets. Getting unstructured and dataset algorithms to work with ghost levels is pretty straightforward. Getting structured data filters working is a little trickier.
Note: Realistically, readers do not produce more than 1 ghost level. We may want to take this into account.
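For the structured case, one common way to express a ghost level request is to grow the piece's extent by the requested number of layers and clamp the result to the whole extent. The sketch below shows only that arithmetic; `Extent` and `GrowExtent` are hypothetical names, and this is not a description of how the pipeline currently handles (or will handle) such requests.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Structured extent {xmin, xmax, ymin, ymax, zmin, zmax}, inclusive point
// indices.
using Extent = std::array<int, 6>;

// Grow a piece by ghostLevels layers on every side, without leaving the
// whole extent of the dataset.
Extent GrowExtent(const Extent& piece, const Extent& whole, int ghostLevels) {
  assert(ghostLevels >= 0);
  Extent grown = piece;
  for (int axis = 0; axis < 3; ++axis) {
    grown[2 * axis] = std::max(whole[2 * axis], piece[2 * axis] - ghostLevels);
    grown[2 * axis + 1] =
        std::min(whole[2 * axis + 1], piece[2 * axis + 1] + ghostLevels);
  }
  return grown;
}
```

With a whole extent of {0, 40, 0, 40, 0, 10} and a piece of {10, 20, 0, 10, 0, 10}, one ghost level grows the piece only on the sides that have neighbors; sides already on the boundary of the whole extent are left unchanged.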
Implementation
The implementation is based on the above design with some notable differences:
- vtkHierarchicalDataSet is deprecated. Due to the lack of use cases for creating an AMR-like hierarchy with unstructured data, this class was deprecated. Applications can implement the same behavior using vtkMultiBlockDataSet. vtkMultiBlockDataSet provides for meta-data associated with each node in the tree, thus making it possible for applications to attach level information to blocks.
[Figure: Class hierarchy for the current implementation of composite datasets]
vtkCompositeDataSet
vtkCompositeDataSet is the abstract superclass for all composite datasets. It implements a full tree structure in which nodes can be datasets or other composite datasets. However, the API to access the tree directly is protected. Each subclass can build and maintain this tree as per its requirements, e.g. vtkHierarchicalBoxDataSet builds 1-level-deep trees in which the first-level nodes are vtkMultiPieceDataSet instances, each corresponding to a level in the hierarchical dataset. One can obtain a vtkCompositeDataIterator instance from the vtkCompositeDataSet to iterate over the tree structure. vtkCompositeDataSet provides a public API to get/set data objects and meta-data using the iterator. The important API is listed below:
    // Description:
    // Return a new iterator (the iterator has to be deleted by user).
    virtual vtkCompositeDataIterator* NewIterator();

    // Description:
    // Copies the tree structure from the input. All pointers to non-composite
    // data objects are initialized to NULL. This also shallow copies the meta data
    // associated with all the nodes.
    virtual void CopyStructure(vtkCompositeDataSet* input);

    // Description:
    // Sets the data set at the location pointed to by the iterator.
    // The iterator does not need to be iterating over this dataset itself. It can
    // be any composite dataset with similar structure (achieved by using
    // CopyStructure).
    virtual void SetDataSet(vtkCompositeDataIterator* iter, vtkDataObject* dataObj);

    // Description:
    // Returns the dataset located at the position pointed to by the iterator.
    // The iterator does not need to be iterating over this dataset itself. It can
    // be an iterator for composite dataset with similar structure (achieved by
    // using CopyStructure).
    virtual vtkDataObject* GetDataSet(vtkCompositeDataIterator* iter);

    // Description:
    // Returns the meta-data associated with the position pointed to by the iterator.
    // This will create a new vtkInformation object if none already exists. Use
    // HasMetaData to avoid creating the vtkInformation object unnecessarily.
    // The iterator does not need to be iterating over this dataset itself. It can
    // be an iterator for composite dataset with similar structure (achieved by
    // using CopyStructure).
    virtual vtkInformation* GetMetaData(vtkCompositeDataIterator* iter);

    // Description:
    // Returns whether any meta-data is associated with the position pointed to
    // by the iterator.
    // The iterator does not need to be iterating over this dataset itself. It can
    // be an iterator for composite dataset with similar structure (achieved by
    // using CopyStructure).
    virtual int HasMetaData(vtkCompositeDataIterator* iter);

    // Description:
    // Shallow and Deep copy.
    virtual void ShallowCopy(vtkDataObject *src);
    virtual void DeepCopy(vtkDataObject *src);
vtkTemporalDataSet
vtkTemporalDataSet is used to hold multiple timesteps.
    // Description:
    // Set the number of time steps in this dataset.
    void SetNumberOfTimeSteps(unsigned int numLevels);

    // Description:
    // Returns the number of time steps.
    unsigned int GetNumberOfTimeSteps();

    // Description:
    // Set a data object as a timestep. Cannot be vtkTemporalDataSet.
    void SetTimeStep(unsigned int timestep, vtkDataObject* dobj);

    // Description:
    // Get a timestep.
    vtkDataObject* GetTimeStep(unsigned int timestep);

    // Description:
    // Get timestep meta-data.
    vtkInformation* GetMetaData(unsigned int timestep);

    // Description:
    // Returns whether timestep meta-data is present.
    int HasMetaData(unsigned int timestep);
vtkMultiBlockDataSet
vtkMultiBlockDataSet is a vtkCompositeDataSet in which the child nodes can either be vtkDataSet subclasses or vtkMultiBlockDataSet. This is used when full trees are required. Meta-data can be associated with leaf nodes as well as non-leaf nodes in the tree.
    // Description:
    // Set the number of blocks. This will cause allocation if the new number of
    // blocks is greater than the current size. All new blocks are initialized to
    // null.
    void SetNumberOfBlocks(unsigned int numBlocks);

    // Description:
    // Returns the number of blocks.
    unsigned int GetNumberOfBlocks();

    // Description:
    // Returns the block at the given index. It is recommended that one uses the
    // iterators to iterate over composite datasets rather than using this API.
    vtkDataObject* GetBlock(unsigned int blockno);

    // Description:
    // Sets the data object as the given block. The total number of blocks will
    // be resized to fit the requested block no. The only vtkCompositeDataSet subclass
    // that can be added as a block is a vtkMultiBlockDataSet,
    // an error is raised otherwise.
    void SetBlock(unsigned int blockno, vtkDataObject* block);

    // Description:
    // Returns true if meta-data is available for a given block.
    int HasMetaData(unsigned int blockno);

    // Description:
    // Returns the meta-data for the block. If none is already present, a new
    // vtkInformation object will be allocated. Use HasMetaData to avoid
    // allocating vtkInformation objects.
    vtkInformation* GetMetaData(unsigned int blockno);
vtkHierarchicalBoxDataSet
vtkHierarchicalBoxDataSet is a hierarchical dataset of uniform grids. It is designed for AMR (adaptive mesh refinement) datasets. The structure consists of levels, with each level containing datasets. The dataset type is restricted to vtkUniformGrid. Each dataset has an associated vtkAMRBox that represents its region (similar to extent) in space. Internally, each level in a vtkHierarchicalBoxDataSet is nothing but a vtkMultiPieceDataSet.
    // Description:
    // Set the number of refinement levels. This call might cause
    // allocation if the new number of levels is larger than the
    // current one.
    void SetNumberOfLevels(unsigned int numLevels);

    // Description:
    // Returns the number of levels.
    unsigned int GetNumberOfLevels();

    // Description:
    // Set the number of data sets at a given level.
    void SetNumberOfDataSets(unsigned int level, unsigned int numdatasets);

    // Description:
    // Returns the number of data sets available at any level.
    unsigned int GetNumberOfDataSets(unsigned int level);

    // Description:
    // Set the dataset pointer for a given node. This will resize the number of
    // levels and the number of datasets in the level to fit level, id requested.
    void SetDataSet(unsigned int level, unsigned int id, vtkAMRBox& box, vtkUniformGrid* dataSet);

    // Description:
    // Get a dataset given a level and an id.
    vtkUniformGrid* GetDataSet(unsigned int level, unsigned int id, vtkAMRBox& box);

    // Description:
    // Get meta-data associated with a level. This may allocate a new
    // vtkInformation object if none is already present. Use HasLevelMetaData to
    // avoid unnecessary allocations.
    vtkInformation* GetLevelMetaData(unsigned int level);

    // Description:
    // Returns whether meta-data exists for a given level.
    int HasLevelMetaData(unsigned int level);

    // Description:
    // Get meta-data associated with a dataset. This may allocate a new
    // vtkInformation object if none is already present. Use HasMetaData to
    // avoid unnecessary allocations.
    vtkInformation* GetMetaData(unsigned int level, unsigned int index);

    // Description:
    // Returns whether meta-data exists for a given dataset under a given level.
    int HasMetaData(unsigned int level, unsigned int index);

    // Description:
    // Sets the refinement of a given level. The spacing at level
    // level+1 is defined as spacing(level+1) = spacing(level)/refRatio(level).
    // Note that currently, this is not enforced by this class however
    // some algorithms might not function properly if the spacing in
    // the blocks (vtkUniformGrid) does not match the one described
    // by the refinement ratio.
    void SetRefinementRatio(unsigned int level, int refRatio);

    // Description:
    // Returns the refinement of a given level.
    int GetRefinementRatio(unsigned int level);

    // Description:
    // Returns the AMR box for the location pointed to by the iterator.
    vtkAMRBox GetAMRBox(vtkCompositeDataIterator* iter);

    // Description:
    // Returns the refinement ratio for the position pointed to by the iterator.
    int GetRefinementRatio(vtkCompositeDataIterator* iter);
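The spacing rule quoted in the SetRefinementRatio() comment, spacing(level+1) = spacing(level)/refRatio(level), can be worked through with a short standalone sketch. `LevelSpacings` is a hypothetical helper, not part of vtkHierarchicalBoxDataSet, and it assumes one ratio per level with a single isotropic spacing value.

```cpp
#include <cassert>
#include <vector>

// Given the spacing of level 0 and the refinement ratio of each level,
// return the spacing of every level, including the one below the last ratio.
// refRatios[i] is the ratio of level i, so the result has
// refRatios.size() + 1 entries.
std::vector<double> LevelSpacings(double level0Spacing,
                                  const std::vector<int>& refRatios) {
  std::vector<double> spacings;
  spacings.push_back(level0Spacing);
  for (int ratio : refRatios) {
    assert(ratio > 0);
    // spacing(level + 1) = spacing(level) / refRatio(level)
    spacings.push_back(spacings.back() / ratio);
  }
  return spacings;
}
```

For a level 0 spacing of 1.0 and ratios 2 and 4, the spacings are 1.0, 0.5 and 0.125. Since the class does not enforce the rule, a check like this is one way an application could verify that the spacing stored in its vtkUniformGrid blocks matches the declared ratios.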
vtkCompositeDataIterator
vtkCompositeDataIterator is used to iterate over composite datasets.
    // Description:
    // Set the composite dataset this iterator is iterating over.
    // Must be set before traversal begins.
    virtual void SetDataSet(vtkCompositeDataSet* ds);
    vtkGetObjectMacro(DataSet, vtkCompositeDataSet);

    // Description:
    // Begin iterating over the composite dataset structure.
    virtual void InitTraversal();

    // Description:
    // Begin iterating over the composite dataset structure in reverse order.
    virtual void InitReverseTraversal();

    // Description:
    // Move the iterator to the beginning of the collection.
    virtual void GoToFirstItem();

    // Description:
    // Move the iterator to the next item in the collection.
    virtual void GoToNextItem();

    // Description:
    // Test whether the traversal is finished. Returns 1 once the iterator has
    // moved past the last item, and 0 while it still points to a valid item.
    virtual int IsDoneWithTraversal();

    // Description:
    // Returns the current item. Valid only when IsDoneWithTraversal() returns 0.
    virtual vtkDataObject* GetCurrentDataObject();

    // Description:
    // Returns the meta-data associated with the current item. This will allocate
    // a new vtkInformation object if none is already present. Use
    // HasCurrentMetaData to avoid unnecessary creation of vtkInformation objects.
    virtual vtkInformation* GetCurrentMetaData();

    // Description:
    // Returns whether a meta-data information object is present for the current
    // item. Returns 1 if present, 0 otherwise.
    virtual int HasCurrentMetaData();

    // Description:
    // If VisitOnlyLeaves is true, the iterator will only visit nodes
    // (sub-datasets) that are not composite. If it encounters a composite
    // data set, it will automatically traverse that composite dataset until
    // it finds non-composite datasets (see also TraverseSubTree).
    // With this option, it is possible to
    // visit all non-composite datasets in tree of composite datasets
    // (composite of composite of composite for example :-) ) If
    // VisitOnlyLeaves is false, GetCurrentDataObject() may return
    // vtkCompositeDataSet. By default, VisitOnlyLeaves is 1.
    vtkSetMacro(VisitOnlyLeaves, int);
    vtkGetMacro(VisitOnlyLeaves, int);
    vtkBooleanMacro(VisitOnlyLeaves, int);

    // Description:
    // If TraverseSubTree is set to true, the iterator will visit the entire tree
    // structure, otherwise it only visits the first level children. Set to 1 by
    // default.
    vtkSetMacro(TraverseSubTree, int);
    vtkGetMacro(TraverseSubTree, int);
    vtkBooleanMacro(TraverseSubTree, int);
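How the VisitOnlyLeaves and TraverseSubTree flags combine can be sketched with a toy traversal. This is not the vtkCompositeDataIterator implementation: `TreeNode`, `Traverse` and `MakeSampleTree` are hypothetical, and the sketch assumes a pre-order walk that never reports the root itself and that skips composite children entirely when VisitOnlyLeaves is on and TraverseSubTree is off.

```cpp
#include <memory>
#include <string>
#include <vector>

// Toy tree node (not a VTK class): composite nodes hold children, leaves
// stand in for non-composite datasets.
struct TreeNode {
  std::string name;
  bool composite = false;
  std::vector<std::shared_ptr<TreeNode>> children;
};

// Append the names of the nodes that would be visited below `parent`.
void Visit(const TreeNode& parent, bool visitOnlyLeaves, bool traverseSubTree,
           std::vector<std::string>& visited) {
  for (const auto& child : parent.children) {
    // Leaves are always reported; composites only when leaves-only is off.
    if (!child->composite || !visitOnlyLeaves) {
      visited.push_back(child->name);
    }
    // Descend below the first level only when sub-tree traversal is on.
    if (child->composite && traverseSubTree) {
      Visit(*child, visitOnlyLeaves, traverseSubTree, visited);
    }
  }
}

std::vector<std::string> Traverse(const TreeNode& root, bool visitOnlyLeaves,
                                  bool traverseSubTree) {
  std::vector<std::string> visited;
  Visit(root, visitOnlyLeaves, traverseSubTree, visited);
  return visited;
}

std::shared_ptr<TreeNode> MakeNode(const std::string& name, bool composite) {
  auto n = std::make_shared<TreeNode>();
  n->name = name;
  n->composite = composite;
  return n;
}

// Sample tree: root -> { leaf A, composite G -> { leaf B, leaf C } }.
std::shared_ptr<TreeNode> MakeSampleTree() {
  auto root = MakeNode("root", true);
  auto group = MakeNode("G", true);
  group->children.push_back(MakeNode("B", false));
  group->children.push_back(MakeNode("C", false));
  root->children.push_back(MakeNode("A", false));
  root->children.push_back(group);
  return root;
}
```

On the sample tree, the default setting (both flags on) yields A, B, C. Turning VisitOnlyLeaves off yields A, G, B, C. Turning only TraverseSubTree off yields A, and turning both off yields A, G.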
Changes from VTK 5.0
vtkCompositeDataPipeline
This executive is used to iteratively execute a non-composite-data-aware filter over all the leaves in a composite dataset. In VTK 5.0, a vtkHierarchicalBoxDataSet was always converted to a vtkMultiBlockDataSet when a non-composite-aware filter was present in the pipeline. This is no longer the case. vtkCompositeDataPipeline now checks whether the non-composite-aware algorithm can produce a vtkUniformGrid given a vtkUniformGrid as input. If so, for a vtkHierarchicalBoxDataSet input the output is a vtkHierarchicalBoxDataSet; otherwise it is a vtkMultiBlockDataSet. Even when the vtkHierarchicalBoxDataSet is converted to a vtkMultiBlockDataSet, the composite data tree structure is preserved. In other words, since vtkHierarchicalBoxDataSet has vtkMultiPieceDataSet instances for each level, the converted vtkMultiBlockDataSet will also have vtkMultiPieceDataSet instances as the child blocks of the root node.
Class Names
A few class names have changed, and a few others are no longer available. This table lists each old class name and an equivalent class in the new design.
Old Class | Equivalent Class |
---|---|
vtkHierarchicalDataInformation | * |
vtkHierarchicalDataIterator | vtkCompositeDataIterator |
vtkHierarchicalDataSet | * |
vtkHierarchicalDataSetAlgorithm | * |
vtkMultiGroupDataInformation | * |
vtkMultiGroupDataIterator | vtkCompositeDataIterator |
vtkMultiGroupDataSet | vtkCompositeDataSet |
vtkMultiGroupDataSetAlgorithm | vtkCompositeDataSetAlgorithm |
vtkHierarchicalDataGroupFilter | vtkMultiBlockDataGroupFilter |
vtkMultiGroupDataExtractDataSets | vtkExtractDataSets |
vtkMultiGroupDataExtractGroup | vtkExtractBlock, vtkExtractLevel |
vtkMultiGroupDataGeometryFilter | vtkCompositeDataGeometryFilter |
vtkMultiGroupDataGroupFilter | vtkMultiBlockDataGroupFilter |
vtkMultiGroupDataGroupIdScalars | vtkBlockIdScalars, vtkLevelIdScalars |
vtkMultiGroupProbeFilter | vtkCompositeDataProbeFilter |
vtkXMLHierarchicalDataReader | * |
vtkXMLMultiGroupDataReader | vtkXMLCompositeDataReader, vtkXMLHierarchicalBoxDataReader, vtkXMLMultiBlockDataReader |
vtkXMLMultiGroupDataWriter | vtkXMLCompositeDataWriter, vtkXMLHierarchicalBoxDataWriter, vtkXMLMultiBlockDataWriter |
vtkMultiGroupDataExtractPiece | vtkExtractPiece |
vtkXMLPMultiGroupDataWriter | vtkXMLPMultiBlockDataWriter, vtkXMLPHierarchicalBoxDataWriter |
vtkMultiGroupPolyDataMapper | vtkCompositePolyDataMapper |