[Paraview] Re: [vtk-developers] Xdmf Reader/Writer Changes
Jerry Clarke
clarke at arl.army.mil
Wed Feb 13 12:52:50 EST 2008
John,
I'm still going through this to understand the implications. But I think
that some of this needs to be reflected in the XDMF library as opposed
to just being in the reader/writer. For example, if there is a new
GridType="TemporalCollection" it should be in XdmfGrid.[h cxx] so that
it propagates to all other software.
Maybe instead of GridType="TemporalCollection" we have
GridType="Collection" and then CollectionType="Temporal | Spatial".
If you needed both you could have a temporal collection of spatial
collections. This would need support in XdmfGrid but be easy to add.
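To make the nesting concrete, here is a rough sketch of what a temporal collection of spatial collections might serialise to. The helper names (MakeCollection, MakeUniformGrid) and the grid names are hypothetical, and the attribute layout only follows the suggestion above; it is not an agreed schema.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: a leaf grid. The name and the "Uniform" type are
// placeholders for whatever a real writer would emit.
std::string MakeUniformGrid(const std::string& name)
{
  return "<Grid Name=\"" + name + "\" GridType=\"Uniform\"/>\n";
}

// Hypothetical helper: wraps already-serialised child grids in a collection
// grid, using the GridType/CollectionType split proposed above.
std::string MakeCollection(const std::string& name,
                           const std::string& collectionType,
                           const std::vector<std::string>& children)
{
  assert(collectionType == "Temporal" || collectionType == "Spatial");
  std::string xml = "<Grid Name=\"" + name +
                    "\" GridType=\"Collection\" CollectionType=\"" +
                    collectionType + "\">\n";
  for (const std::string& child : children)
  {
    xml += child;
  }
  xml += "</Grid>\n";
  return xml;
}
```

A temporal collection holding one spatial collection per step would then be built as MakeCollection("run", "Temporal", {MakeCollection("step0", "Spatial", {...})}).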
Jerry
John Biddiscombe wrote:
> Over the last week I've made some changes to the vtk-Xdmf Reader/Writer
> classes. I'd like to commit these changes, but they are potentially a
> cause for discussion so I'll post my list of changes in the hope someone
> can give constructive/destructive criticism/objections ...apologies if I
> make silly typos in the following as I'm doing it from memory, the basic
> gist of it should be correct.
>
> vtkXdmfWriter
> -------------------
> 1) Added a SetTimeValue method. When set, the writer adds <Time
> value="..."/> to the generated file (grid). The principal reason for
> this addition is to allow
> Loop
> read stuff
> SetTimeValue(...)
> writer->Write()
> EndLoop
>
> the generated Xdmf file now contains a collection of grids for the data
> you have been writing - see next entry for additional...
>
> 2) Added flag AppendGridsToDomain which should be set before the above
> looping procedure. Instead of creating a new *.xdmf file on each write,
> the Xdmf writer opens the existing one if present, reads the text into a
> string, appends the new grid and then closes it cleanly. This allows
> multiple grids in the same file.
> Also added an extra CollectionType ivar (use "TemporalCollection") and
> writer->CloseCollection() call which lets you write the tail of the file
> after a grid collection has been written.
>
> 3) Added a flag InputsArePieces - this allows me to read N blocks
> (typically 64 from 64 processors) one by one and call AddInput as follows
>
> writer->SetInputsArePieces
> writer->SetAppendGridsToDomain
> writer->SetFullGridSize(x,y,z)
> writer->SetCollectionType("TemporalCollection")
> loop over time steps
> writer->SetHeavyDataSetName(hdf5 file name generated from prefix + time
> step)
> writer->SetTimeValue()
> loop over blocks
> read block
> writer->AddInput()
> end block loop
> writer->Write()
> end loop time
> writer->CloseCollection()
>
> On each write, the blocks are written into the same HDF5 file using the
> dataspace generated from the Extent of the data. I have only implemented
> this for vtkImageData so far. All blocks are written into one hdf5 file
> - and one hdf5 file is created per time step in this scenario.
>
> 4) Due to a problem with discovering Extents and wholeExtents (from a
> single piece of data) I posted to the list last week, I have added
> FullGridSize (Vector 3 int) set get macro so that when pieces are added,
> the whole extent is known and the dataspaces can be computed correctly.
> Without this, a piece with extent 0,11,0,11,0,11 does not know if it is
> part of a 0,64,0,64,0,64 or some other original size and the correct
> dataspace cannot be inferred. (This assumes vtkImageData currently; cf.
> the discussion about extents on the pv list.)
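As a rough illustration of point 4, the mapping from a piece extent to a region of the whole-grid dataset could look like the sketch below. Several things here are assumptions and not taken from the actual writer: the Hyperslab struct and ExtentToHyperslab function are made-up names, the whole grid is assumed to start at index 0, FullGridSize is assumed to count points per axis, and the dataset is assumed to be stored slowest-axis-first (z, y, x).

```cpp
#include <array>
#include <cassert>

// Hypothetical description of where one piece lands in the whole-grid dataset.
struct Hyperslab
{
  std::array<long long, 3> start; // offset per dimension, ordered z, y, x
  std::array<long long, 3> count; // number of points per dimension, z, y, x
};

// extent is (xmin, xmax, ymin, ymax, zmin, zmax) with inclusive bounds;
// fullGridSize is the assumed number of points along x, y, z.
Hyperslab ExtentToHyperslab(const std::array<int, 6>& extent,
                            const std::array<int, 3>& fullGridSize)
{
  Hyperslab slab{};
  for (int axis = 0; axis < 3; ++axis)
  {
    const int lo = extent[2 * axis];
    const int hi = extent[2 * axis + 1];
    // The piece must lie inside the whole grid, which is exactly the
    // information a lone piece cannot supply by itself.
    assert(lo >= 0 && hi >= lo && hi < fullGridSize[axis]);
    const int slot = 2 - axis; // x is the fastest-varying (last) dimension
    slab.start[slot] = lo;
    slab.count[slot] = hi - lo + 1;
  }
  return slab;
}
```

With these assumptions, the piece 0,11,0,11,0,11 inside a 0,64,0,64,0,64 grid (65 points per axis) gives start (0,0,0) and count (12,12,12); the same piece inside a different whole grid yields the same slab but a different dataset shape, which is why the full size has to be supplied separately.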
>
> 5) Bug fix. There was an error in one of the Data copy routines from
> vtkDataArray to Xdmf array which munged variables with vector fields.
> Scalars were ok, but vectors wrong. Fixed it.
>
> The (time/collection/block) additions above for the Writer have allowed
> me to generate one file per time step of contiguous image blocks from a
> simulation dataset which is split into many sub blocks - and overall
> make life much easier for me to process the data on a variety of
> configurations. (by which I mean reading 64 blocks on 8 processors is ok,
> but on 9, is not so good, and passing ghost info from one block to the
> next is painful, but when stored in a single file, getting the overlap
> cells is easy on any number of processors).
>
> The writer changes should also work if we have parallel writes going to
> the same HDF5 file. We can re-use the data space generation so that all
> pieces are correctly written. The code is actually quite small, but is
> messy because each time the file is closed and opened between grid
> writes, the dataspace structures are reset in the constructor of the Xdmf
> objects, etc.
>
>
> vtkXdmfReader
> -------------------
> 1) When the reader encounters an Xdmf file with Grid type =
> "TemporalCollection", it reads the time values from each grid and sets
> the TIME_STEPS and TIME_RANGE during UpdateInformation, internally it
> sets its output type to TemporalDataSet, but due to limitations of the
> way these datasets are handled by the executives, it actually exports a
> true vtkImage/vtkUnstructuredGrid/etc during RequestDataObject. This is
> actually the preferred way of handling it since ImageData exported
> downstream can be connected to image filters, whereas TemporalDataset
> can only be connected to composite inputs. The actual type of dataset
> created depends on the type of data in the xdmf file. Temporal
> collections of MultiGroup data are supported, but so far untested. The
> data type can change between time steps if necessary, but this is also
> untested.
> 1b) The description above does not preclude being able to generate
> multiple time steps simultaneously, if the UPDATE_TIME_STEPS has been
> set to a vector of values, then we arrange for the executive to place a
> TemporalDataset on the output which we can fill with data during
> RequestDataObject. I will need to double check the order of events
> between passes to ensure that this is possible and we may need to add a
> new Information key to tell the executive that multiple steps can be
> generated. Previously we assumed that a temporal dataset could be output,
> but in practice this confuses the pipeline as a dataobject is allowed to
> be either a dataset or a multigroup dataset and the logic in the
> executive gets confused if the dataobject can switch between all 3 of
> dataset/multigroup/temporal. If you read this and do not understand what
> I'm talking about, then you don't need to know and can ignore it, but
> Ken Moreland (for one) knows what I mean. Adding an extra key could be a
> solution to this problem. Discussion another day.
>
> 2) When Request Data occurs, if UPDATE_TIME_STEPS is set and time steps
> exist, the correct data set is returned for the time requested.
> So far only discrete time steps can be stored/retrieved and so it is not
> yet possible to specify a grid that is static for times between a<b and
> then changes to something else between c<d etc etc. But this will be a
> goal in the future.
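For discrete steps, one plausible way to pick the grid for a requested time is to take the stored step whose time value is closest. The sketch below shows only that policy; NearestTimeStep is a hypothetical name, and nothing above confirms that the reader resolves requests this way.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the index of the stored time value closest to
// the requested time. On an exact tie the earlier step wins.
std::size_t NearestTimeStep(const std::vector<double>& timeValues,
                            double requested)
{
  assert(!timeValues.empty());
  std::size_t best = 0;
  for (std::size_t i = 1; i < timeValues.size(); ++i)
  {
    if (std::fabs(timeValues[i] - requested) <
        std::fabs(timeValues[best] - requested))
    {
      best = i;
    }
  }
  return best;
}
```

Requests outside the stored range simply clamp to the first or last step under this policy; supporting a grid that stays valid over an interval would need a different lookup.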
>
> 3) For convenience, SetTimeStep will allow the user to loop over N time
> steps (in raw vtk style code) and output each successive dataset, but if
> UPDATE_TIME_STEPS is set by the executive (via the paraview GUI for
> example), it will override the manually set TimeStep.
>
> 4) Time collection of data in XDMF will all have the same grid name (eg
> zone1 at time 1,2,3,4,5) - this currently breaks the xdmf reader, so I
> renamed the grids with zone1#000, zone1#001; the names go into a std::map
> structure, which would break if we used the same grid name each time. I
> suspect that this map could be modified to a list because in practice I
> don't think the grid name is used anywhere else apart from the GUI
> enable/disable and we could put in another mechanism for checking. So
> currently I broke this but I doubt it matters.
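The name clash in point 4 is easy to reproduce outside the reader. The sketch below is only an illustration: IndexedGridName and RegisterSteps are hypothetical helpers, and the map of name to step index stands in for whatever the reader actually stores per grid.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <string>

// Hypothetical helper: appends a zero-padded step index, e.g. zone1#000.
std::string IndexedGridName(const std::string& name, int index)
{
  assert(index >= 0 && index <= 999);
  char suffix[16];
  std::snprintf(suffix, sizeof(suffix), "#%03d", index);
  return name + suffix;
}

// Registers `steps` grids that all share one base name and returns how many
// distinct entries the map ends up holding.
std::size_t RegisterSteps(const std::string& name, int steps, bool addIndex)
{
  std::map<std::string, int> grids; // stand-in for the reader's name map
  for (int step = 0; step < steps; ++step)
  {
    const std::string key = addIndex ? IndexedGridName(name, step) : name;
    grids[key] = step; // a repeated key overwrites the earlier entry
  }
  return grids.size();
}
```

Registering five steps of zone1 without the suffix leaves a single map entry, while the suffixed names keep all five, which is the behaviour described above.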
>
> There are probably some other changes, that I will remember after I've
> sent this.
>
> Summary
> ------------
> I have no idea how other people are using vtkXdmfReader/Writer so I
> really have no idea how much my changes will affect others. I doubt it
> will break much and I'll be happy to collaborate on integrating changes.
>
> Time : The implementation I've made for writing multiple time steps out,
> and reading them back is working - but I'm not sure it is how the
> implementation was intended to be from the xdmf list postings made by
> Jerry a week or two back. My implementation allows me to basically write
> T grids one after the other with a single <time value ="blah"> set in
> each grid and no <time values = "list of values"> in the parent grid.
>
> I'll hold off committing anything until I receive feedback. My next task
> will be to ensure that when I read data back from N processors, the Xdmf
> Reader does what it is supposed to do. Currently it does not AFAIK.
> (crashes when I try it, but not debugged anything yet).
>
> JB
>