[Paraview] Xdmf Reader/Writer Changes
John Biddiscombe
biddisco at cscs.ch
Wed Feb 13 09:25:38 EST 2008
Over the last week I've made some changes to the vtk-Xdmf Reader/Writer
classes. I'd like to commit these changes, but they are potentially a
cause for discussion, so I'll post my list of changes in the hope that
someone can give constructive/destructive criticism/objections. Apologies
if I make silly typos in the following, as I'm doing it from memory; the
basic gist of it should be correct.
vtkXdmfWriter
-------------------
1) Added a SetTimeValue method. When set, the writer adds <Time
value="..."/> to the generated file (grid). The principal reason for
this addition is to allow
  Loop
    read stuff
    writer->SetTimeValue(...)
    writer->Write()
  EndLoop
The generated Xdmf file now contains a collection of grids for the data
you have been writing; see the next entry for additional details.
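As a rough illustration of item 1, the sketch below builds the kind of grid element described above using Python's standard xml.etree.ElementTree. Only the <Time value="..."/> child comes from the description; the helper name make_grid, the grid name "zone1" and the Name attribute are assumptions made for the example, not the writer's actual output.

```python
import xml.etree.ElementTree as ET

def make_grid(name, time_value):
    """Build a <Grid> element carrying a single <Time value="..."/> child.

    The element layout is a hypothetical stand-in for what the writer emits.
    """
    grid = ET.Element("Grid", {"Name": name})
    ET.SubElement(grid, "Time", {"value": repr(float(time_value))})
    return grid

# Serialise one grid for time 0.5 and show the resulting XML text.
grid_xml = ET.tostring(make_grid("zone1", 0.5), encoding="unicode")
print(grid_xml)
```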
2) Added a flag AppendGridsToDomain, which should be set before the above
looping procedure. This stops the Xdmf writer from creating a new *.xdmf
file on each write; instead it opens the existing one if present, reads
the text into a string, appends the new grid and then closes it cleanly,
thus allowing multiple grids in the same file.
Also added an extra CollectionType ivar (use "TemporalCollection") and a
writer->CloseCollection() call, which lets you write the tail of the file
after a grid collection has been written.
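The append-then-close idea in item 2 can be sketched in plain Python as below. This is not the writer's code: the HEADER and TAIL strings, the attribute names and the helper names append_grid and close_collection are all assumptions, and the sketch assumes the closing tags are only written by the final close call.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical file skeleton; the real writer's header and tail may differ.
HEADER = '<Xdmf><Domain><Grid CollectionType="TemporalCollection">\n'
TAIL = "</Grid></Domain></Xdmf>\n"

def append_grid(path, grid_xml):
    """Append one grid, starting from HEADER only if no file exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            text = f.read()  # existing content, read into a string
    else:
        text = HEADER
    with open(path, "w") as f:
        f.write(text + grid_xml + "\n")

def close_collection(path):
    """Write the closing tags once all grids have been appended."""
    with open(path, "a") as f:
        f.write(TAIL)

path = os.path.join(tempfile.mkdtemp(), "series.xdmf")
for step in range(3):
    append_grid(path, '<Grid Name="zone1"><Time value="%d"/></Grid>' % step)
close_collection(path)

# After closing, the file parses as XML and holds one grid per write.
root = ET.parse(path).getroot()
times = [t.get("value") for t in root.iter("Time")]
print(times)
```

Under this assumption the file is only well-formed XML after close_collection has run, which is one plausible reason for having a separate close call.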
3) Added a flag InputsArePieces. This allows me to read N blocks
(typically 64 from 64 processors) one by one and call AddInput as follows:
  writer->SetInputsArePieces
  writer->SetAppendGridsToDomain
  writer->SetFullGridSize(x,y,z)
  writer->SetCollectionType("TemporalCollection")
  loop over time steps
    writer->SetHeavyDataSetName(hdf5 file name generated from prefix +
                                time step)
    writer->SetTimeValue()
    loop over blocks
      read block
      writer->AddInput()
    end block loop
    writer->Write()
  end loop time
  writer->CloseCollection()
On each write, the blocks are written into the same HDF5 file using the
dataspace generated from the Extent of the data. I have only implemented
this for vtkImageData so far. All blocks are written into one hdf5 file
- and one hdf5 file is created per time step in this scenario.
4) Due to a problem with discovering Extents and WholeExtents (from a
single piece of data), which I posted to the list last week, I have added
a FullGridSize (vector of 3 ints) set/get macro so that when pieces are
added, the whole extent is known and the dataspaces can be computed
correctly. Without this, a piece with extent 0,11,0,11,0,11 does not know
if it is part of a 0,64,0,64,0,64 grid or some other original size, and
the correct dataspace cannot be inferred. (This assumes vtkImageData
currently; cf. the discussion about extents on the pv list.)
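To make the extent-to-dataspace step concrete, here is a minimal sketch of how a piece extent could be turned into a start/count selection within the full grid. It is an illustration under stated assumptions, not the writer's code: the function name piece_hyperslab is hypothetical, the whole extent is assumed to start at 0, FullGridSize is assumed to count points per axis, and the (z, y, x) ordering is assumed for the on-disk layout.

```python
def piece_hyperslab(extent, full_grid_size):
    """Return (start, count) in (z, y, x) order for one piece.

    extent is a VTK-style inclusive index range (x0, x1, y0, y1, z0, z1);
    full_grid_size is the assumed (nx, ny, nz) point count of the whole grid.
    """
    x0, x1, y0, y1, z0, z1 = extent
    nx, ny, nz = full_grid_size
    for lo, hi, n in ((x0, x1, nx), (y0, y1, ny), (z0, z1, nz)):
        if not (0 <= lo <= hi < n):
            raise ValueError("piece extent lies outside the full grid")
    start = (z0, y0, x0)
    # Inclusive ranges, so each axis holds hi - lo + 1 points.
    count = (z1 - z0 + 1, y1 - y0 + 1, x1 - x0 + 1)
    return start, count

full = (65, 65, 65)  # points for a whole extent of 0..64 on each axis
start, count = piece_hyperslab((0, 11, 0, 11, 0, 11), full)
print(start, count)
```

The piece alone only supplies start and count; the size of the dataset to create has to come from somewhere else, which is the role FullGridSize plays in the description above.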
5) Bug fix. There was an error in one of the data copy routines from
vtkDataArray to Xdmf array which munged variables with vector fields.
Scalars were OK, but vectors were wrong. Fixed it.
The (time/collection/block) additions above for the Writer have allowed
me to generate one file per time step of contiguous image blocks from a
simulation dataset which is split into many sub-blocks, and overall they
make it much easier for me to process the data on a variety of
configurations. (By which I mean reading 64 blocks on 8 processors is OK,
but on 9 it is not so good, and passing ghost info from one block to the
next is painful; when the data is stored in a single file, getting the
overlap cells is easy on any number of processors.)
The writer changes should also work if we have parallel writes going to
the same HDF5 file. We can re-use the dataspace generation so that all
pieces are correctly written. The code is actually quite small, but is
messy because each time the file is closed and opened between grid
writes, the dataspace structures are reset in the constructor of the Xdmf
objects, etc etc...
vtkXdmfReader
-------------------
1) When the reader encounters an Xdmf file with Grid type =
"TemporalCollection", it reads the time values from each grid and sets
TIME_STEPS and TIME_RANGE during UpdateInformation. Internally it
sets its output type to TemporalDataSet, but due to limitations in the
way these datasets are handled by the executives, it actually exports a
true vtkImageData/vtkUnstructuredGrid/etc during RequestDataObject. This
is actually the preferred way of handling it, since ImageData exported
downstream can be connected to image filters, whereas a TemporalDataSet
can only be connected to composite inputs. The actual type of dataset
created depends on the type of data in the Xdmf file. Temporal
collections of MultiGroup data are supported, but so far untested. The
data type can change between time steps if necessary, but this is also
untested.
1b) The description above does not preclude being able to generate
multiple time steps simultaneously. If UPDATE_TIME_STEPS has been
set to a vector of values, then we arrange for the executive to place a
TemporalDataSet on the output, which we can fill with data during
RequestDataObject. I will need to double check the order of events
between passes to ensure that this is possible, and we may need to add a
new Information key to tell the executive that multiple steps can be
generated. Previously we assumed that a temporal dataset could be output,
but in practice this confuses the pipeline, as a data object is allowed to
be either a dataset or a multigroup dataset, and the logic in the
executive gets confused if the data object can switch between all 3 of
dataset/multigroup/temporal. If you read this and do not understand what
I'm talking about, then you don't need to know and can ignore it, but
Ken Moreland (for one) knows what I mean. Adding an extra key could be a
solution to this problem. Discussion for another day.
2) When RequestData occurs, if UPDATE_TIME_STEPS is set and time steps
exist, the correct dataset is returned for the time requested.
So far only discrete time steps can be stored/retrieved, so it is not
yet possible to specify a grid that is static for times between a and b
and then changes to something else between c and d, and so on. But this
will be a goal in the future.
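The reader behaviour in items 1 and 2 can be approximated with the following sketch, which gathers one time value per grid, derives the step list and range, and picks a grid for a requested time. The sample XML layout, the helper names and the nearest-step selection policy are assumptions for illustration; the description above does not say how a requested time is matched to a discrete step.

```python
import xml.etree.ElementTree as ET

# Hypothetical file content with one <Time> per grid in the collection.
XDMF = """<Xdmf><Domain><Grid CollectionType="TemporalCollection">
  <Grid><Time value="0.0"/></Grid>
  <Grid><Time value="0.5"/></Grid>
  <Grid><Time value="1.5"/></Grid>
</Grid></Domain></Xdmf>"""

def read_time_steps(xml_text):
    """Collect one time value per child grid of the collection."""
    collection = ET.fromstring(xml_text).find("Domain/Grid")
    return [float(g.find("Time").get("value"))
            for g in collection.findall("Grid")]

def select_grid_index(time_steps, requested_time):
    """Pick the discrete step closest to the requested time (assumed policy)."""
    return min(range(len(time_steps)),
               key=lambda i: abs(time_steps[i] - requested_time))

time_steps = read_time_steps(XDMF)                # analogue of TIME_STEPS
time_range = (min(time_steps), max(time_steps))   # analogue of TIME_RANGE
print(time_steps, time_range, select_grid_index(time_steps, 0.6))
```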
3) For convenience, SetTimeStep will allow the user to loop over N time
steps (in raw vtk style code) and output each successive dataset, but if
UPDATE_TIME_STEPS is set by the executive (via the paraview GUI for
example), it will override the manually set TimeStep.
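The precedence rule in item 3 amounts to a small decision, sketched here. The function name resolve_time_step is hypothetical, update_time=None stands in for "the executive made no request", and matching a requested time to the nearest discrete step is an assumption.

```python
def resolve_time_step(manual_step, update_time, time_steps):
    """Return the index of the step to load.

    update_time is None when the pipeline has not requested a time.
    """
    if update_time is not None and time_steps:
        # A pipeline request wins over the manually set step.
        return min(range(len(time_steps)),
                   key=lambda i: abs(time_steps[i] - update_time))
    return manual_step

steps = [0.0, 0.5, 1.5]
manual_only = resolve_time_step(2, None, steps)   # no request: manual step used
overridden = resolve_time_step(2, 0.4, steps)     # request present: it wins
print(manual_only, overridden)
```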
4) Time collections of data in Xdmf will all have the same grid name (e.g.
zone1 at time 1,2,3,4,5). This currently breaks the Xdmf reader, so I
renamed the grids zone1#000, zone1#001, and so on. The names go into a
std::map structure, which would break if we used the same grid name each
time. I suspect that this map could be changed to a list, because in
practice I don't think the grid name is used anywhere else apart from the
GUI enable/disable, and we could put in another mechanism for checking.
So currently I broke this, but I doubt it matters.
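The name clash is easy to reproduce with any keyed container. The sketch below uses a Python dict as a stand-in for the std::map mentioned above and shows how a #NNN suffix keeps the entries distinct; the helper name unique_grid_names is hypothetical.

```python
def unique_grid_names(names):
    """Suffix each name with '#NNN' so repeated names stay distinct keys."""
    return ["%s#%03d" % (name, i) for i, name in enumerate(names)]

raw = ["zone1"] * 3  # the same grid name at three time steps

# Keyed by the raw name, later grids overwrite earlier ones.
collapsed = {name: index for index, name in enumerate(raw)}

# Keyed by the suffixed name, every grid keeps its own entry.
renamed = {name: index for index, name in enumerate(unique_grid_names(raw))}

print(len(collapsed), sorted(renamed))
```

A list indexed by position would avoid the renaming altogether, which is the alternative suggested above.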
There are probably some other changes, that I will remember after I've
sent this.
Summary
------------
I have no idea how other people are using vtkXdmfReader/Writer so I
really have no idea how much my changes will affect others. I doubt it
will break much and I'll be happy to collaborate on integrating changes.
Time : The implementation I've made for writing multiple time steps out,
and reading them back is working - but I'm not sure it is how the
implementation was intended to be from the xdmf list postings made by
Jerry a week or two back. My implementation allows me to basically write
T grids one after the other with a single <time value ="blah"> set in
each grid and no <time values = "list of values"> in the parent grid.
I'll hold off committing anything until I receive feedback. My next task
will be to ensure that when I read data back from N processors, the Xdmf
Reader does what it is supposed to do. Currently it does not AFAIK.
(crashes when I try it, but not debugged anything yet).
JB
--
John Biddiscombe, email:biddisco @ cscs.ch
http://www.cscs.ch/about/BJohn.php
CSCS, Swiss National Supercomputing Centre | Tel: +41 (91) 610.82.07
Via Cantonale, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82