[Paraview] Xdmf Reader/Writer Changes

John Biddiscombe biddisco at cscs.ch
Wed Feb 13 09:25:38 EST 2008


Over the last week I've made some changes to the vtk-Xdmf Reader/Writer
classes. I'd like to commit these changes, but they are potentially a
cause for discussion, so I'll post my list of changes in the hope that
someone can give constructive/destructive criticism/objections.
Apologies if I make silly typos in the following, as I'm doing it from
memory; the basic gist of it should be correct.

vtkXdmfWriter
-------------------
1) Added a SetTimeValue method. When set, the writer adds <Time
value="..."/> to the generated file (grid). The principal reason for
this addition is to allow

Loop
  read stuff
  writer->SetTimeValue(...)
  writer->Write()
EndLoop

The generated Xdmf file then contains a collection of grids for the data
you have been writing - see the next entry for more.

2) Added a flag AppendGridsToDomain, which should be set before the
above looping procedure. This stops the Xdmf writer from creating a new
*.xdmf file on each write; instead it opens the existing one if present,
reads the text into a string, appends the new grid and then closes it
cleanly, thus allowing multiple grids in the same file.
Also added an extra CollectionType ivar (use "TemporalCollection") and a
writer->CloseCollection() call, which lets you write the tail of the
file after a grid collection has been written.
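
To make the append-and-close sequence of items 1 and 2 concrete, here is
a minimal, self-contained C++ sketch that builds the light-data XML in a
string instead of going through the real writer. The helper names
(BeginCollection, AppendGrid, CloseCollection as free functions) and the
exact element and attribute spelling are assumptions for illustration,
not the vtkXdmfWriter API; only the idea of one time value per grid and
a tail written once at the end comes from the description above.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: start a light-data document holding one grid
// collection. The collection type string is passed through unchanged.
std::string BeginCollection(const std::string& collectionType)
{
  return "<Xdmf>\n<Domain>\n<Grid GridType=\"" + collectionType + "\">\n";
}

// Hypothetical helper: append one grid carrying a single time value.
// The attribute spelling "Value" is an assumption.
std::string AppendGrid(const std::string& doc, const std::string& name,
                       double time)
{
  std::ostringstream out;
  out << doc
      << "  <Grid Name=\"" << name << "\">\n"
      << "    <Time Value=\"" << time << "\"/>\n"
      << "  </Grid>\n";
  return out.str();
}

// Hypothetical helper: write the tail once, after the last grid.
std::string CloseCollection(const std::string& doc)
{
  return doc + "</Grid>\n</Domain>\n</Xdmf>\n";
}
```

Calling BeginCollection once, AppendGrid once per time step and
CloseCollection at the end mirrors the loop above; the real writer
additionally has to reopen the file on disk between writes.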

3) Added a flag InputsArePieces - this allows me to read N blocks
(typically 64 from 64 processors) one by one and call AddInput as
follows

writer->SetInputsArePieces
writer->SetAppendGridsToDomain
writer->SetFullGridSize(x,y,z)
writer->SetCollectionType("TemporalCollection")
loop over time steps
  writer->SetHeavyDataSetName(hdf5 file name generated from prefix + time step)
  writer->SetTimeValue()
  loop over blocks
    read block
    writer->AddInput()
  end block loop
  writer->Write()
end loop time
writer->CloseCollection()

On each write, the blocks are written into the same HDF5 file using the 
dataspace generated from the Extent of the data. I have only implemented 
this for vtkImageData so far. All blocks are written into one hdf5 file 
- and one hdf5 file is created per time step in this scenario.

4) Due to a problem with discovering Extents and WholeExtents (from a
single piece of data) that I posted to the list last week, I have added
a FullGridSize (3-int vector) set/get macro so that when pieces are
added, the whole extent is known and the dataspaces can be computed
correctly. Without this, a piece with extent 0,11,0,11,0,11 does not
know if it is part of a 0,64,0,64,0,64 or some other original size, and
the correct dataspace cannot be inferred. (Assumes vtkImageData
currently, c.f. the discussion about extents on the pv list.)
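
As a rough illustration of why the full grid size is needed, the sketch
below turns a piece extent into an offset and a count along each axis
and checks them against the whole grid. The Hyperslab struct and
ComputeHyperslab function are hypothetical names, and treating
FullGridSize as a number of points starting at index 0 is an assumption;
this is not the code in the writer.

```cpp
#include <array>
#include <cassert>

// Hypothetical description of where one piece lands in the full dataset.
struct Hyperslab
{
  std::array<long, 3> start; // offset of the piece along x, y, z
  std::array<long, 3> count; // number of points along x, y, z
};

// extent is VTK-style {xmin, xmax, ymin, ymax, zmin, zmax}, inclusive.
// fullGridSize is assumed to be the number of points of the whole grid
// along x, y, z, with indices starting at 0.
Hyperslab ComputeHyperslab(const std::array<int, 6>& extent,
                           const std::array<int, 3>& fullGridSize)
{
  Hyperslab slab{};
  for (int axis = 0; axis < 3; ++axis)
  {
    const int lo = extent[2 * axis];
    const int hi = extent[2 * axis + 1];
    // The piece must lie inside the whole grid, otherwise no valid
    // selection exists.
    assert(lo >= 0 && hi >= lo && hi < fullGridSize[axis]);
    slab.start[axis] = lo;
    slab.count[axis] = hi - lo + 1;
  }
  return slab;
}
```

The piece extent alone gives the count, but only the full size says how
large the target dataset must be. HDF5 dataspaces are row-major, so a
real implementation would probably hand these values over in z, y, x
order.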

5) Bug fix. There was an error in one of the Data copy routines from 
vtkDataArray to Xdmf array which munged variables with vector fields. 
Scalars were ok, but vectors wrong. Fixed it.

The (time/collection/block) additions above for the Writer have allowed 
me to generate one file per time step of contiguous image blocks from a 
simulation dataset which is split into many sub blocks - and overall 
make life much easier for me to process the data on a variety of 
configurations. (By which I mean reading 64 blocks on 8 processors is
ok, but on 9 is not so good, and passing ghost info from one block to
the next is painful, but when stored in a single file, getting the
overlap cells is easy on any number of processors.)

The writer changes should also work if we have parallel writes going to 
the same HDF5 file. We can re-use the data space generation so that all 
pieces are correctly written. The code is actually quite small, but is
messy because each time the file is closed and reopened between grid
writes, the dataspace structures are reset in the constructors of the
Xdmf objects, etc.


vtkXdmfReader
-------------------
1) When the reader encounters an Xdmf file with Grid type =
"TemporalCollection", it reads the time value from each grid and sets
TIME_STEPS and TIME_RANGE during UpdateInformation. Internally it sets
its output type to TemporalDataSet, but due to limitations in the way
these datasets are handled by the executives, it actually exports a true
vtkImageData/vtkUnstructuredGrid/etc during RequestDataObject. This is
actually the preferred way of handling it, since ImageData exported
downstream can be connected to image filters, whereas a TemporalDataSet
can only be connected to composite inputs. The actual type of dataset
created depends on the type of data in the xdmf file. Temporal
collections of MultiGroup data are supported, but so far untested. The
data type can change between time steps if necessary, but this is also
untested.
1b) The description above does not preclude being able to generate
multiple time steps simultaneously: if UPDATE_TIME_STEPS has been set to
a vector of values, then we arrange for the executive to place a
TemporalDataSet on the output, which we can fill with data during
RequestDataObject. I will need to double check the order of events
between passes to ensure that this is possible, and we may need to add a
new Information key to tell the executive that multiple steps can be
generated. Previously we assumed that a temporal dataset could be
output, but in practice this confuses the pipeline, as a dataobject is
allowed to be either a dataset or a multigroup dataset, and the logic in
the executive gets confused if the dataobject can switch between all 3
of dataset/multigroup/temporal. If you read this and do not understand
what I'm talking about, then you don't need to know and can ignore it,
but Ken Moreland (for one) knows what I mean. Adding an extra key could
be a solution to this problem. Discussion for another day.

2) When RequestData occurs, if UPDATE_TIME_STEPS is set and time steps
exist, the correct data set is returned for the time requested.
So far only discrete time steps can be stored/retrieved, so it is not
yet possible to specify a grid that is static for times between a and b
and then changes to something else between c and d, etc. But this will
be a goal in the future.

3) For convenience, SetTimeStep will allow the user to loop over N time 
steps (in raw vtk style code) and output each successive dataset, but if 
UPDATE_TIME_STEPS is set by the executive (via the paraview GUI for 
example), it will override the manually set TimeStep.
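
The selection logic of items 2 and 3 can be sketched in a few lines of
plain C++. FindNearestStep and ResolveStep are hypothetical names, and
matching the requested time to the nearest stored value is an
assumption; the reader may well match times differently. The only rule
taken from the text is that a time requested by the pipeline wins over
the manually set step.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Index of the stored time step closest to the requested time.
// Nearest-value matching is an assumption. times must not be empty.
std::size_t FindNearestStep(const std::vector<double>& times,
                            double requested)
{
  std::size_t best = 0;
  for (std::size_t i = 1; i < times.size(); ++i)
  {
    if (std::fabs(times[i] - requested) <
        std::fabs(times[best] - requested))
    {
      best = i;
    }
  }
  return best;
}

// Hypothetical resolution rule: a time requested by the pipeline
// overrides the manually set step index.
std::size_t ResolveStep(const std::vector<double>& times,
                        std::size_t manualStep,
                        bool pipelineTimeSet,
                        double pipelineTime)
{
  if (pipelineTimeSet && !times.empty())
  {
    return FindNearestStep(times, pipelineTime);
  }
  return manualStep;
}
```

With stored times 0.0, 0.5, 1.0, 1.5 and a manual step of 3, the manual
step is used until the pipeline asks for a time, at which point the step
whose time is closest to the request is returned instead.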

4) A time collection of data in XDMF will have the same grid name at
every step (eg zone1 at time 1,2,3,4,5) - this currently breaks the xdmf
reader, because the names go into a std::map structure, which breaks if
the same grid name is used each time. So I renamed the grids to
zone1#000, zone1#001, etc. I suspect that this map could be changed to a
list because, as far as I know, the grid name is not used anywhere else
apart from the GUI enable/disable, and we could put in another mechanism
for checking. So currently I broke this but I doubt it matters.
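
The name clash is easy to reproduce outside the reader. In the sketch
below, MakeStepGridName and RegisterGrid are hypothetical helpers and
the map only stands in for the reader's internal structure; the "#%03d"
suffix follows the renaming described above.

```cpp
#include <cstdio>
#include <map>
#include <string>
#include <utility>

// Build a per-step key such as "zone1#000" from a grid name and a
// step index.
std::string MakeStepGridName(const std::string& gridName, int step)
{
  char suffix[16];
  std::snprintf(suffix, sizeof(suffix), "#%03d", step);
  return gridName + suffix;
}

// Hypothetical registry keyed by grid name. Returns false if the key
// already exists, which is what happens when every time step reuses
// the plain grid name.
bool RegisterGrid(std::map<std::string, int>& grids,
                  const std::string& key, int step)
{
  return grids.insert(std::make_pair(key, step)).second;
}
```

Registering "zone1" twice fails on the second call, while the suffixed
names for steps 0 and 1 are both accepted. A list would avoid the clash
without renaming, at the cost of lookup by name.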

There are probably some other changes, that I will remember after I've 
sent this.

Summary
------------
I have no idea how other people are using vtkXdmfReader/Writer so I 
really have no idea how much my changes will affect others. I doubt it 
will break much and I'll be happy to collaborate on integrating changes.

Time : The implementation I've made for writing multiple time steps out, 
and reading them back is working - but I'm not sure it is how the 
implementation was intended to be from the xdmf list postings made by 
Jerry a week or two back. My implementation allows me to basically write 
T grids one after the other with a single <time value ="blah"> set in 
each grid and no <time values = "list of values"> in the parent grid.

I'll hold off committing anything until I receive feedback. My next task 
will be to ensure that when I read data back from N processors, the Xdmf 
Reader does what it is supposed to do. Currently it does not AFAIK. 
(crashes when I try it, but not debugged anything yet).

JB

-- 
John Biddiscombe,                            email:biddisco @ cscs.ch
http://www.cscs.ch/about/BJohn.php
CSCS, Swiss National Supercomputing Centre  | Tel:  +41 (91) 610.82.07
Via Cantonale, 6928 Manno, Switzerland      | Fax:  +41 (91) 610.82.82



