[Paraview] Re: [vtk-developers] Xdmf Reader/Writer Changes

Wed Feb 13 12:52:50 EST 2008

John,

I'm still going through this to understand the implications. But I think
that some of this needs to be reflected in the XDMF library as opposed
to just being in the reader/writer. For example, if there is a new
GridType="TemporalCollection" it should be in XdmfGrid.[h cxx] so that
it propagates to all other software.

Maybe instead of GridType="TemporalCollection" we have 
GridType="Collection" and then CollectionType="Temporal | Spatial".
If you needed both you could have a temporal collection of spatial
collections. This would need support in XdmfGrid but would be easy to add.
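
To make the nested idea concrete, here is a minimal Python sketch that
emits a temporal collection of spatial collections. The GridType and
CollectionType attributes follow the proposal above; the grid names, the
Uniform leaf grids, the spelling of the Time attribute and the helper
name build_nested_collection are illustrative assumptions, not the
actual XdmfGrid implementation.

```python
import xml.etree.ElementTree as ET


def build_nested_collection(times, pieces_per_step):
    """Build a temporal collection whose children are spatial collections."""
    # Outer grid: one entry per time step (attribute names as proposed above).
    temporal = ET.Element("Grid", Name="AllSteps",
                          GridType="Collection", CollectionType="Temporal")
    for step, t in enumerate(times):
        # Inner grid: the spatial pieces that make up this time step.
        spatial = ET.SubElement(temporal, "Grid", Name=f"Step{step:03d}",
                                GridType="Collection",
                                CollectionType="Spatial")
        # Assumed spelling of the time attribute; adjust to the real schema.
        ET.SubElement(spatial, "Time", Value=str(t))
        for p in range(pieces_per_step):
            # Placeholder leaf grids; real ones would carry topology and data.
            ET.SubElement(spatial, "Grid", Name=f"Piece{p:03d}",
                          GridType="Uniform")
    return temporal


root = build_nested_collection([0.0, 0.5], pieces_per_step=2)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```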

Jerry

John Biddiscombe wrote:
> Over the last week I've made some changes to the vtk-Xdmf Reader/Writer 
> classes. I'd like to commit these changes, but they are potentially a 
> cause for discussion so I'll post my list of changes in the hope someone 
> can give constructive/destructive criticism/objections ...apologies if I 
> make silly typos in the following as I'm doing it from memory, the basic 
> gist of it should be correct.
> 
> vtkXdmfWriter
> -------------------
> 1) Added a SetTimeValue method. When set, the writer adds <Time 
> value="..."/> to the generated file (grid). The principal reason for 
> this addition is to allow
> Loop
>  read stuff
>  SetTimeValue(...)
>  writer->Write()
> EndLoop
> 
> the generated Xdmf file now contains a collection of grids for the data 
> you have been writing - see next entry for additional...
> 
> 2) Added flag AppendGridsToDomain which should be set before the above
> looping procedure. Instead of creating a new *.xdmf file on each write,
> the Xdmf writer opens the existing one if present, reads the text into a
> string, appends the new grid and then closes it cleanly, thus allowing
> multiple grids in the same file.
> Also added an extra CollectionType ivar (use "TemporalCollection") and a
> writer->CloseCollection() call which lets you write the tail of the file
> after a grid collection has been written.
> 
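
The append-then-close behaviour in item 2 can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: HEADER,
TAIL, append_grid and close_collection are hypothetical names, the file
layout is a guess, and the real vtkXdmfWriter code may handle the tail
differently.

```python
# Hypothetical file skeleton; the real writer's header and tail may differ.
HEADER = ('<?xml version="1.0" ?>\n<Xdmf>\n<Domain>\n'
          '<Grid GridType="TemporalCollection">\n')
TAIL = "</Grid>\n</Domain>\n</Xdmf>\n"


def append_grid(path, grid_xml):
    """Append one grid to the collection in `path`, creating it if needed."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            text = f.read()          # read the existing text into a string
    except FileNotFoundError:
        text = HEADER                # first write: start a new collection
    if text.endswith(TAIL):
        text = text[:-len(TAIL)]     # reopen a collection that was closed
    with open(path, "w", encoding="utf-8") as f:
        f.write(text + grid_xml.rstrip("\n") + "\n")


def close_collection(path):
    """Write the tail of the file once all grids have been appended."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(TAIL)
```

In this sketch the file is only well-formed XML after close_collection
has been called, which mirrors the need for an explicit
CloseCollection() at the end of the loop.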
> 3) Added a flag InputsArePieces - this allows me to read N blocks 
> (typically 64 from 64 processors) one by one, call AddInput as follows
> 
> writer->SetInputsArePieces
> writer->SetAppendGridsToDomain
> writer->SetFullGridSize(x,y,z)
> writer->SetCollectionType("TemporalCollection")
> loop over time steps
>  writer->SetHeavyDataSetName(hdf5 file name generated from prefix + time 
> step)
>  writer->SetTimeValue()
>  loop over blocks
>    read block
>    writer->AddInput()
>  end block loop
>  writer->Write()
> end loop time
> writer->CloseCollection()
> 
> On each write, the blocks are written into the same HDF5 file using the 
> dataspace generated from the Extent of the data. I have only implemented 
> this for vtkImageData so far. All blocks are written into one hdf5 file 
> - and one hdf5 file is created per time step in this scenario.
> 
> 4) Due to a problem with discovering Extents and wholeExtents (from a 
> single piece of data) I posted to the list last week, I have added 
> FullGridSize (Vector 3 int) set get macro so that when pieces are added, 
> the whole extent is known and the dataspaces can be computed correctly. 
> Without this, a piece with extent 0,11,0,11,0,11 does not know if it is 
> part of a 0,64,0,64,0,64 or some other original size and the correct 
> dataspace cannot be inferred. (assumes vtkImageData currently, c.f. 
> discussion about extents on the pv list)
> 
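
The dataspace arithmetic behind item 4 can be sketched as follows. This
is a minimal Python illustration, not the writer's code: it assumes
VTK-style inclusive point extents, assumes FullGridSize holds the number
of points along each axis, and assumes the slowest-varying axis comes
first in the HDF5 dataset. The function name piece_hyperslab is
hypothetical.

```python
def piece_hyperslab(piece_extent, full_grid_size):
    """Return (start, count, full_dims) in (z, y, x) order for one piece.

    piece_extent   -- (x0, x1, y0, y1, z0, z1), inclusive point indices
    full_grid_size -- (nx, ny, nz), number of points along each axis
    """
    x0, x1, y0, y1, z0, z1 = piece_extent
    nx, ny, nz = full_grid_size
    start = (z0, y0, x0)                               # offset in the file
    count = (z1 - z0 + 1, y1 - y0 + 1, x1 - x0 + 1)    # points in the piece
    full_dims = (nz, ny, nx)                           # whole dataset shape
    for s, c, n in zip(start, count, full_dims):
        if s < 0 or c < 1 or s + c > n:
            raise ValueError("piece extent lies outside the full grid")
    return start, count, full_dims


# The piece from the text: extent 0,11 on each axis inside a 0,64 grid,
# i.e. 65 points per axis under the assumptions above.
start, count, full_dims = piece_hyperslab((0, 11, 0, 11, 0, 11), (65, 65, 65))
```

Without the full grid size, only start and count are available, and the
shape of the dataset that the piece belongs to cannot be recovered.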
> 5) Bug fix. There was an error in one of the Data copy routines from 
> vtkDataArray to Xdmf array which munged variables with vector fields. 
> Scalars were ok, but vectors wrong. Fixed it.
> 
> The (time/collection/block) additions above for the Writer have allowed 
> me to generate one file per time step of contiguous image blocks from a 
> simulation dataset which is split into many sub blocks - and overall 
> make life much easier for me to process the data on a variety of 
> configurations. (by which I mean reading 64 blocks on 8 processors is ok, 
> but on 9, is not so good, and passing ghost info from one block to the 
> next is painful, but when stored in a single file, getting the overlap 
> cells is easy on any number of processors).
> 
> The writer changes should also work if we have parallel writes going to 
> the same HDF5 file. We can re-use the data space generation so that all 
> pieces are correctly written. The code is actually quite small, but is 
> messy because each time the file is closed and opened between grid 
> writes, the dataspace structures are reset in the constructor of the Xdmf 
> objects. etc etc...
> 
> 
> vtkXdmfReader
> -------------------
> 1) When the reader encounters an Xdmf file with Grid type = 
> "TemporalCollection", it reads the time values from each grid and sets 
> the TIME_STEPS and TIME_RANGE during UpdateInformation. Internally it 
> sets its output type to TemporalDataSet, but due to limitations of the 
> way these datasets are handled by the executives, it actually exports a 
> true vtkImage/vtkUnstructuredGrid/etc during RequestDataObject. This is 
> actually the preferred way of handling it since ImageData exported 
> downstream can be connected to image filters, whereas TemporalDataset 
> can only be connected to composite inputs. The actual type of dataset 
> created depends on the type of data in the xdmf file. Temporal 
> collections of MultiGroup data are supported, but so far untested. The 
> data type can change between time steps if necessary, but this is also 
> untested.
> 1b) The description above does not preclude being able to generate 
> multiple time steps simultaneously. If the UPDATE_TIME_STEPS has been 
> set to a vector of values, then we arrange for the executive to place a 
> TemporalDataset on the output which we can fill with data during 
> RequestDataObject. I will need to double check the order of events 
> between passes to ensure that this is possible and we may need to add a 
> new Information key to tell the executive that multiple steps can be 
> generated. Previously we assumed that a temporal dataset could be output, 
> but in practise this confuses the pipeline as a dataobject is allowed to 
> be either a dataset or a multigroup dataset and the logic in the 
> executive gets confused if the dataobject can switch between all 3 of 
> dataset/multigroup/temporal. If you read this and do not understand what 
> I'm talking about, then you don't need to know and can ignore it, but 
> Ken Moreland (for one) knows what I mean. Adding an extra key could be a 
> solution to this problem. Discussion another day.
> 
> 2) When Request Data occurs, if UPDATE_TIME_STEPS is set and time steps 
> exist, the correct data set is returned for the time requested.
> So far only discrete time steps can be stored/retrieved and so it is not 
> yet possible to specify a grid that is static for times between a<b and 
> then changes to something else between c<d etc etc. But this will be a 
> goal in the future.
> 
> 3) For convenience, SetTimeStep will allow the user to loop over N time 
> steps (in raw vtk style code) and output each successive dataset, but if 
> UPDATE_TIME_STEPS is set by the executive (via the paraview GUI for 
> example), it will override the manually set TimeStep.
> 
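
The selection logic in items 2 and 3 can be sketched in Python as below.
The precedence (a pipeline-requested time overrides the manually set
TimeStep) is taken from the description above; matching the request to
the nearest discrete time value, clamping the manual index, and the name
resolve_time_index are assumptions made for illustration only.

```python
def resolve_time_index(time_values, manual_step=0, requested_time=None):
    """Pick which stored grid to return.

    time_values    -- discrete time values read from the grids
    manual_step    -- index set by the user, e.g. via SetTimeStep
    requested_time -- time asked for by the pipeline, or None if unset
    """
    if not time_values:
        raise ValueError("no time steps available")
    if requested_time is None:
        # No pipeline request: use the manual index, clamped to valid range.
        return max(0, min(manual_step, len(time_values) - 1))
    # A pipeline request overrides the manual step. Only discrete steps are
    # stored, so take the closest one (assumed matching rule).
    return min(range(len(time_values)),
               key=lambda i: abs(time_values[i] - requested_time))


times = [0.0, 0.5, 1.0]
manual_choice = resolve_time_index(times, manual_step=2)
pipeline_choice = resolve_time_index(times, manual_step=2, requested_time=0.4)
```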
> 4) Time collections of data in XDMF will all have the same grid name (eg 
> zone1 at time 1,2,3,4,5) - this currently breaks the xdmf reader, so I 
> renamed the grids with zone1#000, zone1#001, etc. The names go into a 
> std::map structure, which would break if we used the same grid name each 
> time. I suspect that this map could be modified to a list because in 
> practice the grid name is not used anywhere else, I don't think, apart 
> from the GUI enable/disable and we could put in another mechanism for 
> checking. So currently I broke this but I doubt it matters.
> 
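
The renaming workaround in item 4 amounts to appending a per-name
counter so that repeated grid names stay unique as map keys. A minimal
Python sketch, assuming a three-digit zero-padded suffix as in the
zone1#000 example; the function name unique_grid_names is hypothetical
and this is not the reader's actual code.

```python
def unique_grid_names(names):
    """Suffix each grid name with '#NNN', counting repeats per name."""
    counters = {}
    result = []
    for name in names:
        index = counters.get(name, 0)    # how many times seen so far
        counters[name] = index + 1
        result.append(f"{name}#{index:03d}")
    return result


renamed = unique_grid_names(["zone1", "zone1", "zone1"])
# Each renamed grid can now be used as a distinct key in a map.
grid_map = {name: step for step, name in enumerate(renamed)}
```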
> There are probably some other changes, that I will remember after I've 
> sent this.
> 
> Summary
> ------------
> I have no idea how other people are using vtkXdmfReader/Writer so I 
> really have no idea how much my changes will affect others. I doubt it 
> will break much and I'll be happy to collaborate on integrating changes.
> 
> Time : The implementation I've made for writing multiple time steps out, 
> and reading them back is working - but I'm not sure it is how the 
> implementation was intended to be from the xdmf list postings made by 
> Jerry a week or two back. My implementation allows me to basically write 
> T grids one after the other with a single <time value ="blah"> set in 
> each grid and no <time values = "list of values"> in the parent grid.
> 
> I'll hold off committing anything until I receive feedback. My next task 
> will be to ensure that when I read data back from N processors, the Xdmf 
> Reader does what it is supposed to do. Currently it does not AFAIK. 
> (crashes when I try it, but not debugged anything yet).
> 
> JB
>