[vtkusers] parallel mesh reading and writing

Wed Jan 24 12:46:40 EST 2007

On Jan 24, 2007, at 2:09 AM, John Biddiscombe wrote:

> Karl,
>
> I've been writing particle data from vtk using HDF5 in parallel and  
> I would highly recommend HDF5 as a starting point for meshes. Here  
> are two reasons (but keep reading for objections later):
> 1) HDF5 is going to be supported during our lifetimes and probably  
> longer than that. The fact that the creators supply an API which  
> enables anyone to easily query and extract the data makes it a  
> winner (much like netCDF in the past)
> 2) Writing in parallel can produce a single HDF5 file (if you want)
> and not a collection of files which are indexed by processor. This
> makes the visualization of data afterwards an order of magnitude
> easier if the number of processors is different for visualization
> than it was for generation. (vtk's XML collection format does allow
> you to do this too, at the expense of some wasted memory, as points
> or cells might be duplicated.)
>
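
To make point 2 concrete, here is a minimal sketch in plain Python (no HDF5 or MPI involved) of the bookkeeping behind a single shared file. Each writer places its points at an offset given by the exclusive prefix sum of the per-rank counts, and a reader can then take an even slab regardless of how many writers there were. The function names and the in-memory list standing in for the file's dataset are hypothetical illustrations, not part of any library; in a real writer the offsets would typically come from something like MPI_Exscan and the slabs would be HDF5 hyperslab selections.

```python
def exclusive_offsets(counts):
    """Return (offsets, total): offsets[i] is the sum of counts[0..i-1]."""
    offsets, running = [], 0
    for count in counts:
        offsets.append(running)
        running += count
    return offsets, running


def write_shared(per_rank_points):
    """Simulate N writers filling one shared dataset (a plain list here)."""
    counts = [len(points) for points in per_rank_points]
    offsets, total = exclusive_offsets(counts)
    shared = [None] * total
    for rank, points in enumerate(per_rank_points):
        start = offsets[rank]
        # Each rank writes only its own contiguous slab.
        shared[start:start + len(points)] = points
    return shared, offsets


def read_slab(shared, n_readers, reader):
    """Return the slab for one of n_readers, independent of the writer count."""
    base, extra = divmod(len(shared), n_readers)
    start = reader * base + min(reader, extra)
    stop = start + base + (1 if reader < extra else 0)
    return shared[start:stop]


# Three writers with uneven point counts, then two different reader counts.
per_rank = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(2.0, 0.0, 0.0)],
    [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
]
shared, offsets = write_shared(per_rank)
two_readers = [read_slab(shared, 2, r) for r in range(2)]
four_readers = [read_slab(shared, 4, r) for r in range(4)]
```

Nothing in the file layout records which rank wrote which slab, which is why reading with a different processor count is straightforward in this scheme.
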
> The vtk XML based multiblock/multifile collection data is very good  
> - and there's a lot to be said for using it. You can get a program  
> up and running in a day which will write N files out from N  
> processes and a single vtm file from process zero which references  
> all the blocks. ParaView will then be able to read the data in
> directly and the amount of work on your part will be minimal.
>
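
For illustration, a rough sketch of what "a single vtm file from process zero" might amount to, using only the Python standard library. The element and attribute names below follow the .vtm layout as I understand it from files written by recent VTK versions; older versions used different names, so this should be checked against a file written by your own VTK. The piece-naming scheme (mesh_0000.vtu and so on) and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def piece_name(rank):
    """Hypothetical naming scheme: one .vtu piece per writing process."""
    return f"mesh_{rank:04d}.vtu"


def build_vtm_index(piece_files):
    """Build the XML text of a multiblock index that references each piece."""
    root = ET.Element(
        "VTKFile",
        type="vtkMultiBlockDataSet",
        version="1.0",
        byte_order="LittleEndian",
    )
    blocks = ET.SubElement(root, "vtkMultiBlockDataSet")
    for index, name in enumerate(piece_files):
        # One <DataSet> entry per block, pointing at the file a rank wrote.
        ET.SubElement(blocks, "DataSet", index=str(index), file=name)
    return ET.tostring(root, encoding="unicode")


# Process zero would do this once it knows how many ranks wrote pieces.
index_xml = build_vtm_index([piece_name(rank) for rank in range(4)])
```

In practice the existing VTK writers produce this file for you; the sketch only shows how little the index itself contains.
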
> However, if you are going to do the job properly you'll want to
> worry about points being duplicated between processes and having a
> single master index of points, cells and the rest. What tends to
> happen with the vtk XML collection files is that, unless you are
> careful, points might be present in multiple blocks and be written
> out from multiple machines, causing duplication. ParaView itself is
> pretty bad in this respect when you extract datasets from
> multiblock data and write them out: you end up with all the
> points from the entire collection of datasets being written when in
> fact only a small subset of them are used by the block you are
> writing.
>
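
A minimal sketch of the "single master index" idea, in plain Python: blocks that share points are merged so each point is stored once and the cell connectivity is remapped to the master numbering. This assumes shared points have bit-identical coordinates, which is an assumption made for the illustration; a real implementation would more likely match on global node IDs or use a tolerance. The function name and the (points, cells) block representation are hypothetical.

```python
def merge_blocks(blocks):
    """Merge blocks into one point list and one cell list.

    blocks: list of (points, cells), where points is a list of (x, y, z)
    tuples and cells is a list of tuples of block-local point indices.
    """
    master_points = []
    index_of = {}  # coordinate tuple -> index in master_points
    master_cells = []
    for points, cells in blocks:
        local_to_master = []
        for point in points:
            if point not in index_of:
                index_of[point] = len(master_points)
                master_points.append(point)
            local_to_master.append(index_of[point])
        for cell in cells:
            # Rewrite connectivity in terms of the master numbering.
            master_cells.append(tuple(local_to_master[i] for i in cell))
    return master_points, master_cells


# Two triangles owned by different processes, sharing one edge.
block_a = ([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)], [(0, 1, 2)])
block_b = ([(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)], [(0, 1, 2)])
points, cells = merge_blocks([block_a, block_b])
```

Written block by block, the two triangles would store six points; merged, they store four.
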
snip ---
>
> I started this email with the intention of recommending HDF5 - but  
> I'm ending it with vtk's XML collection as my preferred choice as  
> it will be less work for complex stuff - and the readers and  
> writers already exist. If your data is already in vtk form, then
> the XML format will save you a lot of work.
>
> I welcome input from anyone else considering putting meshes into HDF5.

John,

    Thank you for your very detailed review of the subject.  Indeed,
I have to concur with your and Kevin's comments (earlier in the
thread): using the vtkXMLP* classes makes it very easy.  We did
indeed get an implementation working within a couple of hours, and
that is very appealing.  Nonetheless, I still see some major
advantages to dealing with one file instead of 1700 files, as you
mention in your later post.  So, in your experience, how
difficult would it be to write a general
vtkHDF5UnstructuredGridReader/Writer?  Would you be willing to
contribute to such a project?  The vtkExodus reader already
introduces a dependency on netCDF, and supposedly the next version of
netCDF will be a layer on top of HDF5.  So there may be an implicit
dependency on HDF5 within vtk in the future.

     Karl