[Paraview] reading large collections of *vti files
Favre Jean
jfavre at cscs.ch
Thu Jan 7 16:53:10 EST 2010
Folks
I have embarked on a conversion project, from XML PVTI files to HDF5 files. Although it is very convenient to have each processor store its output in its own file and then combine the whole thing with a single PVTI index (sketched after the two points below), we now face the following challenges:
1) With several tens of thousands of compute cores, we end up with several million VTI files for time-dependent outputs. Reading the PVTI files that assemble all the pieces works, but it is very slow because of the fragmentation.
2) doing "ls" in the directory, or using the File browser to open files is very, very slow.
My data converter currently works on sets of 16,384 VTI files and writes the whole global image to a single HDF5 file.

One issue I have found is that the PXMLImageDataReader seems to use twice as much memory as required. I currently convert images of resolution 881*881*3521 with one vector field, which is about 32 GB. When read in parallel on 14 servers, the pvservers each use over 5.2 GB of memory, pushing me dangerously over the swap limit, yet 32 GB divided by 14 should occupy only about 2.3 GB per server. In fact, reading the data back after conversion with the Xdmf2 reader, my pvservers use exactly 2.3 GB. I conclude that the Xdmf reader is very clean (and very efficient, by the way), and that the PXML*Reader uses far more memory than it should.
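To make the idea behind the converter concrete, here is a stripped-down, serial sketch (not the actual converter): read the VTI pieces one at a time with vtkXMLImageDataReader and copy each piece into its hyperslab of a single pre-allocated HDF5 dataset, so only one piece needs to be resident in memory at any moment. The glob pattern and the array name "velocity" are placeholders.

# sketch only: serial piece-by-piece conversion of .vti files to one HDF5 dataset
import glob
import h5py
import vtk
from vtk.util.numpy_support import vtk_to_numpy

whole_dims = (3521, 881, 881)          # (nz, ny, nx) points of the global image

with h5py.File("timestep_0000.h5", "w") as h5:
    # 881*881*3521 points x 3 components x 4 bytes ~= 32 GB on disk
    dset = h5.create_dataset("velocity", shape=whole_dims + (3,), dtype="f4")

    for fname in sorted(glob.glob("piece_*.vti")):
        reader = vtk.vtkXMLImageDataReader()
        reader.SetFileName(fname)
        reader.Update()
        image = reader.GetOutput()

        x0, x1, y0, y1, z0, z1 = image.GetExtent()
        arr = vtk_to_numpy(image.GetPointData().GetArray("velocity"))
        # VTK point data is flat with x varying fastest, so reshape as (nz, ny, nx, 3)
        arr = arr.reshape(z1 - z0 + 1, y1 - y0 + 1, x1 - x0 + 1, 3)

        # write this piece into its hyperslab of the global dataset
        dset[z0:z1 + 1, y0:y1 + 1, x0:x1 + 1, :] = arr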
Has anyone seen this memory behavior, and can anyone offer tips for reading these large collections more efficiently? I have collections of 4K to 22K files per timestep.
TIA
-------------------
Jean M. Favre
Scientific Computing Research
Swiss National Supercomputing Center
CH-6828 Manno
Switzerland