[Paraview] Need advice with parallel file format

Burlen Loring burlen.loring at gmail.com
Mon May 5 14:22:00 EDT 2014


Hi Mohammad,

This is not an answer to your question, but there is a usage caveat w/ 
VTK XML files that I want to make sure you're aware of.  When you use 
that format, make sure you set the data mode to "appended" and turn 
"encode" off. That combination produces raw binary files, which are 
going to be faster to read and write and very likely smaller too. You 
probably already know that, but just in case ...
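
A minimal sketch of that setting, assuming the VTK Python bindings. The 
helper name use_raw_appended_binary is made up for illustration; 
SetDataModeToAppended() and EncodeAppendedDataOff() are the vtkXMLWriter 
methods that correspond to the "appended" mode and the "encode" option:

```python
def use_raw_appended_binary(writer):
    """Configure a VTK XML writer for raw (non-base64) appended binary output.

    `writer` is assumed to be an instance of a vtkXMLWriter subclass, e.g.
    vtk.vtkXMLUnstructuredGridWriter or vtk.vtkXMLPUnstructuredGridWriter.
    (Hypothetical helper, not part of VTK itself.)
    """
    writer.SetDataModeToAppended()   # put array data in one appended block
    writer.EncodeAppendedDataOff()   # keep that block raw, not base64-encoded
    return writer


# Usage sketch (assumes the `vtk` package is installed and `grid` is a
# vtkUnstructuredGrid you have already built):
#
#   import vtk
#   writer = use_raw_appended_binary(vtk.vtkXMLUnstructuredGridWriter())
#   writer.SetFileName("step_0000.vtu")   # hypothetical file name
#   writer.SetInputData(grid)
#   writer.Write()
```

The helper only touches the two options discussed here, so compression 
and everything else stay at whatever the writer already uses.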

Now, to get to your question:
> 1) Go with a single parallel HDF5 file that includes data for all 
> time-steps. This makes it all nice and portable except there are two 
> issues. i) It looks like doing MPI-IO might not be as efficient as 
> separate POSIX IO, especially on large number of processors. ii) 
> ParaView does not seem to be able to read HDF5 files in parallel
Comment: If I were you I'd avoid putting all time steps in a single 
file, or any solution where files get too big. Once a file occupies more 
than ~80% of a tape drive you'll have a very hard time getting it on and 
off archive systems. See this: 
http://www.nersc.gov/users/data-and-file-systems/hpss/storing-and-retrieving-data/mistakes-to-avoid/ 
My comment assumes that you actually use such systems, but you probably 
will need to if you generate large datasets at common HPC centers.
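
To put rough numbers on this, here is a back-of-the-envelope sketch. It 
assumes output size scales linearly with grid points from the sample run 
quoted below (about 8.5 GB for 100M points) and that this size is per 
time step. The 1 TB tape capacity is a made-up placeholder, so 
substitute the figure for your own archive system. The function name is 
hypothetical too.

```python
def max_steps_per_file(bytes_per_step, tape_capacity_bytes, fill_limit=0.8):
    """Whole time steps that fit in one file while the file stays under
    `fill_limit` (the ~80% rule of thumb) of a single tape."""
    budget = fill_limit * tape_capacity_bytes
    return int(budget // bytes_per_step)


# Sample run from this thread: ~8.5 GB for 100M grid points.
bytes_per_point = 8.5e9 / 100e6          # 85 bytes per grid point

# ASSUMPTION: size scales linearly up to the 1B-point target.
bytes_per_step = bytes_per_point * 1e9   # ~85 GB per time step

# ASSUMPTION: 1 TB tape capacity, a placeholder value only.
tape_capacity = 1e12

steps = max_steps_per_file(bytes_per_step, tape_capacity)
print(steps)
```

Under those assumptions only a handful of 1B-point time steps fit in one 
archivable file, which is why one file for all time steps gets risky.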

I've seen some AMR codes get elaborate in their HDF5 formats and run 
into serious performance issues as a result. So my comment here is that 
if you go with HDF5, keep the format as simple as possible, and of 
course keep file sizes small enough to be archived ;-)

Burlen


On 05/05/2014 10:48 AM, Mohammad Mirzadeh wrote:
> They are represented as an unstructured grid. As a sample run, a 100M 
> grid point case on 256 procs produces an almost 8.5G file. We intend to 
> push the limits close to 1B at most at this time, with the number of 
> processors up to a few thousand. However, it would be good to have 
> something that could scale to larger problems as well.
>
>
> On Sat, May 3, 2014 at 1:28 AM, Stephen Wornom 
> <stephen.wornom at inria.fr> wrote:
>
>     Mohammad Mirzadeh wrote:
>
>         Hi, I am at a critical point in deciding on an I/O format for
>         my application. So far my conclusion is to use parallel HDF5
>         for restart files, as they are quite flexible and portable
>         across systems.
>
>         When it comes to visualization, however, I'm not quite sure.
>         Up until now I've been using pvtu along with vtu files, and
>         although they generally work fine, one easily gets into trouble
>         when running big simulations on a large number of processors:
>         the number of files can easily get out of control, and even
>         the simplest utility commands (e.g. ls) take minutes to finish!
>
>         After much thought I've narrowed the decision down to two
>         strategies:
>
>         1) Go with a single parallel HDF5 file that includes data for
>         all time-steps. This makes it all nice and portable except
>         there are two issues. i) It looks like doing MPI-IO might not
>         be as efficient as separate POSIX IO, especially on large
>         number of processors. ii) ParaView does not seem to be able to
>         read HDF5 files in parallel
>
>         2) Go with the same pvtu+vtu strategy except take precautions
>         to avoid file explosions. I can think of two strategies here:
>         i) use nested folders to separate vtu files from pvtu files and
>         also to separate each time step ii) create an IO group
>         communicator with far fewer processors that do the actual IO.
>
>         My questions are 1) Is the second approach necessarily more
>         efficient than MPI-IO used in HDF5? and 2) Is there any plan
>         to support parallel IO for HDF5 files in paraview?
>
>
>         _______________________________________________
>         Powered by www.kitware.com
>
>         Visit other Kitware open-source projects at
>         http://www.kitware.com/opensource/opensource.html
>
>         Please keep messages on-topic and check the ParaView Wiki at:
>         http://paraview.org/Wiki/ParaView
>
>         Follow this link to subscribe/unsubscribe:
>         http://www.paraview.org/mailman/listinfo/paraview
>
>     Are your meshes structured or unstructured? How many vertices in
>     your meshes?
>
>     Stephen
>
>     -- 
>     stephen.wornom at inria.fr
>     2004 route des lucioles - BP93
>     Sophia Antipolis
>     06902 CEDEX
>
>     Tel: 04 92 38 50 54
>     Fax: 04 97 15 53 51
>
