[Paraview] [EXTERNAL] Re: Need advice with parallel file format

Moreland, Kenneth kmorel at sandia.gov
Sat May 3 10:06:32 EDT 2014


Reason 1 applies just as much to vis as to restart. More so, in fact, since you usually do vis on a different number of processors than the sim.

You may want to rethink reason 2. It may not seem like much now, but restarts can take a significant proportion of the run time. And ironically the longer they take, the more you have to run them (there is a whole theory behind that).
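The message does not name the theory it alludes to. One commonly cited first-order model is Young's approximation, which puts the checkpoint interval at sqrt(2 * C * M) for a checkpoint write time C and a mean time between failures M. Whether this is the theory meant here is an assumption, and the failure rate and write times below are made-up illustrative numbers. A minimal sketch:

```python
import math

def young_interval(checkpoint_cost_s, mtbf_s):
    """Young's first-order estimate of the checkpoint interval: sqrt(2 * C * M)."""
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)

def overhead_fraction(checkpoint_cost_s, mtbf_s, interval_s):
    """First-order fraction of run time lost: time spent writing checkpoints
    plus, on average, half an interval of recomputation per failure."""
    return checkpoint_cost_s / interval_s + interval_s / (2.0 * mtbf_s)

mtbf = 24 * 3600.0           # assumed: one failure per day
for cost in (60.0, 600.0):   # assumed checkpoint write times in seconds
    tau = young_interval(cost, mtbf)
    print(cost, round(tau), round(overhead_fraction(cost, mtbf, tau), 3))
```

In this model the lost fraction at the chosen interval works out to sqrt(2 * C / M), so a slower checkpoint write costs more of the run even after the interval is re-tuned.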

At any rate, lots of smart people have worked (and continue to work) on making I/O libraries like HDF5 fast. In general, I would expect HDF5 to be much faster than the VTK formats.

-Ken

Sent from my iPad so blame autocorrect.

On May 2, 2014, at 10:35 PM, "Mohammad Mirzadeh" <mirzadeh at gmail.com> wrote:

I don't have any specific plan for that but the rationale for using HDF5 for restart is twofold:

1) The restart file could be read later by a different set of processors and should preferably include useful meta information about the run (date/time, SHA1 of git commit, run parameters, etc.).
2) The restart file is assumed to be written less frequently than vis output, so any performance loss should not be a big issue (hopefully).
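Point 1 can be illustrated without any HDF5 calls. Assuming the restart file stores a field as one global 1-D array (an assumption, not something stated in this thread), each reading rank only needs an offset and a count into that array, whatever number of ranks wrote it. The helper name `block_range` is hypothetical:

```python
def block_range(n_items, n_ranks, rank):
    """Return (offset, count) of the contiguous block owned by `rank`
    when n_items are split as evenly as possible over n_ranks."""
    base, extra = divmod(n_items, n_ranks)
    count = base + (1 if rank < extra else 0)   # first `extra` ranks get one more
    offset = rank * base + min(rank, extra)
    return offset, count

n_cells = 10                 # assumed global array length in the restart file
for n_ranks in (4, 3):       # e.g. written on 4 ranks, read back on 3
    print(n_ranks, [block_range(n_cells, n_ranks, r) for r in range(n_ranks)])
```

The same (offset, count) pair is what a parallel reader would use to select its slab of the global dataset.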

Also, parallel loading of the vis file is necessary, as ParaView seems to default to loading everything on rank 0, which would severely limit the size of the vis file (I'd be happy to be proven wrong on this one). All that said, I'm willing to move away from HDF5 if it proves too costly for restart files as well. It just seems to me, after two days of searching online, that working with parallel HDF5 (and MPI-IO in general) is tricky and subject to performance loss on large numbers of processors. (Again, I'd be happy to learn from others' experience here.)


On Fri, May 2, 2014 at 7:12 PM, Moreland, Kenneth <kmorel at sandia.gov> wrote:
What are you doing for your restart files? You said those are HDF5 and they must be at least as large as anything you output for Vis. Presumably you got that working pretty well (or are committed to getting it to work well). Why not write the Vis output similarly?

-Ken

Sent from my iPad so blame autocorrect.

> On May 2, 2014, at 6:50 PM, "Mohammad Mirzadeh" <mirzadeh at gmail.com> wrote:
>
> Hi, I am at a critical point in deciding on an I/O format for my application. So far my conclusion is to use parallel HDF5 for restart files, as they are quite flexible and portable across systems.
>
> When it comes to visualization, however, I'm not quite sure. Up until now I've been using pvtu along with vtu files, and although they generally work fine, one easily gets into trouble when running big simulations on a large number of processors, as the number of files can easily get out of control and even the simplest utility commands (e.g. ls) take minutes to finish!
>
> After much thought I've come to the point of deciding between two strategies:
>
> 1) Go with a single parallel HDF5 file that includes data for all time steps. This makes it all nice and portable, except there are two issues: i) it looks like MPI-IO might not be as efficient as separate POSIX I/O, especially on a large number of processors; ii) ParaView does not seem to be able to read HDF5 files in parallel.
>
> 2) Go with the same pvtu+vtu strategy, except take precautions to avoid file explosions. I can think of two strategies here: i) use nested folders to separate vtu files from pvtu files and also to separate each time step; ii) create an I/O group communicator with far fewer processors that do the actual I/O.
>
> My questions are: 1) Is the second approach necessarily more efficient than the MPI-IO used in HDF5? and 2) Is there any plan to support parallel I/O for HDF5 files in ParaView?
> _______________________________________________
> Powered by www.kitware.com
>
> Visit other Kitware open-source projects at http://www.kitware.com/opensource/opensource.html
>
> Please keep messages on-topic and check the ParaView Wiki at: http://paraview.org/Wiki/ParaView
>
> Follow this link to subscribe/unsubscribe:
> http://www.paraview.org/mailman/listinfo/paraview
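Strategy 2 in the original message (nested folders plus a smaller I/O group) can be sketched in plain Python, with no MPI calls. The folder and file naming, the group size, and all function names are hypothetical. The (color, key) pair is the kind of input a communicator split such as MPI_Comm_split takes; the split itself is not shown.

```python
def piece_path(root, step, piece):
    """Path of one .vtu piece, with one sub-folder per time step (assumed naming)."""
    return f"{root}/step_{step:06d}/piece_{piece:06d}.vtu"

def io_group(rank, ranks_per_writer):
    """Return (color, key) for a communicator split: ranks sharing a color
    form one I/O group, and the rank with key 0 is that group's writer."""
    return rank // ranks_per_writer, rank % ranks_per_writer

def writer_ranks(n_ranks, ranks_per_writer):
    """Global ranks that actually touch the file system."""
    return [r for r in range(n_ranks) if io_group(r, ranks_per_writer)[1] == 0]

n_ranks, ranks_per_writer = 10, 4    # assumed sizes for illustration
for r in writer_ranks(n_ranks, ranks_per_writer):
    color, _ = io_group(r, ranks_per_writer)
    # One piece per I/O group, so the file count per step is the number of groups.
    print(r, piece_path("vis", 12, color))
```

With this layout the number of files per time step equals the number of I/O groups, not the number of ranks, and each step's pieces sit in their own folder.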
