[Paraview] [EXTERNAL] Re: Need advice with parallel file format

Mohammad Mirzadeh mirzadeh at gmail.com
Mon May 5 14:18:39 EDT 2014


Well, I was under the impression that only the .pXXX formats (.pvtu, .pvtr,
etc.) are read in parallel. Is this not the case?


On Mon, May 5, 2014 at 10:57 AM, David E DeMarle
<dave.demarle at kitware.com> wrote:

> We typically access HDF5 content through Xdmf, or occasionally through
> Exodus (via a NetCDF-4-on-HDF5 path).
> In either case it is a bug if one rank reads the whole file.
>
> If you are going through Xdmf, please send me or the list the XML and I
> might be able to suggest improvements to better coerce ParaView into
> loading it in parallel.
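
For reference, one common way to get the Xdmf route to scale is to describe
the heavy HDF5 data as a spatial collection of per-block grids, so the reader
has independent pieces it can hand out to different ranks. A rough Python
sketch of a script that writes such an index follows; the file name fields.h5,
the /blockN/pressure dataset paths, and the uniform-grid block layout are
assumptions made for illustration, not anything from this thread.

    # Sketch: generate an XDMF index whose heavy data lives in an HDF5 file,
    # written as a spatial collection of per-block grids so a parallel reader
    # has independent pieces to distribute. All names/sizes are placeholders.
    NBLOCKS = 8                # number of blocks assumed to exist in fields.h5
    NX = NY = NZ = 64          # nodes per block along each axis (assumed)

    blocks = []
    for b in range(NBLOCKS):
        origin = f"{b * (NZ - 1) * 0.01} 0.0 0.0"   # stack blocks along one axis
        blocks.append(f"""
        <Grid Name="block{b}" GridType="Uniform">
          <Topology TopologyType="3DCoRectMesh" Dimensions="{NZ} {NY} {NX}"/>
          <Geometry GeometryType="ORIGIN_DXDYDZ">
            <DataItem Dimensions="3" Format="XML">{origin}</DataItem>
            <DataItem Dimensions="3" Format="XML">0.01 0.01 0.01</DataItem>
          </Geometry>
          <Attribute Name="pressure" Center="Node">
            <DataItem Dimensions="{NZ} {NY} {NX}" Format="HDF">
              fields.h5:/block{b}/pressure
            </DataItem>
          </Attribute>
        </Grid>""")

    # Light-data index only; the arrays themselves stay in fields.h5.
    with open("fields.xmf", "w") as f:
        f.write(f"""<?xml version="1.0"?>
    <Xdmf Version="2.0">
      <Domain>
        <Grid Name="mesh" GridType="Collection" CollectionType="Spatial">{''.join(blocks)}
        </Grid>
      </Domain>
    </Xdmf>
    """)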
>
>
>
> David E DeMarle
> Kitware, Inc.
> R&D Engineer
> 21 Corporate Drive
> Clifton Park, NY 12065-8662
> Phone: 518-881-4909
>
>
> On Mon, May 5, 2014 at 1:50 PM, Mohammad Mirzadeh <mirzadeh at gmail.com> wrote:
>
>> Well, the issue with HDF5 as a vis format is that ParaView does not seem
>> to be able to load it in parallel. Instead, it seems that ParaView loads
>> the whole file on rank 0 and then tries to broadcast it to the other
>> processors. This would significantly limit the size of the vis file ...
>>
>>
>> On Sat, May 3, 2014 at 7:06 AM, Moreland, Kenneth <kmorel at sandia.gov> wrote:
>>
>>> Reason 1 applies just as much to Vis as to restart. More so, since you
>>> usually do Vis on a different number of processors than the sim.
>>>
>>> You may want to rethink reason 2. It may not seem like much now, but
>>> restarts can take a significant proportion of the run time. And,
>>> ironically, the longer they take, the more you have to run them (there
>>> is a whole theory behind that).
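
(The "whole theory" alluded to here is presumably optimal checkpoint
scheduling. Young's classic approximation, for example, puts the optimal
interval between checkpoints at roughly

    T_opt ~ sqrt(2 * C * M)

where C is the time to write one checkpoint and M is the mean time between
failures, so the checkpoint cost feeds directly into how often you should
checkpoint and how much work you expect to redo after a failure.)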
>>>
>>> At any rate, lots of smart people have worked (and continue to work)
>>> on making I/O libraries like HDF5 fast. In general, I would expect HDF5
>>> to be much faster than the VTK formats.
>>>
>>>
>>> -Ken
>>>
>>>  Sent from my iPad so blame autocorrect.
>>>
>>> On May 2, 2014, at 10:35 PM, "Mohammad Mirzadeh" <mirzadeh at gmail.com>
>>> wrote:
>>>
>>> I don't have any specific plan for that, but the rationale for using
>>> HDF5 for restart is twofold:
>>>
>>> 1) The restart file could be read later by a different set of
>>> processors, and should preferably include useful meta-information about
>>> the run (date/time, SHA1 of the git commit, run parameters, etc.).
>>> 2) The restart file is assumed to be written less frequently than the
>>> vis files, and thus any performance loss should not be a big issue
>>> (hopefully).
>>>
>>> Also, parallel loading of the vis file is necessary, as ParaView seems to
>>> default to loading everything on rank 0, which would severely limit the
>>> size of the vis file (I'd be happy to be proven wrong on this one). All
>>> that said, I'm willing to move away from HDF5 if that proves to be too
>>> costly for restart files as well. It just seems to me, after two days of
>>> searching online, that working with parallel HDF5 (and MPI-IO in general)
>>> is tricky and prone to performance loss at large numbers of processors.
>>> (Again, I'd be happy to learn from others' experience here.)
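
On the restart side, the basic single-shared-file pattern with parallel HDF5
is fairly compact. Below is a minimal sketch using h5py and mpi4py; it assumes
an h5py build with parallel HDF5 enabled, and the dataset name, sizes, and
metadata keys are placeholders rather than anything from this thread. Each
rank writes only its own slab, and a later run with a different rank count
simply recomputes its slab bounds before reading.

    # Minimal sketch of a one-file parallel restart write with h5py + mpi4py.
    import numpy as np
    import h5py
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    n_global = 1_000_000                            # global unknowns (assumed)
    counts = [n_global // size + (r < n_global % size) for r in range(size)]
    offset = sum(counts[:rank])                     # this rank's slab start
    local = np.full(counts[rank], float(rank))      # stand-in solution data

    with h5py.File("restart.h5", "w", driver="mpio", comm=comm) as f:
        # Metadata: every rank sets the same values, since metadata
        # operations are collective in parallel HDF5.
        f.attrs["git_sha1"] = np.bytes_("0123abcd")   # placeholder value
        f.attrs["run_date"] = np.bytes_("2014-05-05")
        f.attrs["dt"] = 1.0e-3
        # Dataset creation is also collective: all ranks, identical arguments.
        dset = f.create_dataset("solution", shape=(n_global,), dtype="f8")
        # Each rank writes only its own contiguous slab.
        dset[offset:offset + counts[rank]] = local

    # On restart, a run with a different number of ranks recomputes its own
    # slab bounds from n_global and reads dset[offset:offset + count].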
>>>
>>>
>>> On Fri, May 2, 2014 at 7:12 PM, Moreland, Kenneth <kmorel at sandia.gov> wrote:
>>>
>>>> What are you doing for your restart files? You said those are HDF5 and
>>>> they must be at least as large as anything you output for Vis. Presumably
>>>> you got that working pretty well (or are committed to getting it to work
>>>> well). Why not write the Vis output similarly?
>>>>
>>>> -Ken
>>>>
>>>> Sent from my iPad so blame autocorrect.
>>>>
>>>> > On May 2, 2014, at 6:50 PM, "Mohammad Mirzadeh" <mirzadeh at gmail.com>
>>>> wrote:
>>>> >
>>>> > Hi, I am at a critical point in deciding on an I/O format for my
>>>> > application. So far my conclusion is to use parallel HDF5 for restart
>>>> > files, as they are quite flexible and portable across systems.
>>>> >
>>>> > When it comes to visualization, however, I'm not quite sure. Up until
>>>> > now I've been using pvtu along with vtu files, and although they
>>>> > generally work fine, one easily gets into trouble when running big
>>>> > simulations on large numbers of processors, as the number of files can
>>>> > easily get out of control and even the simplest utility commands
>>>> > (e.g. ls) take minutes to finish!
>>>> >
>>>> > After much thought, I've come to the point of deciding between two
>>>> > strategies:
>>>> >
>>>> > 1) Go with a single parallel HDF5 file that includes data for all
>>>> > time steps. This makes it all nice and portable, except there are two
>>>> > issues: i) it looks like MPI-IO might not be as efficient as separate
>>>> > POSIX I/O, especially on large numbers of processors; ii) ParaView
>>>> > does not seem to be able to read HDF5 files in parallel.
>>>> >
>>>> > 2) Go with the same pvtu+vtu strategy, but take precautions to avoid
>>>> > the file explosion. I can think of two approaches here: i) use nested
>>>> > folders to separate the vtu files from the pvtu, and also one folder
>>>> > per time step; ii) create an I/O group communicator with far fewer
>>>> > processors that do the actual I/O.
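
The I/O-group idea in 2) ii) boils down to a communicator split plus a gather
within each group, which a short mpi4py sketch can show. The group size,
folder layout, and the .npy stand-in for the actual .vtu serialization below
are purely illustrative; in practice each group root would serialize a .vtu
piece and world rank 0 would write the matching .pvtu index.

    # Sketch of the "I/O group" idea: only one writer per group touches the
    # filesystem, so a run on N ranks produces N/GROUP files per step.
    import os
    import numpy as np
    from mpi4py import MPI

    GROUP = 16                                    # ranks per I/O group (assumed)
    world = MPI.COMM_WORLD
    rank = world.Get_rank()

    io_comm = world.Split(color=rank // GROUP, key=rank)
    is_writer = (io_comm.Get_rank() == 0)

    step = 42                                     # current time step (example)
    outdir = os.path.join("vis", f"step_{step:06d}")   # nested per-step folder
    if rank == 0:
        os.makedirs(outdir, exist_ok=True)
    world.Barrier()                               # folder exists before writes

    local = np.random.rand(1000, 3)               # stand-in per-rank point data
    chunks = io_comm.gather(local, root=0)        # aggregate within the group
    if is_writer:
        piece = np.concatenate(chunks)
        np.save(os.path.join(outdir, f"piece_{rank // GROUP:04d}.npy"), piece)

    # World rank 0 writes a small index of the pieces (the .pvtu analogue).
    n_groups = (world.Get_size() + GROUP - 1) // GROUP
    if rank == 0:
        with open(os.path.join(outdir, "index.txt"), "w") as f:
            for g in range(n_groups):
                f.write(f"piece_{g:04d}.npy\n")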
>>>> >
>>>> > My questions are: 1) Is the second approach necessarily more efficient
>>>> > than the MPI-IO used by HDF5? 2) Is there any plan to support parallel
>>>> > I/O for HDF5 files in ParaView?
>>>>
>>>
>>>
>>
>>
>>
>

