[Paraview] Parallel file formats (again)... (UNCLASSIFIED)
Kent Eschenberg
eschenbe at psc.edu
Fri Jun 13 16:19:08 EDT 2008
You are right: write 1 XML file that references all HDF5 files. For example, process 0 writes 1 XML file with 4096 grids, each describing the HDF5 file from one process.
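
As a rough sketch of what that single XML file could look like, the Python below builds one spatial collection with one grid per process, each pointing at that process's HDF5 file. The file names follow the Demo_NNNNN.PP.h5 pattern from the message quoted below. Everything else is an assumption made for illustration: the function name build_spatial_collection, the HDF5 dataset paths /coordinates and /connectivity, the tetrahedral topology, and the Version="2.0" attribute. Adjust them to whatever your writer actually stores.

```python
import xml.etree.ElementTree as ET


def build_spatial_collection(step, pieces):
    """Build one XDMF tree for a single iteration.

    step   -- iteration number used in the HDF5 file names
    pieces -- list of (n_nodes, n_cells) tuples, one per process
              (hypothetical sizes; each process owns one piece)
    """
    xdmf = ET.Element("Xdmf", Version="2.0")
    domain = ET.SubElement(xdmf, "Domain")
    # Outer grid: a spatial collection that holds one sub-grid per process.
    collection = ET.SubElement(
        domain, "Grid",
        Name=f"Iteration {step}",
        GridType="Collection",
        CollectionType="Spatial",
    )
    for rank, (n_nodes, n_cells) in enumerate(pieces):
        h5_name = f"Demo_{step:05d}.{rank:02d}.h5"
        grid = ET.SubElement(
            collection, "Grid",
            Name=f"Grid from node {rank}", GridType="Uniform",
        )
        # Assumed layout: tetrahedra stored as an (n_cells x 4) integer array.
        topology = ET.SubElement(
            grid, "Topology",
            TopologyType="Tetrahedron", NumberOfElements=str(n_cells),
        )
        conn = ET.SubElement(
            topology, "DataItem",
            Format="HDF", DataType="Int", Dimensions=f"{n_cells} 4",
        )
        conn.text = f"{h5_name}:/connectivity"  # hypothetical dataset path
        # Assumed layout: coordinates stored as an (n_nodes x 3) float array.
        geometry = ET.SubElement(grid, "Geometry", GeometryType="XYZ")
        coords = ET.SubElement(
            geometry, "DataItem",
            Format="HDF", DataType="Float", Precision="8",
            Dimensions=f"{n_nodes} 3",
        )
        coords.text = f"{h5_name}:/coordinates"  # hypothetical dataset path
    return ET.ElementTree(xdmf)


if __name__ == "__main__":
    # Two processes, iteration 1; the sizes are made-up numbers.
    tree = build_spatial_collection(1, [(100, 300), (120, 350)])
    print(ET.tostring(tree.getroot(), encoding="unicode"))
```

Only one process (say process 0) would run this, after gathering the per-process sizes; the heavy data still goes into the HDF5 files written by each process. The same text could just as well be written with plain Fortran WRITE statements, since the XML is only light metadata.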
Kent
Pittsburgh Supercomputing Center
Renato N. Elias wrote:
> Hi Jerry,
>
> As I described in my last email, I managed to compile and link the
> Fortran example supplied with the XDMF library (in fact, I changed the
> example a little bit writing a Fortran interface explicitly to call the
> extern "C" routine name). It gave me 10 XDMF and 10 HDF5 files. Each
> XDMF pointing to a corresponding HDF5. Ok, everything easy and
> running... now, I'm guessing, from a naive approach, that I could just
> follow the same example for a parallel run where each process,
> having its own piece of the domain, would write its data in pairs of
> XDMF/HDF5 files. Thus, for 2 processes, I'd have something like:
>
> Iteration 1
> Demo_00001.00.xmf --> Demo_00001.00.h5
> Demo_00001.01.xmf --> Demo_00001.01.h5
>
> Iteration 2
> Demo_00002.00.xmf --> Demo_00002.00.h5
> Demo_00002.01.xmf --> Demo_00002.01.h5
> ...
> and so on
>
> Maybe, all 20 XDMF files in this example could be replaced by just one
> containing all iterations and processes information, but, I haven't
> found it explained on XDMF's wiki.
>
> Do you have a simple example of a parallel XDMF file? Moreover, is
> ParaView able to load transient parallel files in XDMF?
>
> thanks
>
> Renato.
>
>
> Clarke, Jerry (Civ, ARL/CISD) wrote:
>> Classification: UNCLASSIFIED Caveats: NONE
>>
>> Renato,
>>
>> www.arl.hpc.mil/ice doesn't exist anymore.
>>
>> Let me know what you are trying to do and I can provide an example.
>> If you're trying to work in parallel, you probably want to use
>> <Grid GridType="Collection" CollectionType="Spatial" ...
>> <Grid Name="Grid from node 0"
>> <Grid Name="Grid from node 1"
>> etc.
>> </Grid>
>>
>> For the non 0 started arrays, there is an Offset="1" option in the XML
>>
>> As for the row major / column major arrays, I link C++ to Fortran, but
>> we have been talking about putting a transpose method in the DataItem
>> object.
>>
>>
>> Jerry Clarke
>>
>> -----Original Message-----
>> From: paraview-bounces at paraview.org
>> [mailto:paraview-bounces at paraview.org] On Behalf Of Renato N. Elias
>> Sent: Friday, June 13, 2008 1:55 PM
>> To: Dominik Szczerba
>> Cc: paraview at paraview.org
>> Subject: Re: [Paraview] Parallel file formats (again)...
>>
>>
>> It seems that XDMF is the only way to go if we'd like to get parallel
>> data into ParaView using HDF5. No problem since XDMF can be seen as a
>> HDF5 extension. In fact, I thought about doing exactly what you cited --
>> use the HDF5 API and write simple XML/XDMF files using Fortran, no
>> matter if my data is heavy or light.
>>
>> The problem with XDMF is the lack of information. The first link pointed
>> to by Google is always down (www.arl.hpc.mil/ice/) and the "official" wiki
>> site doesn't offer much. I already managed to get Fortran to
>> talk to XDMF (we can also write interfaces on the Fortran side to
>> call C-like routine names) but, now, I'd like to go a bit further
>> and use parallelism, and there are no examples covering the subject in
>> Fortran. My only option is to debug the C++ examples and try to draw some
>> correlation.
>>
>> Regarding row-major order and non 0 started arrays I can't say anything.
>>
>> I'll only say that, for Fortran programmers, it's getting a bit harder to
>> work without having to deal with C++ and all that OOP stuff. In this
>> sense, I love Metis: so powerful, so easy, so fast, so simple and
>> everything written in ANSI C. Just minor effort to get it working with
>> Fortran. As we say in Brazil, sometimes people like to kill cockroaches
>> using bazookas instead of flip-flops... for writing files, *my guess* is
>> that straight C would do the job nicely
>>
>> Dominik, the problem with Fortran is that everybody associates it with
>> 77 (just that old programming language). Maybe, they should change the
>> name of the language from Fortran 2003 to F++ ;oP
>>
>> Renato.
>>
>> Dominik Szczerba wrote:
>>
>>> And how would he handle hard-coded row-major ordering in XDMF?
>>> -- Dominik
>>>
>>> Chris Kees wrote:
>>>
>>>> You might want to reconsider XDMF or something based on it. I'm not
>>>> sure that XDMF is significantly harder to implement in fortran than
>>>> straight HDF5. It's just a matter of doing some additional text i/o
>>>> on a relatively simple XML file. XDMF splits the data (with some
>>>> redundancy) into light/meta data stored as simple XML (ascii) file
>>>> and an HDF5 archive of the "heavy" data. You can read and write the
>>>> XML file directly from fortran without using the XDMF library and
>>>> then use the HDF5 fortran API directly to write the heavy data. You
>>>> have the option of storing the heavy data in the XML file as text
>>>> when HDF5 isn't available (or when debugging/running on small
>>>> data). To me it looks like the posts you cite are pointing in this
>>>> direction though they were unhappy with some aspects of XDMF. It's
>>>> not clear to me whether it's the XDMF xml format, the documentation
>>>> of that format, or the C API that needs work in order to make it more
>>>> useful.
>>>> Also, it sounds like you've already decided against a mixed language
>>>> approach, but the book by H. P. Langtangen "Python Scripting for
>>>> Computational Science" advocates a fortran/python pairing to deal
>>>> with some of your general concerns.
>>>> Chris
>>>> On Jun 13, 2008, at 7:58 AM, Renato N. Elias wrote:
>>>>
>>>>
>>>>> Can anyone shed some light on the support status of
>>>>> parallel file formats in ParaView?
>>>>>
>>>>> In my lab most of the students still work with Fortran. It seems
>>>>> that "the universe nowadays only speaks C++ (and Python for
>>>>> scripting)", which forces us to do an extensive evaluation to find a good
>>>>> and well supported parallel file format to invest in before struggling
>>>>> with all those mixed-language interface/wrapping annoyances (not
>>>>> everybody working with programs is a programmer; there are still some
>>>>> engineers like civil, mechanical, chemical, etc... doing
>>>>> science...).
>>>>>
>>>>> I could say that my concerns about choosing a file format to
>>>>> stick with are:
>>>>>
>>>>> -- Ease of installation and use (in this sense, Ensight is
>>>>> wonderful since we don't need extra libraries. It's insane when we
>>>>> need to compile 50 MB of libraries to link with a 2 MB program that
>>>>> uses just one routine of such a library);
>>>>> -- Ease of interfacing (most libraries nowadays are
>>>>> written in C++ for C++ programmers, which discourages their use by C and
>>>>> Fortran programs. Ok, we can always spend some time interfacing
>>>>> with them, but a library should offer more functionality and flexibility
>>>>> than annoyances);
>>>>> -- Portability.
>>>>>
>>>>> Some time ago there were some interesting posts from Jean Favre and
>>>>> Dominic about this, which give us an overview of the subject.
>>>>>
>>>>> http://www.paraview.org/pipermail/paraview/2008-May/008070.html
>>>>> http://www.paraview.org/pipermail/paraview/2008-May/008071.html
>>>>>
>>>>> My 2 cents for the discussion, *from a Fortran perspective*, is:
>>>>>
>>>>> 1). ENSIGHT:
>>>>> 1.1. Quite simple to implement and use (no need for extra libraries
>>>>> and all that stuff. Just a few Fortran statements do the job);
>>>>> 1.2. Implicit support for transient data and parallelism;
>>>>> 1.3. Depending on the number of processes we might have a huge number of
>>>>> small/medium files since each point and cell data variable is stored
>>>>> in one file (sometimes it can be a serious problem);
>>>>> 1.4. Not compressed (too bad);
>>>>> 1.5. Not so well supported *as a parallel format* by ParaView yet.
>>>>> After the change to deal (after PV 2.2.1) with multigroup datasets
>>>>> some functionalities were lost until reimplementation.
>>>>> 1.6. Supported by ParaView, Visit and Ensight (of course)
>>>>>
>>>>> 2). XML/VTK:
>>>>> 2.1. Almost impossible for a Fortran user to implement, so, we're
>>>>> forced to interface with VTK in order to write something;
>>>>> 2.2. Time series support has been introduced in some sense ;o)
>>>>> 2.3. It's a bit complicated to understand. Ok, it's XML and we should
>>>>> use it (and believe in it ;o) ) through some library, so, it's not
>>>>> meant for "hand-implementation";
>>>>> 2.4. Encoding/compression is supported (which is really good)
>>>>> 2.5. It should be the best supported parallel file format
>>>>> in ParaView (after EXODUS, maybe)
>>>>> 2.6. Only supported by VTK based software (ParaView, Visit, MayaVi)
>>>>>
>>>>> 3). XDMF/HDF5:
>>>>> 3.1. Same as 2.1, 2.2 and 2.3
>>>>> 3.2. The website describing the library is a bit down lately...
>>>>> 3.3. HDF5 seems a very promising file format. Its developers show
>>>>> some concern for its use from other scientific languages,
>>>>> besides it being flexible, compressed, cross platform, etc...
>>>>> 3.4. From my knowledge, XDMF is supported by Ensight, ParaView and
>>>>> Visit also --> not sure how good that support is.
>>>>>
>>>>> 4). EXODUS II:
>>>>> 4.1. Same as 2.1 --> I already tried more than once to find
>>>>> something about the Exodus format. There's good documentation on the
>>>>> SANDIA/SEACAS page but the library is not open source (it's a
>>>>> license based distribution), which makes it a bit complicated to
>>>>> adopt;
>>>>> 4.2. Nothing to say about time and compression support since
>>>>> I never used it;
>>>>> 4.3. It must be well supported by PV since it's a
>>>>> Sandia format;
>>>>>
>>>>> regards
>>>>>
>>>>> Renato.