[Paraview] Parallel file formats (again)... (UNCLASSIFIED)

Jerry Clarke clarke at arl.army.mil
Fri Jun 13 17:21:32 EDT 2008


Renato,
	There are lots of ways to handle this in parallel, each with pros and 
cons. The simplest approach is the one you described: writing
one HDF5 file per node. You don't have to write out one xml file for
each node; you can just write out one xml file that pulls them all
together. (But I typically write them out anyway, particularly during
debugging, so I can easily look at one node's data in ParaView.)

The trick is to use the "Collection" grid type. Xdmf Grids can
be "Uniform" (a homogeneous single grid), "Tree" (a hierarchical group), or
"Collection" (an array of Uniform grids all with the same Attributes).

The end of this Wiki page:
http://www.xdmf.org/index.php/Write_from_Fortran

describes using a collection for time:
--------------------------------------
<?xml version="1.0" ?>
<!DOCTYPE Xdmf SYSTEM "Xdmf.dtd" []>
<Xdmf xmlns:xi="http://www.w3.org/2001/XInclude" Version="2.0">
        <Domain>
            <Grid GridType="Collection" CollectionType="Temporal">
              <xi:include href="Demo_00001.xmf" xpointer="xpointer(//Xdmf/Domain/Grid)" />
              <xi:include href="Demo_00002.xmf" xpointer="xpointer(//Xdmf/Domain/Grid)" />
              <xi:include href="Demo_00003.xmf" xpointer="xpointer(//Xdmf/Domain/Grid)" />
              <xi:include href="Demo_00004.xmf" xpointer="xpointer(//Xdmf/Domain/Grid)" />
              <xi:include href="Demo_00005.xmf" xpointer="xpointer(//Xdmf/Domain/Grid)" />
              <xi:include href="Demo_00006.xmf" xpointer="xpointer(//Xdmf/Domain/Grid)" />
              <xi:include href="Demo_00007.xmf" xpointer="xpointer(//Xdmf/Domain/Grid)" />
              <xi:include href="Demo_00008.xmf" xpointer="xpointer(//Xdmf/Domain/Grid)" />
              <xi:include href="Demo_00009.xmf" xpointer="xpointer(//Xdmf/Domain/Grid)" />
              <xi:include href="Demo_00010.xmf" xpointer="xpointer(//Xdmf/Domain/Grid)" />
           </Grid>
        </Domain>
</Xdmf>
--------------------------------

All you have to do is change CollectionType="Temporal" to
CollectionType="Spatial" and it should work; ParaView
will pull in all 10 grids (one written from each node)
and assemble them into one grid. If you want to eliminate
the per-node xml files, write that portion of the xml
(from <Grid..> to </Grid>) directly into the collection xml
file, replacing the <xi:include ... /> elements.
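
As a rough sketch of the spatial case (not the XDMF library's API, just
plain text output), the collection file could be generated like this.
The function name spatial_collection is made up for illustration, and the
per-node file names follow the Demo_00001.NN.xmf pattern from Renato's
message; adjust both to whatever your code actually writes.

```python
from xml.sax.saxutils import quoteattr

# XPointer used in the Wiki example to pull the Grid out of each file.
XPOINTER = "xpointer(//Xdmf/Domain/Grid)"


def spatial_collection(node_files):
    """Return the text of an Xdmf file that gathers per-node grids.

    node_files: names of the per-node .xmf files (hypothetical names,
    one file per process).
    """
    lines = [
        '<?xml version="1.0" ?>',
        '<!DOCTYPE Xdmf SYSTEM "Xdmf.dtd" []>',
        '<Xdmf xmlns:xi="http://www.w3.org/2001/XInclude" Version="2.0">',
        "  <Domain>",
        '    <Grid GridType="Collection" CollectionType="Spatial">',
    ]
    for name in node_files:
        # One include per node; quoteattr adds the quotes and escapes.
        lines.append(
            "      <xi:include href=%s xpointer=%s />"
            % (quoteattr(name), quoteattr(XPOINTER))
        )
    lines += ["    </Grid>", "  </Domain>", "</Xdmf>"]
    return "\n".join(lines) + "\n"


# Two processes, first iteration (names assumed from the thread).
node_files = ["Demo_00001.%02d.xmf" % rank for rank in range(2)]
xml_text = spatial_collection(node_files)
print(xml_text)
```

Writing xml_text to a file such as Demo_00001.xmf (name assumed) gives
ParaView a single file to open for that iteration.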


If you want this to work over time, you make a
collection of collections. The outer one is
CollectionType="Temporal" and contains one collection
of CollectionType="Spatial" per time step, each of which
looks like the spatial version of the above example.
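
A minimal sketch of such a nested file, again as plain text output rather
than through the XDMF library: an outer temporal collection holding one
spatial collection per iteration. The function name temporal_of_spatial
and the Demo_%05d.%02d.xmf naming are assumptions taken from the file
names in Renato's message, not something the Wiki prescribes.

```python
def temporal_of_spatial(n_steps, n_ranks, pattern="Demo_%05d.%02d.xmf"):
    """Return Xdmf text: outer Temporal collection of Spatial collections.

    pattern is a hypothetical naming scheme (iteration, then process rank).
    """
    include = '        <xi:include href="%s" xpointer="xpointer(//Xdmf/Domain/Grid)" />'
    lines = [
        '<?xml version="1.0" ?>',
        '<!DOCTYPE Xdmf SYSTEM "Xdmf.dtd" []>',
        '<Xdmf xmlns:xi="http://www.w3.org/2001/XInclude" Version="2.0">',
        "  <Domain>",
        '    <Grid GridType="Collection" CollectionType="Temporal">',
    ]
    for step in range(1, n_steps + 1):
        # One spatial collection per iteration ...
        lines.append('      <Grid GridType="Collection" CollectionType="Spatial">')
        for rank in range(n_ranks):
            # ... holding the grid written by each process.
            lines.append(include % (pattern % (step, rank)))
        lines.append("      </Grid>")
    lines += ["    </Grid>", "  </Domain>", "</Xdmf>"]
    return "\n".join(lines) + "\n"


# Two iterations on two processes, as in the quoted example below.
nested_xml = temporal_of_spatial(n_steps=2, n_ranks=2)
print(nested_xml)
```

With 10 iterations and 2 processes this replaces the 20 separate files
ParaView would otherwise have to be pointed at with one top-level file;
the per-process files are still read through the includes.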

Now, having said all that, you could also try parallel
HDF5 and have a single HDF5 file. I've done that in the
past but if I remember correctly some of the operations
must be done collectively. HDF5 is very powerful; there
are many ways to do things.

Hope this helps,
Jerry


Renato N. Elias wrote:
> Hi Jerry,
> 
> As I described in my last email, I managed to compile and link the
> Fortran example supplied with the XDMF library (in fact, I changed the
> example a little, writing a Fortran interface to call the
> extern "C" routine name explicitly). It gave me 10 XDMF and 10 HDF5
> files, each XDMF pointing to a corresponding HDF5. Ok, everything easy
> and running... now I'm guessing, from a naive approach, that I could
> just follow the same example for a parallel run where each process,
> having its own piece of the domain, would write its data in XDMF/HDF5
> file pairs. Thus, for 2 processes, I'd have something like:
> 
> Iteration 1
> Demo_00001.00.xmf  --> Demo_00001.00.h5
> Demo_00001.01.xmf --> Demo_00001.01.h5
> 
> Iteration 2
> Demo_00002.00.xmf  --> Demo_00002.00.h5
> Demo_00002.01.xmf --> Demo_00002.01.h5
> ...
> and so on
> 
> Maybe all 20 XDMF files in this example could be replaced by just one
> containing the information for all iterations and processes, but I
> haven't found that explained on XDMF's wiki.
> 
> Do you have a simple example of a parallel XDMF file? Moreover, is
> ParaView able to load transient parallel files in XDMF?
> 
> thanks
> 
> Renato.
> 
> 
> Clarke, Jerry (Civ, ARL/CISD) wrote:
>  > Classification:  UNCLASSIFIED
>  > Caveats: NONE
>  >
>  > Renato,
>  >
>  > www.arl.hpc.mil/ice doesn't exist anymore.
>  >
>  > Let me know what you are trying to do and I can provide an example.
>  >
>  > If you're trying to work in parallel, you probably want to use
>  > <Grid GridType="Collection" CollectionType="Spatial" ...
>  >       <Grid Name="Grid from node 0"
>  >       <Grid Name="Grid from node 1"
>  >       etc.
>  > </Grid>
>  >
>  > For the non 0 started arrays, there is an Offset="1" option in the XML
>  >
>  > As for the row major / column major arrays, I link C++ to Fortran, but
>  > we have been talking about putting a transpose method in the DataItem
>  > object.
>  >
>  >
>  > Jerry Clarke
>  >
>  > -----Original Message-----
>  > From: paraview-bounces at paraview.org
>  > [mailto:paraview-bounces at paraview.org] On Behalf Of Renato N. Elias
>  > Sent: Friday, June 13, 2008 1:55 PM
>  > To: Dominik Szczerba
>  > Cc: paraview at paraview.org
>  > Subject: Re: [Paraview] Parallel file formats (again)...
>  >
>  >
>  > It seems that XDMF is the only way to go if we'd like to get parallel
>  > data into ParaView using HDF5. No problem, since XDMF can be seen as an
>  > HDF5 extension. In fact, I thought about doing exactly what you cited --
>  > use the HDF5 API and write simple XML/XDMF files using Fortran, no
>  > matter if my data is heavy or light.
>  >
>  > The problem with XDMF is the lack of information. The first link returned
>  > by Google is always down (www.arl.hpc.mil/ice/) and the "official" wiki
>  > site doesn't offer much. I already got somewhere making Fortran
>  > talk to XDMF (we can also write interfaces on the Fortran side to
>  > talk to C-like routine names) but now I'd like to go a bit further
>  > and use parallelism, and there are no examples covering the subject
>  > using Fortran. My only option is debugging the C++ examples and trying
>  > to map them over.
>  >
>  > Regarding row-major order and non 0 started arrays I can't say anything.
>  >
>  > I can only say that, for Fortran programmers, it's getting a bit harder
>  > to work without having to deal with C++ and all that OOP stuff. In this
>  > sense, I love Metis: so powerful, so easy, so fast, so simple and
>  > everything written in ANSI C. Just minor effort to get it working with
>  > Fortran. As we say in Brazil, sometimes people like to kill cockroaches
>  > using bazookas instead of flip-flops... for writing files, *my guess* is
>  > that straight C would do the job nicely.
>  >
>  > Dominik, the problem with Fortran is that everybody associates it with
>  > 77 (just that old programming language). Maybe they should change the
>  > name of the language from Fortran 2003 to F++ ;oP
>  >
>  > Renato.
>  >
>  > Dominik Szczerba wrote:
>  >  
>  >> And how would he handle hard-coded row-major ordering in XDMF?
>  >> -- Dominik
>  >>
>  >> Chris Kees wrote:
>  >>    
>  >>> You might want to reconsider XDMF or something based on it. I'm not
>  >>> sure that XDMF is significantly harder to implement in fortran than
>  >>> straight HDF5. It's just a matter of doing some additional text i/o
>  >>> on a relatively simple XML file. XDMF splits the data (with some
>  >>> redundancy) into light/meta data stored as a simple XML (ascii) file
>  >>> and an HDF5 archive of the "heavy" data. You can read and write the
>  >>> XML file directly from fortran without using the XDMF library and
>  >>> then use the HDF5 fortran API directly to write the heavy data. You
>  >>> have the option of storing the heavy data in the XML file as text
>  >>> when HDF5 isn't available (or when debugging/running on small
>  >>> data). To me it looks like the posts you cite are pointing in this
>  >>> direction though they were unhappy with some aspects of XDMF. It's
>  >>> not clear to me whether it's the XDMF xml format, the documentation
>  >>> of that format, or the C API that needs work in order to make it more
>  >>> useful.
>  >>> Also, it sounds like you've already decided against a mixed language
>  >>> approach, but the book by H. P. Langtangen "Python Scripting for
>  >>> Computational Science" advocates a fortran/python pairing to deal
>  >>> with some of your general concerns.
>  >>> Chris
>  >>> *  *
>  >>> On Jun 13, 2008, at 7:58 AM, Renato N. Elias wrote:
>  >>>
>  >>>      
>  >>>> Can anyone shed some light on the support status of
>  >>>> parallel file formats in ParaView?
>  >>>>
>  >>>> In my lab most of the students still work with Fortran. It seems
>  >>>> that "the universe nowadays only speaks C++ (and Python for
>  >>>> scripting)", which forces us to do an extensive evaluation to find a
>  >>>> good and well supported parallel file format to invest in before
>  >>>> struggling with all those mixed-language interface/wrapping
>  >>>> annoyances (not everybody working with programs is a programmer;
>  >>>> there are still some engineers like civil, mechanical, chemical,
>  >>>> etc... doing science...).
>  >>>> I could say that my concerns about choosing a file format to
>  >>>> stick with are:
>  >>>>
>  >>>> -- Ease of installation and use (in this sense, Ensight is
>  >>>> wonderful since we don't need extra libraries. It's insane when we
>  >>>> need to compile 50 MB of libraries to link with a 2 MB program that
>  >>>> uses just one routine of such a library);
>  >>>> -- Ease of interfacing (most libraries nowadays are
>  >>>> written in C++ for C++ programmers, which discourages their use by C
>  >>>> and Fortran programs. Ok, we can always spend some time interfacing
>  >>>> with them, but a library should offer more functionality and
>  >>>> flexibility than annoyances);
>  >>>> -- Portability.
>  >>>>
>  >>>> Some time ago there were some interesting posts from Jean Favre and
>  >>>> Dominic about this, which give us some overview of the subject.
>  >>>>
>  >>>> http://www.paraview.org/pipermail/paraview/2008-May/008070.html
>  >>>> http://www.paraview.org/pipermail/paraview/2008-May/008071.html
>  >>>>
>  >>>> My 2 cents for the discussion, *from a Fortran perspective*, is:
>  >>>>
>  >>>> 1). ENSIGHT:
>  >>>> 1.1. Quite simple to implement and use (no need for extra libraries
>  >>>> and all that stuff. Just a few Fortran statements do the job);
>  >>>> 1.2. Implicit support for transient data and parallelism;
>  >>>> 1.3. Depending on the number of processes we might have a huge
>  >>>> number of small/medium files since each point and cell data variable
>  >>>> is stored in one file (sometimes it can be a serious problem);
>  >>>> 1.4. Not compressed (too bad);
>  >>>> 1.5. Not so well supported *as a parallel format* by ParaView yet.
>  >>>> After the change to deal with multigroup datasets (after PV 2.2.1)
>  >>>> some functionality was lost pending reimplementation;
>  >>>> 1.6. Supported by ParaView, Visit and Ensight (of course)
>  >>>>
>  >>>> 2). XML/VTK:
>  >>>> 2.1. Almost impossible for a Fortran user to implement, so we're
>  >>>> forced to interface with VTK in order to write something;
>  >>>> 2.2. Time series support has been introduced in some sense ;o)
>  >>>> 2.3. It's a bit complicated to understand. Ok, it's XML and we should
>  >>>> use it (and believe in it ;o) ) through some library, so it's not
>  >>>> meant for "hand-implementation";
>  >>>> 2.4. Encoding/compression is supported (which is really good);
>  >>>> 2.5. It should be the best-supported parallel file format in
>  >>>> ParaView (after EXODUS, maybe);
>  >>>> 2.6. Only supported by VTK-based software (ParaView, Visit, MayaVi)
>  >>>>
>  >>>> 3). XDMF/HDF5:
>  >>>> 3.1. Same as 2.1, 2.2 and 2.3
>  >>>> 3.2. The website describing the library is a bit down lately...
>  >>>> 3.3. HDF5 seems a very promising file format. Its developers show
>  >>>> some concern for its use from other scientific languages, besides it
>  >>>> being flexible, compressed, cross-platform, etc.
>  >>>> 3.4. To my knowledge, XDMF is also supported by Ensight, ParaView and
>  >>>> Visit --> not sure how good that support is.
>  >>>>
>  >>>> 4). EXODUS II:
>  >>>> 4.1. Same as 2.1 --> I already tried more than once to find
>  >>>> something about the Exodus format. There's good documentation on the
>  >>>> SANDIA/SEACAS page but the library is not open source (it's a
>  >>>> license-based distribution), which makes it a bit complicated to
>  >>>> adopt;
>  >>>> 4.2. Nothing to say about time and compression support since
>  >>>> I never used it;
>  >>>> 4.3. It must be well supported by PV since it's a Sandia format;
>  >>>>
>  >>>> regards
>  >>>>
>  >>>> Renato.
>  >>>>
>  >>>>
>  >>>>
>  >>>> _______________________________________________
>  >>>> ParaView mailing list
>  >>>> ParaView at paraview.org
>  >>>> http://www.paraview.org/mailman/listinfo/paraview
>  >>>>        
> 


