[Paraview] Parallel XDMF Reader and Distributed Parallel Render issues

Tue Aug 25 10:08:55 EDT 2009

Hi all:

    I have some questions about visualizing my datasets (the data is in HDF5 format and is organized with an XDMF metadata file). The dataset consists of 30 HDF5 files, all stored in one directory. I wrote an XDMF file (.xmf) to describe the raw data. Its content is as follows:

<?xml version="1.0" ?>
<!DOCTYPE Xdmf SYSTEM "Xdmf.dtd" []>
<Xdmf>
  <Domain>
    <Grid Name="BX0" GridType="Uniform">
      <Topology TopologyType="3DCoRectMesh" Dimensions="50 80 800"/>
      <Geometry GeometryType="ORIGIN_DXDYDZ">
        <DataItem Name="Origin" DataType="Float" Dimensions="3" Format="XML">
          0 0 0
        </DataItem>
        <DataItem Name="Spacing" DataType="Float" Dimensions="3" Format="XML">
          1 1 1
        </DataItem>
      </Geometry>
      <Attribute Name="Scalar" Type="Scalar" Center="Node">
        <DataItem Name="Points" Dimensions="50 80 800" Format="HDF">
          fields-10.000-0:/Fields/bx
        </DataItem>
      </Attribute>
    </Grid>
    <!-- ... the other 29 Grids are the same as the first Grid, except for the file name and the coordinate parameters -->
  </Domain>
</Xdmf>

My first question: should I use a "Data Collection" or another structure such as "Tree" or "Subset" to organize the Grids? Does the organization of the Grids affect the performance of vtkXdmfReader? I would also like to know whether vtkXdmfReader reads the raw HDF5 data in parallel when ParaView runs in parallel mode and, if it does, how that works in detail.
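
To make the first question concrete, here is a minimal Python sketch of the grouped layout I am asking about. It wraps the per-file Uniform grids in a single Grid with GridType="Collection" and CollectionType="Spatial". I do not know whether this is the layout vtkXdmfReader prefers; it is only one possible arrangement. The function name build_xdmf, the collection name "AllBlocks", and the HDF5 paths and origins in the usage lines are placeholders, not my real values.

```python
import xml.etree.ElementTree as ET


def build_xdmf(blocks, dims=(50, 80, 800), spacing=(1, 1, 1)):
    """Build an Xdmf tree with one Uniform grid per (hdf_path, origin) pair.

    All Uniform grids are placed inside one spatial Collection grid.
    This grouping is an assumption made for illustration only.
    """
    dims_text = " ".join(str(d) for d in dims)
    root = ET.Element("Xdmf")
    domain = ET.SubElement(root, "Domain")
    collection = ET.SubElement(
        domain, "Grid",
        Name="AllBlocks", GridType="Collection", CollectionType="Spatial",
    )
    for index, (hdf_path, origin) in enumerate(blocks):
        grid = ET.SubElement(collection, "Grid",
                             Name=f"BX{index}", GridType="Uniform")
        ET.SubElement(grid, "Topology",
                      TopologyType="3DCoRectMesh", Dimensions=dims_text)
        geometry = ET.SubElement(grid, "Geometry",
                                 GeometryType="ORIGIN_DXDYDZ")
        # Origin and spacing are written inline, as in the hand-written file.
        for name, values in (("Origin", origin), ("Spacing", spacing)):
            item = ET.SubElement(geometry, "DataItem", Name=name,
                                 DataType="Float", Dimensions="3",
                                 Format="XML")
            item.text = " ".join(str(v) for v in values)
        attribute = ET.SubElement(grid, "Attribute",
                                  Name="Scalar", Type="Scalar", Center="Node")
        data = ET.SubElement(attribute, "DataItem", Name="Points",
                             Dimensions=dims_text, Format="HDF")
        data.text = hdf_path  # "file:/path/inside/file" reference
    return root


# Placeholder blocks: neither the paths nor the origins are my real values.
example_blocks = [
    ("placeholder-block-a:/Fields/bx", (0, 0, 0)),
    ("placeholder-block-b:/Fields/bx", (0, 0, 799)),
]
xml_text = ET.tostring(build_xdmf(example_blocks), encoding="unicode")
```

The resulting string can be written to a .xmf file after adding the XML declaration and DOCTYPE lines shown above.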

I run ParaView on a cluster with 8 nodes. I start the server by typing "mpirun -np 8 ./pvserver" on the first node, which allocates one server process to each node. I then start the client on the first node with "./paraview" and connect to the server.

I have read The ParaView Guide, which says that when ParaView runs in client/server mode, the render server and the data server both run on the server side. On my cluster, however, it seems that only the data server is distributed across the 8 nodes and the render server does no work (the geometry is only collected; no sort-last rendering takes place). Here are some timer logs recording the whole process.

Local Process
Still Render,  0.022987 seconds
    Execute vtkMPIMoveData id: 1563,  0.019109 seconds
Still Render,  82.7161 seconds
    Execute vtkMPIMoveData id: 1563,  0.014251 seconds
    Execute vtkMPIMoveData id: 1843,  76.0409 seconds
Still Render,  2.77689 seconds
Still Render,  2.77948 seconds

Server, Process 0
Execute vtkXdmfReader id: 1303,  0.893424 seconds
Execute vtkPVGeometryFilter id: 1394,  0.024707 seconds
Execute vtkMPIMoveData id: 1563,  0.011097 seconds
Execute vtkContourFilter id: 1711,  5.41831 seconds
... # further vtkContourFilter entries
Execute vtkContourFilter id: 1711,  0.189248 seconds
Execute vtkPVGeometryFilter id: 1729,  2.45381 seconds
Execute vtkMPIMoveData id: 1843,  61.2669 seconds
    Dataserver gathering to 0,  39.0038 seconds
    Dataserver sending to client,  22.2626 seconds

Server, Process 1
Execute vtkXdmfReader id: 1303,  0.884488 seconds
Execute vtkPVGeometryFilter id: 1394,  0.026233 seconds
Execute vtkContourFilter id: 1711,  5.40355 seconds
... # further vtkContourFilter entries
Execute vtkContourFilter id: 1711,  0.173221 seconds
Execute vtkPVGeometryFilter id: 1729,  2.41831 seconds
Execute vtkMPIMoveData id: 1843,  3.25894 seconds
    Dataserver gathering to 0,  3.25879 seconds

Server, Process 2
...
...

Server, Process 7
...

# The other server processes show the same entries as Process 1.

In the timer log I see only entries related to the data server and none for the render server. It seems that all rendering is done by the local process, while the server processes are responsible only for processing the raw data, producing geometry, and transferring that geometry to the local process.
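
To show how I read the log, here is a small Python sketch that finds the most expensive entry in each process section. It assumes the log is saved as plain text with section headers of the form "Local Process" and "Server, Process N" and entries of the form "name,  T seconds", as in the excerpt above. The function name slowest_entries is my own. Nested entries are treated as independent lines, so a parent such as "Still Render" also counts the time of its children.

```python
import re

# "Execute vtkMPIMoveData id: 1843,  61.2669 seconds" -> name, seconds
ENTRY = re.compile(r"^\s*(?P<name>.+?),\s+(?P<seconds>\d+(?:\.\d+)?) seconds\s*$")
# Section headers as they appear in the excerpt above.
HEADER = re.compile(r"^(Local Process|Server, Process \d+)$")


def slowest_entries(log_text):
    """Map each process section to its (name, seconds) entry with the largest time."""
    slowest = {}
    section = None
    for line in log_text.splitlines():
        header = HEADER.match(line.strip())
        if header:
            section = header.group(1)
            continue
        entry = ENTRY.match(line)
        if entry is None or section is None:
            continue  # blank lines, elisions, and comments are skipped
        seconds = float(entry.group("seconds"))
        if section not in slowest or seconds > slowest[section][1]:
            slowest[section] = (entry.group("name").strip(), seconds)
    return slowest


# A few lines copied from the log above.
sample = """Local Process
Still Render,  82.7161 seconds
    Execute vtkMPIMoveData id: 1843,  76.0409 seconds
Server, Process 0
Execute vtkContourFilter id: 1711,  5.41831 seconds
Execute vtkMPIMoveData id: 1843,  61.2669 seconds
    Dataserver gathering to 0,  39.0038 seconds
"""

for section, (name, seconds) in slowest_entries(sample).items():
    print(f"{section}: {name} took {seconds} s")
```

On these lines the largest server-side entry is vtkMPIMoveData (id 1843), which is why I suspect the time goes into gathering geometry and sending it to the client rather than into rendering.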

In addition, I ran ParaView in client/server mode on 1 node, 4 nodes, and 8 nodes. I found that the more nodes I use, the longer the render time becomes, which is very discouraging.

Am I running client/server mode correctly? Why does the performance keep declining as more nodes join the server? Is it related to "Setting->Render View->Server->Remote Render Threshold"? My data is large (about 4 GB), and even when I turn this threshold off, the render time does not improve. Could you tell me how to make ParaView render in distributed parallel?

Thank you for your help :)