[Paraview] XDMF reader parallelism?

David E DeMarle dave.demarle at kitware.com
Fri Dec 19 17:23:27 EST 2014


Hi Karl,

Sorry for the lag in response.

We spent quite a bit of time looking at this issue when we were making the
xdmf3 reader.

The new xdmf library's handling of the data structures that describe the
xml contents is much more efficient than the old version's.

Also, at the ParaView level you have two choices of how to read an xdmf
file with the new reader.

The "(Top Level Parition)" is meant for this case. It makes it so that that
every node opens its own child xdmf files. That way no memory is spent on
the xml structured for contents they are not responsible for the hdf5 data
for.
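
To pick that reader explicitly from pvpython, here is a minimal sketch. I am
assuming the xdmf3 readers are exposed in paraview.simple as Xdmf3ReaderS and
Xdmf3ReaderT and that the file property is named FileName (proxy and property
names can differ between ParaView versions); "sim.xmf" is just a placeholder
path:

  from paraview.simple import *

  # Serial variant: every pvserver rank parses the whole xdmf/xml tree.
  # reader = Xdmf3ReaderS(FileName=['sim.xmf'])

  # "(Top Level Partition)" variant: the top-level grids -- here the per-rank
  # included files -- are split across the pvserver ranks, so each rank only
  # parses the xml (and reads the hdf5) it is responsible for.
  reader = Xdmf3ReaderT(FileName=['sim.xmf'])

  Show(reader)
  Render()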

hth


David E DeMarle
Kitware, Inc.
R&D Engineer
21 Corporate Drive
Clifton Park, NY 12065-8662
Phone: 518-881-4909

On Wed, Dec 3, 2014 at 7:05 PM, Karl-Ulrich Bamberg <
Karl-Ulrich.Bamberg at physik.uni-muenchen.de> wrote:
>
> Hi,
>
> Paraview is a great and handy tool. I am now trying to scale it to large
> data sets and have
> a question related to the parallel capabilities of the
> Paraview-XDMF-Reader:
>
> We use a PIC code that divides its grid into blocks distributed over the
> ranks.
> For the output, every rank writes exactly one HDF5 file, collecting all
> local blocks and all time steps.
> One XDMF per rank was written describing the data, as well as a central
> XDMF including all the individual ones.
>
> The hierarchy is
> central-xdmf:  spatial_collection->includes_for_rank_xdmf_files
> rank_xdmf_files: temporal_collection->spatial_collection_within_ranks->grids
>
> The simulation currently has about 1024 ranks and 256 time steps, but this
> should be increased.
> For these parameters we see (via top) a memory consumption of 16 GB per
> pvserver instance, directly after opening the file, i.e. even before
> "apply".
>
> I guess this is because all the pvserver instances read and parse the
> XDMF file?
> At one point the paths to the HDF5 files were wrong, yet the memory
> consumption was the same; only after "apply" did an error appear.
>
> I tried "PV_USE_TRANSMIT=1" and also changed the grid hierarchy to only
> have:
> temporal_collection->spatial_collection->grids
>
> This was all in a single file, which ended up at about 1 GB on disk and
> around 16 GB in memory when parsed with lxml2 via python.
>
> But it was to no avail; every instance was still consuming around 16 GB.
>
> Is there any chance that pvserver can parallelize at the top level, so that
> every pvserver instance only reads some of the "include" files?
> Or is there a different approach to storing the grid patches (all at the
> same resolution right now) in HDF5?
>
> All suggestions are highly appreciated :-)
>
> Thank you all very much for any support,
> Best regards
> --
> Dipl.-Phys. Karl-Ulrich Bamberg
>
> Ludwig-Maximilians-Universität München
> Arnold-Sommerfeld-Center (ASC)
> Computational & Plasma Physics
> Theresienstr. 37, D-80333 München
>
> phone: +49 (0)89 2180 4577
> fax: +49 (0)89 2180 99 4577
> e-mail: Karl-Ulrich.Bamberg at physik.uni-muenchen.de
>
>
> _______________________________________________
> Powered by www.kitware.com
>
> Visit other Kitware open-source projects at
> http://www.kitware.com/opensource/opensource.html
>
> Please keep messages on-topic and check the ParaView Wiki at:
> http://paraview.org/Wiki/ParaView
>
> Follow this link to subscribe/unsubscribe:
> http://public.kitware.com/mailman/listinfo/paraview
>
>
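
For reference, here is a minimal sketch of the kind of central xdmf file
described above: a top-level spatial collection that xi:includes one child
file per rank. That top-level collection is what the "(Top Level Partition)"
strategy splits across the pvserver ranks. The file names (central.xmf,
rank_0000.xmf, ...) and the xpointer expression are only illustrative:

  # Write a central xdmf file whose spatial collection references the
  # per-rank xdmf files via XInclude.
  n_ranks = 1024

  with open("central.xmf", "w") as f:
      f.write('<?xml version="1.0"?>\n')
      f.write('<Xdmf Version="3.0" xmlns:xi="http://www.w3.org/2001/XInclude">\n')
      f.write('  <Domain>\n')
      f.write('    <Grid GridType="Collection" CollectionType="Spatial">\n')
      for r in range(n_ranks):
          f.write('      <xi:include href="rank_%04d.xmf" '
                  'xpointer="xpointer(//Xdmf/Domain/Grid)"/>\n' % r)
      f.write('    </Grid>\n')
      f.write('  </Domain>\n')
      f.write('</Xdmf>\n')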

