[Paraview] Any performance hints or tips for .pvtr reader?

Mon Jun 29 12:58:12 EDT 2009

>>   That is, organize the data so that each process only has to
>> do one read from a single file.  The easiest way to do that is to read
>> in your data into ParaView or VTK on 64 processes as you are doing now
> 
> Still easier when possible is to run the viz task on 192 cores, matching
> one file per core; I've done that and performance improves but not
> enormously.   It's an inconvenience, but I just wanted to know if there
> was anything in particular I needed to be aware of.

There is no silver bullet here.  You would probably get better performance
by loading everything on one process and distributing it to the rest.  If
you have a serial file system like NFS, that might be as good as you can do.
If you have a parallel file system that has at least as many independent
drives as pipes (1 per 8 cores, from your previous email) and the data is
laid out conveniently, you would probably get optimal performance reading
into 8 processes and distributing to the rest.
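
To make the arithmetic concrete, here is a rough Python sketch of that
layout.  It only plans which ranks would read which pieces; it does not
do any I/O, and it is not something the .pvtr reader does for you.  The
function names are made up for illustration, and the numbers (64
processes, 192 piece files, 1 pipe per 8 cores) are taken from this
thread as assumptions.

```python
# Hypothetical planning sketch: pick one reader rank per I/O pipe and give
# each reader a contiguous block of piece files.  No files are opened here.

def reader_ranks(num_ranks, cores_per_pipe):
    """Ranks that would read from disk: the first rank on each pipe."""
    return list(range(0, num_ranks, cores_per_pipe))


def pieces_for_reader(index, num_readers, num_pieces):
    """Contiguous block of piece indices for the reader at position `index`."""
    start = index * num_pieces // num_readers
    stop = (index + 1) * num_pieces // num_readers
    return list(range(start, stop))


def plan_reads(num_ranks, num_pieces, cores_per_pipe):
    """Map each reader rank to the piece indices it would load."""
    readers = reader_ranks(num_ranks, cores_per_pipe)
    return {
        rank: pieces_for_reader(i, len(readers), num_pieces)
        for i, rank in enumerate(readers)
    }


if __name__ == "__main__":
    # Assumed numbers from this thread: 64 processes, 192 files, 8 cores/pipe.
    plan = plan_reads(num_ranks=64, num_pieces=192, cores_per_pipe=8)
    for rank, pieces in plan.items():
        print(rank, pieces[0], pieces[-1], len(pieces))
```

With those numbers, 8 ranks each read a contiguous run of 24 files; the
other 56 ranks would receive their share of the data from the readers
afterwards, which is the part you would have to implement yourself.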

The .pvtr reader is nowhere near sophisticated enough to handle this well,
and the file format is not well suited to these optimizations.
There are some file format libraries that try to resolve these issues.  If
you plan to continue to generate and analyze data of this nature, you might
consider using one of these.

> 
>> (pay the time price) and then write it back out in a new .pvtr file.
>>  The new .pvtr file should contain 64 .vtr files.  Loading that back
>> onto 64 processes should be quick.
> 
> Yes, well, that's a separate issue that I'm having; I can reliably kill
> paraview dead by trying to `Save Data', either immediately after the
> pvtr data has been read as cell data or after converting it to point
> data.  If I just try to save  smaller data sets resulting from the
> visualization process (eg, generated contours) I have no problems with
> that.  There's nothing obviously weird about the data set, as I have no
> issues plotting or manipulating the data.  This was true with PV 3.4 and
> with a current (eg, this morning) cvs build.

My guess is that the operation you are choosing is requesting ghost cells,
which in turn is causing the reader to read all of the data over again
(which, as you attest, takes over an hour).  The Cell Data to Point Data and
Contour filters do this.  If you requested to save ghost cells after the
pvtr reader, then that would also cause a re-read.

-Ken

   ****      Kenneth Moreland
    ***      Sandia National Laboratories
***********  
*** *** ***  email: kmorel at sandia.gov
**  ***  **  phone: (505) 844-8919
    ***      web:   http://www.cs.unm.edu/~kmorel