[Paraview] Serial & Parallel - Scientific Data

Utkarsh Ayachit utkarsh.ayachit at kitware.com
Fri Apr 20 10:45:26 EDT 2018


Leo,

Let's take one dataset type/reader type at a time. I am using ParaView 5.5
for this whole discussion, just for simplicity.


*1. Structured Grid (*.vts file)*
In this case, the reader supports reading parts of the file on different
ranks, so even if you have a single file, the reader reads sub-extents
on each of the ranks. As an example, I simply loaded multicomb_0.vts
<https://data.kitware.com/api/v1/item/5ad9fa0f8d777f0685794b15/download> on
multiple ranks, and I see it is indeed split among ranks.
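
To illustrate what "reading sub-extents on each rank" means, here is a
minimal Python sketch that divides a whole structured extent into one piece
per rank by repeatedly bisecting the longest axis. The names split_extent and
cell_count and the example extent are hypothetical, and this is not
necessarily how the VTK readers partition the data; it only shows the idea
that each rank gets its own index range, with neighbouring pieces sharing a
plane of points.

```python
def split_extent(whole, pieces):
    """Split (imin, imax, jmin, jmax, kmin, kmax) into `pieces` sub-extents."""
    if pieces <= 1:
        return [tuple(whole)]
    # Number of cells along each axis; bisect the longest one.
    sizes = [whole[2 * a + 1] - whole[2 * a] for a in range(3)]
    axis = sizes.index(max(sizes))
    lo, hi = whole[2 * axis], whole[2 * axis + 1]
    left_pieces = pieces // 2
    # Place the cut in proportion to how many pieces go to each side.
    mid = lo + (hi - lo) * left_pieces // pieces
    left, right = list(whole), list(whole)
    left[2 * axis + 1] = mid   # left part ends on the cut plane
    right[2 * axis] = mid      # right part starts on the same plane
    return (split_extent(left, left_pieces)
            + split_extent(right, pieces - left_pieces))


def cell_count(ext):
    """Number of cells inside an extent."""
    return (ext[1] - ext[0]) * (ext[3] - ext[2]) * (ext[5] - ext[4])


whole_extent = (0, 63, 0, 31, 0, 31)  # hypothetical 64 x 32 x 32 point grid
for rank, ext in enumerate(split_extent(whole_extent, 4)):
    print(f"rank {rank}: extent {ext}, {cell_count(ext)} cells")
```

The sketch does not guard against asking for more pieces than there are
cells along an axis, in which case some pieces would come out empty.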


In this case, I believe a non-compressed binary file format will be best,
since each rank can just read the subset of the data it needs. For
compressed formats, you'd need to read the whole array for
decompression. Having the data in a pvts file with separate partitions does
have the benefit of avoiding I/O contention, since all ranks won't be
opening the same file. So, I'd suggest experimenting at your intended
scale.
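
The difference between the two cases can be sketched in a few lines of
Python. This is only an illustration under simplifying assumptions: the
array is a hypothetical block of float32 values, the "compressed file" is a
single zlib stream, and the actual on-disk layout of VTK XML files (headers,
offsets, blocks) is not modeled. The helper names read_slice_raw and
read_slice_compressed are made up for this example.

```python
import io
import zlib
from array import array

ITEM = 4            # bytes per float32 value
N = 1_000_000       # hypothetical array length

raw = array("f", (float(i % 1000) for i in range(N))).tobytes()
compressed = zlib.compress(raw)


def read_slice_raw(stream, start, count):
    """Seek to the requested values and read only those bytes."""
    stream.seek(start * ITEM)
    data = stream.read(count * ITEM)
    return array("f", data), len(data)


def read_slice_compressed(blob, start, count):
    """A single zlib stream has no random access: inflate it all, then slice."""
    full = zlib.decompress(blob)
    piece = full[start * ITEM:(start + count) * ITEM]
    return array("f", piece), len(full)


vals_raw, touched_raw = read_slice_raw(io.BytesIO(raw), 500_000, 10)
vals_cmp, touched_cmp = read_slice_compressed(compressed, 500_000, 10)
print("raw:        bytes touched =", touched_raw)
print("compressed: bytes inflated =", touched_cmp)
```

Both calls return the same ten values, but the raw path only touches the
bytes it needs, while the compressed path has to inflate the full array
first.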

*2. Image Data (*.vti file)*

You refer to structured points. I am not entirely sure of the lineage of
structured points, but looking at the docs, it seems it's simply a subclass
of vtkImageData, so I'll assume it's indeed that. What holds for vts files
holds for vti files too: the reader reads sub-extents on each rank, even if
the file is not split.

*3. Rectilinear Grid (*.vtr file)*

The same is true for vtr files
<https://data.kitware.com/api/v1/item/5ad9fccc8d777f0685794b18/download>
too. The reader can read sub-extents.


Hope that helps.

Utkarsh

On Thu, Apr 19, 2018 at 9:43 PM, Léo Pessanha <leonardopessanha74 at gmail.com>
wrote:
>
> Hi Utkarsh, thank you for the detailed explanation =)
>
> I will definitely try it out and compare the performance with
> compressed/uncompressed data.
>
> The data types are Structured Grid, Structured Points and Rectilinear
> Grid. All of them are written in VTK 2.0 version.
>
> When loading on 2 ranks, the data is loaded on only 1 rank (ID 0),
> consequently having only one color.
>
>
> On Thu, Apr 19, 2018 at 22:29, Utkarsh Ayachit
> <utkarsh.ayachit at kitware.com> wrote:
>>>
>>> The options presented when saving data:
>>>
>>>
>>> - Data Mode: ASCII | BINARY | APPENDED
>>> - Encode Appended Data: TRUE | FALSE
>>> - Compressor Type:  NONE | ZLIB
>>>
>>> What do they mean and how do they affect performance?
>>
>>
>> These determine how the binary data (your arrays, point coordinates, etc.)
>> is saved in the file. ASCII is not to be used for anything but small
>> datasets for debugging. BINARY saves the data as binary; however, it still
>> needs to encode it (using base64) so that the XML can be parsed correctly.
>> Generally, you want to pick APPENDED, where the binary data is added at
>> the end of the XML header(ish) and can then be saved as a raw binary dump
>> without any encoding (with Encode Appended Data set to FALSE). No encoding
>> means you can't easily look at the file in an ASCII text editor, but it
>> will be easier to write and read since neither needs to do any processing
>> to convert the binary data. For APPENDED or BINARY mode, compressor type
>> lets you pick a compression technique to use. This minimizes file size but
>> adds the computing cost of compressing and decompressing. It's a trade-off
>> between reading more from the disk and spending more time decompressing
>> after reading from disk. Generally, reading from disk is slower, so you
>> may still be better off compressing. 5.5 offers better compression options
>> like LZ4 & LZMA, which may be worth exploring too.
>>
>> Before I answer your other questions, what type of data is this? Image
>> data? Unstructured grid? When you load it on 2 ranks, if you color your
>> surface by vtkProcessId, is it being read in as partitioned across the two
>> ranks, or is all of the data on rank 0?
>>
>> Utkarsh
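
As a rough illustration of the size trade-off between the data modes
discussed in the quoted message, the following Python sketch compares one
payload stored raw, base64-encoded, zlib-compressed, and compressed then
base64-encoded. It uses only the standard library and a made-up, highly
repetitive payload, so the compression ratio it prints says nothing about
real simulation data, and the extra headers that VTK XML files write are not
modeled.

```python
import base64
import zlib
from array import array

# Hypothetical payload: 100,000 float32 values standing in for one array.
payload = array("f", (float(i % 256) for i in range(100_000))).tobytes()

encoded = base64.b64encode(payload)              # text-safe, no compression
compressed = zlib.compress(payload)              # smaller, costs CPU time
compressed_encoded = base64.b64encode(compressed)

sizes = {
    "raw binary": len(payload),
    "base64": len(encoded),
    "zlib": len(compressed),
    "zlib + base64": len(compressed_encoded),
}
for name, size in sizes.items():
    print(f"{name:>14}: {size:>8} bytes ({size / len(payload):.2f}x raw)")
```

Base64 always expands the data by about one third, which is the cost of
keeping the bytes text-safe; how much zlib saves depends entirely on the
data, so measuring on your own files is the only reliable guide.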

