[Paraview] Problem of Memory distribution

David E DeMarle dave.demarle at kitware.com
Thu Mar 4 13:20:25 EST 2010


Yep, the reader for the legacy VTK format is not parallel compliant.

In that case what _should_ happen is that each of the n nodes in your
ad-hoc cluster will read the whole file, and then the pipeline on each
node will crop out the (n-1)/n of the cells that belong to the other
nodes. So temporarily the memory consumption will be very high, but
most of that will be freed right away.
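
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It only encodes the arithmetic described above (every reader loads the whole file, then keeps its 1/n share). The 4 GB dataset size is a made-up figure, and the `procs_per_machine` parameter rests on my assumption that each MPI process, not just each machine, counts as one of the n readers. Pipeline copies and rendering buffers are ignored.

```python
def per_machine_memory(dataset_gb, n_procs, procs_per_machine=1):
    """Estimate (peak, steady) memory per machine for a non-parallel reader.

    Assumption: every process reads the full dataset, then keeps only
    its 1/n_procs share once the pipeline has cropped the rest.
    """
    if n_procs < 1 or procs_per_machine < 1:
        raise ValueError("process counts must be at least 1")
    peak = procs_per_machine * dataset_gb               # full copy per process
    steady = procs_per_machine * dataset_gb / n_procs   # cropped share per process
    return peak, steady


# Hypothetical 4 GB dataset on 8 dual-core workstations (16 processes).
peak, steady = per_machine_memory(4.0, n_procs=16, procs_per_machine=2)
print(f"peak per machine:   {peak:.2f} GB")
print(f"steady per machine: {steady:.2f} GB")
```

If that assumption holds, the peak on each workstation is no lower than on a single machine, which would match what you are seeing; only the steady-state figure shrinks with the number of processes.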

If you can read the file, you could convert it to a parallel compliant
format with ParaView by reading it in, then saving the data in either
ParaView (pvd), EnSight, or Exodus format. All of those are better
parallelized and won't have the temporary memory consumption behavior
of the legacy format.
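
For a sense of what the pvd output looks like: as far as I understand it, a .pvd file is a small XML index that points at separate data files, which is the kind of layout that lets a reader hand different pieces to different nodes. The Python sketch below writes such an index by hand, purely for illustration. The piece file names are hypothetical, and in practice you would let ParaView itself write both the index and the pieces when you save the data.

```python
import xml.etree.ElementTree as ET


def build_pvd_index(piece_files):
    """Return the text of a .pvd collection file listing the given pieces.

    Illustrative only: the piece files themselves are not created here.
    """
    root = ET.Element("VTKFile", type="Collection", version="0.1",
                      byte_order="LittleEndian")
    collection = ET.SubElement(root, "Collection")
    for part, name in enumerate(piece_files):
        # One DataSet entry per piece; all pieces belong to the same timestep.
        ET.SubElement(collection, "DataSet", timestep="0", group="",
                      part=str(part), file=name)
    return ET.tostring(root, encoding="unicode")


# Hypothetical piece names, one per workstation.
pieces = [f"MyTest129_{i}.vtu" for i in range(8)]
print(build_pvd_index(pieces))
```

The point is only that the data ends up in several files with a small index on top, unlike the single monolithic legacy .vtk file.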

David E DeMarle
Kitware, Inc.
R&D Engineer
28 Corporate Drive
Clifton Park, NY 12065-8662
Phone: 518-371-3971 x109



On Thu, Mar 4, 2010 at 12:24 PM, Salman SHAHIDI <salshahidi at gmail.com> wrote:
> The file format is VTK (filename is MyTest129.vtk).
>
> 2010/3/4 David E DeMarle <dave.demarle at kitware.com>
>>
>> ParaView tries hard NOT to ship data over the network, so every
>> machine potentially has to read the whole file.
>>
>> So, if the file format itself isn't partitioned into multiple files
>> (in which case you could possibly get by with putting the right sub
>> file on each disk), and unless you have a shared filesystem,
>> replication onto each disk is your only option.
>>
>> Again, which file format are you reading? What is the filename,
>> including the extension?
>>
>> On Thu, Mar 4, 2010 at 11:33 AM, Salman SHAHIDI <salshahidi at gmail.com> wrote:
>> > Each time, I have to copy the same data to all the workstations in order
>> > to use all of them. In your opinion, is this the right way?
>> >
>> > 2010/3/4 David E DeMarle <dave.demarle at kitware.com>
>> >>
>> >> An ad-hoc cluster like you've got is fine, as long as you have MPI set
>> >> up on the machines and are running a copy of ParaView's pvserver on them
>> >> that has been compiled to use MPI. (Our binaries are not.)
>> >>
>> >> The data type (Unstructured Grid) doesn't matter; I think all VTK data
>> >> structure types can be split up (aka streamed). It is the file format
>> >> (*.vtk, *.vt?, *.xdmf, *.exo, *.case, etc.) that determines which reader
>> >> is invoked and thus whether the data will be read in parallel or
>> >> not.
>> >>
>> >> On Thu, Mar 4, 2010 at 11:00 AM, Salman SHAHIDI <salshahidi at gmail.com> wrote:
>> >> > Thank you David,
>> >> >
>> >> > My dataset is of type "Unstructured Grid".
>> >> > I also have 2 other questions:
>> >> > 1) Which datasets are the readers able to break up?
>> >> > 2) I do not have a cluster, so I copied the same dataset to all the WS.
>> >> > Is this the correct way to run parallel computations?
>> >> >
>> >> > Faithfully yours,
>> >> >
>> >> >      Salman
>> >> >
>> >> > 2010/3/4 David E DeMarle <dave.demarle at kitware.com>
>> >> >>
>> >> >> Which data file format?
>> >> >>
>> >> >> Not all readers are able to break up the data well, in which case
>> >> >> paraview handles it in one of several ways, none of which is ideal.
>> >> >>
>> >> >> On Thu, Mar 4, 2010 at 9:36 AM, Salman SHAHIDI <salshahidi at gmail.com> wrote:
>> >> >> > Hi All,
>> >> >> >
>> >> >> > I have 8 Debian workstations (WS) A, B, C, D, E, F, G, and H with
>> >> >> > ParaView 3.6.1. I have configured all of them for passwordless ssh.
>> >> >> > Each WS has 2 cores, so 16 processors are accessible. On WS A I have
>> >> >> > a machine file listing all the machine names. I have also copied the
>> >> >> > same dataset to all the machines (I am not sure whether this is
>> >> >> > correct). The problem is the memory consumption of ParaView. With 8
>> >> >> > WS I do not get 8 times the available memory. I expected that, when
>> >> >> > running on 8 machines, the memory consumption on each machine would
>> >> >> > be 1/8 of what it is when I use only one machine. What is the reason
>> >> >> > for this? Do I need a special configuration to minimize memory
>> >> >> > consumption?
>> >> >> >
>> >> >> > Thank you all,
>> >> >> >
>> >> >> >         Salman
>> >> >> >
>> >> >> > ----------------------------------------
>> >> >> >
>> >> >> > Note:
>> >> >> >
>> >> >> > Command line on the first workstation A:
>> >> >> >
>> >> >> > mpirun --mca btl_tcp_if_include eth0 -machinefile mymachinefile.txt \
>> >> >> >     -np 16 /usr/local/bin/pvserver --use-offscreen-rendering
>> >> >> > Listen on port: 11111
>> >> >> > Waiting for client...
>> >> >> > Client connected.
>> >> >> >
>> >> >> > Then in ParaView, also running on WS A, I add a localhost connection
>> >> >> > that refers to all 8 servers.
>> >> >> >

