[vtk-developers] vtkPKdTree algorihm

Mon Apr 19 03:30:14 EDT 2010

Thanks Ken, the Floyd and Rivest paper was what I was basically after. 

[If you know of explicit publications of parallelized versions of it, please say. If someone has formally done it, it'd be nice to know]

JB

> -----Original Message-----
> From: Moreland, Kenneth [mailto:kmorel at sandia.gov]
> Sent: 16 April 2010 00:09
> To: Biddiscombe, John A.; vtk-developers at vtk.org
> Subject: Re: [vtk-developers] vtkPKdTree algorihm
> 
> Lee Ann (now Riesen after getting married) left our group many years ago and
> has not looked at the code for a long time.  If you want to email her, her
> address is lriesen at sandia.gov, but she probably won't remember much.
> 
> We wrote a paper on the parallel rendering in ParaView, which used the then
> newly created vtkPKdTree.  Section 4.4 has some scant details on the
> algorithms in that class.  You can get the paper from my web site.
> 
> 
> 
> 	http://www.cs.unm.edu/~kmorel/documents/ParallelVolRen.html
> 
> 
> 
> The overall algorithm is straightforward: iteratively pick axis-aligned
> planes to divide the data.  The most interesting part is how it picks the
> coordinate value on which to split.  It uses a parallelization of the
> "select" algorithm by Floyd and Rivest (referenced in our paper) to find the
> plane that splits cells into two regions with the same number of cells.  The
> select algorithm does this in less than O(n log n) time on average (where n
> is the number of cells).
> 
> I've considered trying to make performance improvements to vtkPKdTree, but
> I've never had the time.  If you are interested, here are some of the areas
> that would have the biggest impact.  Consider each of these prefixed with
> the words "I think" because it has been a while since I have looked at the
> code myself.
> 
> 
> 
> *	The select algorithm's complexity is measured in terms of
> comparisons, but data copies are ignored.  In parallel, the data copies are
> typically expensive whereas comparisons are cheap.  The current parallel
> select algorithm performs several rounds of data transfer before settling on
> the middle value.  You might be able to make it faster holding off any data
> transfer until the end.
> *	At each stage in the algorithm, all processes participate in the
> division of each region, even if those regions have no data involved with
> the region.  It would be much more efficient, for example, if after making
> the first division half the processes recursed in one region while the other
> half recursed in the other region concurrently.  This behavior is probably
> to support the number of regions being created may not be equal to or a
> multiple of the number of processes.  However the most common (probably
> only) real use case is to evenly break up the data for the processes
> generating the tree.
> *	vtkPKdTree is only able to make a number of partitions that is a
> power of 2.  This is because the cells are always divided in half.  However,
> it would be fairly trivial to support any number of partitions.  When
> dividing a region, check to see that the number of remaining regions is
> evenly divisible by 2.  If not, assign a number of regions to each side that
> differ by one, and then select a plane that divides the cells proportionally
> rather than exactly evenly.  This itself won't make vtkPKdTree faster, but
> it is probably necessary to implement the previous suggestion.  It would
> also make D3 work a lot better on non power of 2 process counts.
> *	The select algorithm finds the the exact location to split to get
> even cell counts.  However, I can't think of a real situation where it is
> that important to have exactly the same number of cells in each region.  If
> the select algorithm were only required to get "close enough" to the optimal
> value (say, off by 10%), perhaps several iterations could be shaved off and
> it might go faster.
> 
> 
> 
> -Ken
> 
> 
> On 4/15/10 3:18 AM, "Biddiscombe, John A." <biddisco at cscs.ch> wrote:
> 
> 
> 
> 	Looks as though "Lee Ann Fisk" from sandia was responsible - does
> anyone have a valid email address for her(?)
> 
> 	thanks
> 
> 	JB
> 
> 	> -----Original Message-----
> 	> From: vtk-developers-bounces at vtk.org [mailto:vtk-developers-
> bounces at vtk.org]
> 	> On Behalf Of Biddiscombe, John A.
> 	> Sent: 13 April 2010 09:29
> 	> To: vtk-developers at vtk.org
> 	> Subject: [vtk-developers] vtkPKdTree algorihm
> 	>
> 	> Does anyone on the list know in any detail what algorithm(s) are
> made us of
> 	> within the PKdTree code. I'd like to make customizations of some of
> the
> 	> parallelization aspects, but before I start I'd like to fully
> understand the
> 	> code.
> 	>
> 	>
> 	>
> 	> If there are any references I can look up, it'd help greatly.
> 	>
> 	>
> 	>
> 	> Thanks
> 	>
> 	>
> 	>
> 	> JB
> 	>
> 	>
> 	>
> 	> --
> 	>
> 	> John Biddiscombe,                            email:biddisco @
> cscs.ch
> 	>
> 	> http://www.cscs.ch/ <http://www.cscs.ch/>
> 	>
> 	> CSCS, Swiss National Supercomputing Centre  | Tel:  +41 (91)
> 610.82.07
> 	>
> 	> Via Cantonale, 6928 Manno, Switzerland      | Fax:  +41 (91)
> 610.82.82
> 	>
> 	>
> 
> 	_______________________________________________
> 	Powered by www.kitware.com
> 
> 	Visit other Kitware open-source projects at
> http://www.kitware.com/opensource/opensource.html
> 
> 	Follow this link to subscribe/unsubscribe:
> 	http://www.vtk.org/mailman/listinfo/vtk-developers
> 
> 
> 
> 
> 
> 
> 
>    ****      Kenneth Moreland
>     ***      Sandia National Laboratories
> ***********
> *** *** ***  email: kmorel at sandia.gov
> **  ***  **  phone: (505) 844-8919
>     ***      web:   http://www.cs.unm.edu/~kmorel
>