[vtk-developers] vtkPKdTree algorihm
kmorel at sandia.gov
Thu Apr 15 18:08:45 EDT 2010
Lee Ann (now Riesen after getting married) left our group many years ago and has not looked at the code for a long time. If you want to email her, her address is lriesen at sandia.gov, but she probably won't remember much.
We wrote a paper on the parallel rendering in ParaView, which used the then newly created vtkPKdTree. Section 4.4 has some scant details on the algorithms in that class. You can get the paper from my web site.
The overall algorithm is straightforward: iteratively pick axis-aligned planes to divide the data. The most interesting part is how it picks the coordinate value on which to split. It uses a parallelization of the "select" algorithm by Floyd and Rivest (referenced in our paper) to find the plane that splits cells into two regions with the same number of cells. The select algorithm does this in less than O(n log n) time on average (where n is the number of cells).
I've considered trying to make performance improvements to vtkPKdTree, but I've never had the time. If you are interested, here are some of the areas that would have the biggest impact. Consider each of these prefixed with the words "I think" because it has been a while since I have looked at the code myself.
* The select algorithm's complexity is measured in terms of comparisons, but data copies are ignored. In parallel, the data copies are typically expensive whereas comparisons are cheap. The current parallel select algorithm performs several rounds of data transfer before settling on the middle value. You might be able to make it faster holding off any data transfer until the end.
* At each stage in the algorithm, all processes participate in the division of each region, even if those regions have no data involved with the region. It would be much more efficient, for example, if after making the first division half the processes recursed in one region while the other half recursed in the other region concurrently. This behavior is probably to support the number of regions being created may not be equal to or a multiple of the number of processes. However the most common (probably only) real use case is to evenly break up the data for the processes generating the tree.
* vtkPKdTree is only able to make a number of partitions that is a power of 2. This is because the cells are always divided in half. However, it would be fairly trivial to support any number of partitions. When dividing a region, check to see that the number of remaining regions is evenly divisible by 2. If not, assign a number of regions to each side that differ by one, and then select a plane that divides the cells proportionally rather than exactly evenly. This itself won't make vtkPKdTree faster, but it is probably necessary to implement the previous suggestion. It would also make D3 work a lot better on non power of 2 process counts.
* The select algorithm finds the the exact location to split to get even cell counts. However, I can't think of a real situation where it is that important to have exactly the same number of cells in each region. If the select algorithm were only required to get "close enough" to the optimal value (say, off by 10%), perhaps several iterations could be shaved off and it might go faster.
On 4/15/10 3:18 AM, "Biddiscombe, John A." <biddisco at cscs.ch> wrote:
Looks as though "Lee Ann Fisk" from sandia was responsible - does anyone have a valid email address for her(?)
> -----Original Message-----
> From: vtk-developers-bounces at vtk.org [mailto:vtk-developers-bounces at vtk.org]
> On Behalf Of Biddiscombe, John A.
> Sent: 13 April 2010 09:29
> To: vtk-developers at vtk.org
> Subject: [vtk-developers] vtkPKdTree algorihm
> Does anyone on the list know in any detail what algorithm(s) are made us of
> within the PKdTree code. I'd like to make customizations of some of the
> parallelization aspects, but before I start I'd like to fully understand the
> If there are any references I can look up, it'd help greatly.
> John Biddiscombe, email:biddisco @ cscs.ch
> http://www.cscs.ch/ <http://www.cscs.ch/>
> CSCS, Swiss National Supercomputing Centre | Tel: +41 (91) 610.82.07
> Via Cantonale, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82
Powered by www.kitware.com
Visit other Kitware open-source projects at http://www.kitware.com/opensource/opensource.html
Follow this link to subscribe/unsubscribe:
**** Kenneth Moreland
*** Sandia National Laboratories
*** *** *** email: kmorel at sandia.gov
** *** ** phone: (505) 844-8919
*** web: http://www.cs.unm.edu/~kmorel
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the vtk-developers