[vtk-developers] potential speedup for the vtkpolydatatoimagestencil

Wed Jan 25 18:08:54 EST 2012

OK, so I've looked further into this problem, and it seems that I was
optimizing the wrong thing.

If I comment out everything other than the PolyDataCutter method
(specifically, the call to 'contour' in there), then contours take the
same amount of time to load as they do with everything turned on.  All
of the time is being spent in the Contour routine.

However, rtstructs are already organized into contours.  Each set of
three numbers corresponds to a point in 3D space, and then each
succeeding group corresponds to the next point in the contour.  From
what I can determine, there is no need for this PolyDataCutter
routine.

How can I tell the vtkPolyDataToImageStencil class that the data are
already properly organized, that each point that follows is connected
to the one previous, and in the case of a CLOSED contour, the last
connects to the first?

On Mon, Jan 23, 2012 at 11:27 AM, David Gobbi <david.gobbi at gmail.com> wrote:
> On Mon, Jan 23, 2012 at 12:09 PM, Mark Roden <mmroden at gmail.com> wrote:
>> Hi David,
>>
>> Thanks for the response, I'll take a look at the clipclosedsurface
>> class.  That may be what I want anyway (does it allow for any one
>> plane to have a hollow interior?), but if you've already done the
>> hashing there, then I'll take a look at porting it to this class as
>> well.  Or should there be a shared routine that both classes use, if
>> the work done is very similar?
>
> A shared routine would be ideal, I had hoped to eventually write a
> vtkCutClosedSurface filter for that purpose.
>
> The vtkClipClosedSurface class does allow holes.  It is described
> here: http://vtk.org/Wiki/VTK/Closed_Surface_Clipping
>
>  - David
>
>
>
>> On Mon, Jan 23, 2012 at 11:04 AM, David Gobbi <david.gobbi at gmail.com> wrote:
>>> On Mon, Jan 23, 2012 at 11:19 AM, Mark Roden <mmroden at gmail.com> wrote:
>>>> Hi David,
>>>>
>>>> I'm looking to make the vtkpolydatatoimagestencil class faster.  Right
>>>> now, on a core i7 machine and in using release-compiled code,
>>>> translating a body mask in an rtstruct to pixels using this filter is
>>>> prohibitively expensive in time (upwards of a minute for the mask).
>>>>
>>>> There are three possible speedups to do here, in my mind.  I wanted to
>>>> clear them by you, because I don't want to fork vtk to solve this
>>>> problem, but the problem has become a serious problem for us.
>>>>
>>>> First, treat each plane independently of one another, and then have
>>>> each plane processed by different threads.  This change is fairly
>>>> straightforward theoretically, and something you mentioned you were
>>>> looking into a while back
>>>> (http://www.vtk.org/pipermail/vtkusers/2011-January/114538.html).  Is
>>>> this something you're still investigating?
>>>
>>> There were three things that I was considering:
>>> 1) multi-threading so that each CPU gets N slices to work on
>>> 2) increasing the efficiency of the polygon cutting code, I added some
>>> nice cutting code to vtkClipClosedSurface and planned to eventually
>>> also use it in vtkPolyDataToImageStencil
>>> 3) improving the efficiency or extent insertion for the stencils
>>>
>>> So far I've only done #3, which was the least important but was the
>>> easiest.  Right now my own apps are bottlenecking on my segmentation
>>> algorithms, rather than on vtkPolyDataToImageStencil, so improving
>>> vtkPolyDataToImageStencil hasn't been a high priority.
>>>
>>>> For my work on other projects, I've found that the intel tbb
>>>> (http://threadingbuildingblocks.org/) makes multithreading this kind
>>>> of work very easy; the library is free and works with any c++ project
>>>> that uses the C++0x standard and can use lambda expressions.
>>>
>>> VTK will have to continue to support pre-C++0x compilers for a long
>>> time, several more years at least.  So if threading is to be done, it
>>> should be done with VTK's threading classes.
>>>
>>>> Second, change the interior while/for loop collision detection to be a
>>>> hash table.  Right now, in the ThreadedExecute function, there is this
>>>> code:
>>>>
>>>>    for (vtkIdType i = 0; i < numberOfPoints; ++i)
>>>>      {
>>>> ...
>>>>      while( lines->GetNextCell(npts, pointIds) )
>>>>        {
>>>>        for (vtkIdType j = 0; j < npts; ++j)
>>>>          if ( pointIds[j] == i )
>>>>            {
>>>>
>>>> But what if collision detection was changed to uses a hash table where
>>>> the hashing function automatically detected point collisions through a
>>>> single pass through the data?
>>>
>>> I use a hash vtkClipClosedSurface to accelerate the clipping (i.e. #2
>>> on my to-do list above).  Take a look at the vtkClipClosedSurface
>>> code, specifically the vtkCCSEdgeLocator class.
>>>
>>>> Third, there does not appear to be an iterator over the points vector,
>>>> but a Get and Set function.  These functions appear to be pretty slow,
>>>> and go through several thunking layers before data can actually be set
>>>> or not.  Is there an iterator class for points as there is for lines?
>>>> If not, how hard would it be to create such a class?
>>>
>>> The vtkPoints::GetData() method returns an array (either a
>>> vtkFloatArray or a vtkDoubleArray) that contains the points.  Once you
>>> have this array, the GetTupleValue() method is a purely inlined method
>>> that can be used to efficiently get the points.  If you know ahead of
>>> time whether the points are double or float, then this is probably the
>>> most efficient way of accessing them, apart from getting the raw
>>> "float *" or "double *".
>>>
>>>  - David