[vtk-developers] potential speedup for the vtkpolydatatoimagestencil

Mon Jan 23 14:09:50 EST 2012

Hi David,

Thanks for the response, I'll take a look at the vtkClipClosedSurface
class.  That may be what I want anyway (does it allow for any one
plane to have a hollow interior?), but if you've already done the
hashing there, then I'll take a look at porting it to this class as
well.  Or should there be a shared routine that both classes use, if
the work done is very similar?

Mark

On Mon, Jan 23, 2012 at 11:04 AM, David Gobbi <david.gobbi at gmail.com> wrote:
> On Mon, Jan 23, 2012 at 11:19 AM, Mark Roden <mmroden at gmail.com> wrote:
>> Hi David,
>>
>> I'm looking to make the vtkPolyDataToImageStencil class faster.  Right
>> now, on a Core i7 machine with release-compiled code, translating a
>> body mask in an RTSTRUCT to pixels using this filter is prohibitively
>> slow (upwards of a minute for the mask).
>>
>> There are three possible speedups here, in my mind.  I wanted to
>> clear them with you, because I don't want to fork VTK to solve this
>> problem, but it has become a serious problem for us.
>>
>> First, treat each plane independently, and then have
>> each plane processed by a different thread.  This change is fairly
>> straightforward theoretically, and something you mentioned you were
>> looking into a while back
>> (http://www.vtk.org/pipermail/vtkusers/2011-January/114538.html).  Is
>> this something you're still investigating?
>
> There were three things that I was considering:
> 1) multi-threading so that each CPU gets N slices to work on
> 2) increasing the efficiency of the polygon cutting code, I added some
> nice cutting code to vtkClipClosedSurface and planned to eventually
> also use it in vtkPolyDataToImageStencil
> 3) improving the efficiency of extent insertion for the stencils
>
> So far I've only done #3, which was the least important but was the
> easiest.  Right now my own apps are bottlenecking on my segmentation
> algorithms, rather than on vtkPolyDataToImageStencil, so improving
> vtkPolyDataToImageStencil hasn't been a high priority.
>
>> For my work on other projects, I've found that Intel TBB
>> (http://threadingbuildingblocks.org/) makes multithreading this kind
>> of work very easy; the library is free and works with any C++ project
>> that uses the C++0x standard and can use lambda expressions.
>
> VTK will have to continue to support pre-C++0x compilers for a long
> time, several more years at least.  So if threading is to be done, it
> should be done with VTK's threading classes.
>
>> Second, change the interior while/for loop collision detection to be a
>> hash table.  Right now, in the ThreadedExecute function, there is this
>> code:
>>
>>    for (vtkIdType i = 0; i < numberOfPoints; ++i)
>>      {
>> ...
>>      while( lines->GetNextCell(npts, pointIds) )
>>        {
>>        for (vtkIdType j = 0; j < npts; ++j)
>>          if ( pointIds[j] == i )
>>            {
>>
>> But what if collision detection were changed to use a hash table where
>> the hashing function automatically detected point collisions through a
>> single pass through the data?
>
> I use a hash in vtkClipClosedSurface to accelerate the clipping (i.e. #2
> on my to-do list above).  Take a look at the vtkClipClosedSurface
> code, specifically the vtkCCSEdgeLocator class.
>
>> Third, there does not appear to be an iterator over the points vector,
>> only Get and Set functions.  These functions appear to be pretty slow,
>> and go through several thunking layers before the data is actually
>> accessed.  Is there an iterator class for points as there is for lines?
>> If not, how hard would it be to create such a class?
>
> The vtkPoints::GetData() method returns an array (either a
> vtkFloatArray or a vtkDoubleArray) that contains the points.  Once you
> have this array, the GetTupleValue() method is a purely inlined method
> that can be used to efficiently get the points.  If you know ahead of
> time whether the points are double or float, then this is probably the
> most efficient way of accessing them, apart from getting the raw
> "float *" or "double *".
>
>  - David
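
A note on the second point in the thread: the loop quoted from
ThreadedExecute rescans every line cell for every point.  A minimal
sketch of the single-pass alternative is below.  It is only an
illustration under stated assumptions: it uses plain std::vector
instead of VTK types, and because point ids are dense integers it uses
a direct-indexed table rather than a true hash.  The names Cell and
BuildPointToCells are hypothetical and are not part of VTK or of
vtkCCSEdgeLocator.  It is written without C++0x features, in line with
the compiler constraint mentioned above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one line cell: the ordered point ids it uses.
typedef std::vector<long> Cell;

// Build, in a single pass over the cells, a table that maps each point id
// to the indices of the cells that reference it.  Ids outside the range
// [0, numberOfPoints) are skipped.
std::vector<std::vector<std::size_t> > BuildPointToCells(
    const std::vector<Cell>& cells, std::size_t numberOfPoints)
{
  std::vector<std::vector<std::size_t> > table(numberOfPoints);
  for (std::size_t c = 0; c < cells.size(); ++c)
  {
    for (std::size_t j = 0; j < cells[c].size(); ++j)
    {
      const long id = cells[c][j];
      if (id >= 0 && static_cast<std::size_t>(id) < numberOfPoints)
      {
        table[static_cast<std::size_t>(id)].push_back(c);
      }
    }
  }
  return table;
}
```

With such a table, the per-point search over all cells becomes a lookup
of table[i], so the total work is proportional to the number of point
references rather than to points times cells.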