[vtk-developers] Garbage collection slowness

Hank Childs childs3 at llnl.gov
Wed May 14 13:05:02 EDT 2008


Hi Brad,

I really appreciate you taking the time to look at this.

Your change did make the deletes faster.  But something still seems to
be really wrong, and you can only see it if you go past j=0.  I changed
"amt" from 25K to 10K and kept the j loop at 10 iterations.

With your changes, the last iteration of the norm filter takes 52.3s.
I believe it should only take as long as the first iteration, which is
~3s.  Without your changes, the last iteration takes 17.0s (not quite
as bad, but still a problem).  I wonder if this is related to all of
the vtkInformation objects floating around.  I tried taking them out,
but that led to crashes, warnings, etc.

Any thoughts?

Here are the timings:

<with trivial producer>
Allocating initial,  0.582522 seconds
Calculating norm,  3.01439 seconds
Delete,  0.26264 seconds
Calculating norm,  3.65343 seconds
Delete,  0.329396 seconds
Calculating norm,  3.72347 seconds
Delete,  0.285982 seconds
Calculating norm,  6.06515 seconds
Delete,  0.345439 seconds
Calculating norm,  10.8389 seconds
Delete,  0.379462 seconds
Calculating norm,  12.9148 seconds
Delete,  0.449169 seconds
Calculating norm,  28.4741 seconds
Delete,  0.417562 seconds
Calculating norm,  31.4917 seconds
Delete,  0.44765 seconds
Calculating norm,  48.0374 seconds
Delete,  0.474605 seconds
Calculating norm,  52.2953 seconds
Delete,  0.483139 seconds

<original>
Allocating initial,  0.580015 seconds
Calculating norm,  3.35481 seconds
Delete,  0.555838 seconds
Calculating norm,  2.61965 seconds
Delete,  2.03697 seconds
Calculating norm,  3.1105 seconds
Delete,  2.00574 seconds
Calculating norm,  4.75912 seconds
Delete,  1.58643 seconds
Calculating norm,  3.16867 seconds
Delete,  2.31494 seconds
Calculating norm,  8.25029 seconds
Delete,  2.20463 seconds
Calculating norm,  4.12523 seconds
Delete,  2.60801 seconds
Calculating norm,  12.5749 seconds
Delete,  2.38863 seconds
Calculating norm,  6.50348 seconds
Delete,  2.68022 seconds
Calculating norm,  17.0695 seconds
Delete,  2.5792 seconds
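
To put a number on the growth, here is a minimal sketch (plain C++, not
part of the test program; slowdown, with_tp and original are names made
up for this note) that divides the last "Calculating norm" timing by
the first for each run:

```cpp
#include <vector>

// Hypothetical helper (not VTK API): ratio of the last timing to the first.
// Returns 0 for an empty list or a zero first entry.
double slowdown(const std::vector<double>& seconds)
{
    if (seconds.empty() || seconds.front() == 0.0)
        return 0.0;
    return seconds.back() / seconds.front();
}

// "Calculating norm" timings copied from the <with trivial producer> run.
const std::vector<double> with_tp = {
    3.01439, 3.65343, 3.72347, 6.06515, 10.8389,
    12.9148, 28.4741, 31.4917, 48.0374, 52.2953};

// "Calculating norm" timings copied from the <original> run.
const std::vector<double> original = {
    3.35481, 2.61965, 3.1105, 4.75912, 3.16867,
    8.25029, 4.12523, 12.5749, 6.50348, 17.0695};
```

That comes out to roughly 17x with the trivial producer and roughly 5x
for the original, where I would expect about 1x in both cases.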

Best,
Hank


On May 14, 2008, at 7:25 AM, Brad King wrote:

> Hank Childs wrote:
>> I ran the program I posted through "quantify", the profiling cousin of
>> purify.  It looks like 48% of the cycles are spent in
>> vtkGarbageCollector::Collect(vtkObjectBase*).
>
> Okay, the GC is spending its time exploring the tiny reference graph of
> each of the 25000 inputs.  The deferred collection will not help with
> this, and in my tests is actually slightly slower probably due to
> locality of reference problems.
>
> I do have a fix for your case though.  See the attached modified version
> of your example.  Here is what happens:
>
> When you write
>
>   norm->SetInput(outputs[i]);
>
> the outputs[i] object is a stand-alone vtkDataObject with no producer.
> All filter inputs are required to have a producer to make a valid
> pipeline.  When there is no producer VTK creates a "vtkTrivialProducer"
> automatically.  It comes with an executive, pipeline information object,
> and a couple other pieces that all go into a small strongly connected
> component of the reference graph.  Since you're doing that inside a loop
> a separate such block of objects gets created for all 25K objects.
>
> Instead you can manually create one vtkTrivialProducer and re-use it for
> every input.  That greatly reduces the amount of work you're asking the
> GC to do.  The attached code demonstrates this.  You can switch between
> the old and new code by commenting out the #define USE_TP line.
>
> I changed the "j" loop to only one iteration and produced these timings:
>
> #undef USE_TP
>
>   Allocating initial,  3.52777 seconds
>   Calculating norm,  23.073 seconds
>   Delete,  4.05667 seconds
>
> #define USE_TP
>
>   Allocating initial,  3.54613 seconds
>   Calculating norm,  20.476 seconds
>   Delete,  1.00634 seconds
>
>> So, Berk: if I'm right that garbage collection is dominating, then the
>> gameplan you're suggesting would be to reimplement Unregister() for each
>> of the derived types of vtkDataObject to do their business without
>> involving garbage collection?  Is that right?
>
> We will separately investigate this.  I don't think that solution is as
> simple as it seems.
>
> -Brad
>
> #include <vtkPointData.h>
> #include <vtkRectilinearGrid.h>
> #include <vtkPolyData.h>
> #include <vtkFloatArray.h>
> #include <vtkTimerLog.h>
> #include <vtkGeometryFilter.h>
> #include <vtkVectorNorm.h>
> #include <vtkTrivialProducer.h>
>
> #define USE_TP
>
> int main()
> {
>     int  i;
>     const int grid_dim = 3;
>
>     const int amt = 25000;
>     vtkTrivialProducer* tp = vtkTrivialProducer::New();
>     vtkRectilinearGrid **rgrids = new vtkRectilinearGrid*[amt];
>     vtkTimerLog::MarkStartEvent("Allocating initial");
>     vtkFloatArray *vec = vtkFloatArray::New();
>     vec->SetNumberOfComponents(3);
>     vec->SetNumberOfTuples(grid_dim*grid_dim*grid_dim);
>     for (i = 0 ; i < amt ; i++)
>     {
>          rgrids[i] = vtkRectilinearGrid::New();
>          vtkFloatArray *arr = vtkFloatArray::New();
>          arr->SetNumberOfTuples(grid_dim);
>          rgrids[i]->SetXCoordinates(arr);
>          rgrids[i]->SetYCoordinates(arr);
>          rgrids[i]->SetZCoordinates(arr);
>          rgrids[i]->SetDimensions(grid_dim, grid_dim, grid_dim);
>          rgrids[i]->GetPointData()->SetVectors(vec);
>          arr->Delete();
>     }
>     vec->Delete();
>     vtkTimerLog::MarkEndEvent("Allocating initial");
>
>     vtkDataSet **outputs = new vtkDataSet*[amt];
>     vtkDataSet **outputs2 = new vtkDataSet*[amt];
>     for (i = 0 ; i < amt ;i++)
>         outputs[i] = rgrids[i];
>
>     for (int j = 0 ; j < 10 ; j++)
>     {
>         vtkTimerLog::MarkStartEvent("Calculating norm");
>         vtkVectorNorm *norm = vtkVectorNorm::New();
>         norm->SetInputConnection(tp->GetOutputPort());
>         for (i = 0 ; i < amt ; i++)
>         {
>             if (i % 5000 == 0)
>                 cerr << "Doing " << i << endl;
> #ifdef USE_TP
>             tp->SetOutput(outputs[i]);
> #else
>             norm->SetInput(outputs[i]);
> #endif
>             norm->Update();
>             outputs2[i] = norm->GetOutput()->NewInstance();
>             outputs2[i]->ShallowCopy(norm->GetOutput());
>         }
> #ifdef USE_TP
>         tp->SetOutput(0);
> #endif
>         vtkTimerLog::MarkEndEvent("Calculating norm");
>         vtkTimerLog::MarkStartEvent("Delete");
>         for (i = 0 ;i < amt ; i++)
>         {
>             outputs[i]->Delete();
>             outputs[i] = outputs2[i];
>         }
>         vtkTimerLog::MarkEndEvent("Delete");
>     }
>
>     ofstream ofile("timings");
>     vtkTimerLog::DumpLogWithIndents( &ofile, 0);
>     tp->Delete();
> }
>



