[vtk-developers] Garbage collection slowness
Hank Childs
childs3 at llnl.gov
Tue May 13 09:47:19 EDT 2008
Hi Berk & Brad,
Thanks very much for your responses; your interest is much appreciated.
To answer one of Brad's questions: I am having trouble reproducing
the problem with a minimal test program, which probably means
something. I will continue working on this.
I got interested in the garbage collector because I was doing "poor
man's profiling" (see footnote [1] below for more explanation) and I
kept observing that garbage collection was dominating.
To answer Berk's question: yes, I was unclear. Here is the general
setup: we are very memory-conscious, so we release our references to
the inputs of a filter after the filter has executed (see footnote
[2] below for more explanation). So:
for (int i = 0; i < 20000; i++)
  {
  filt->SetInput(input[i]);
  filt->Update();
  output[i] = filt->GetOutput()->NewInstance();
  output[i]->ShallowCopy(filt->GetOutput());
  }
for (int i = 0; i < 20000; i++)
  {
  input[i]->Delete();
  }
I found that the Delete() calls were taking a huge amount of time.
So I wrapped them with:
vtkGarbageCollector::DeferredCollectionPush();
for (int i = 0; i < 20000; i++)
  {
  input[i]->Delete();
  }
vtkGarbageCollector::DeferredCollectionPop();
and found that it improved the situation (supporting Brad's claim
that it should be faster), but it still took a long time. I have
also seen variation in the run times, which suggests that my OS may
play a role here too, likely through lots of small allocations and
deallocations.
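To make the deferred idea concrete, here is a minimal, self-contained
sketch of the pattern using toy classes. (ToyObject and ToyCollector are
names I made up for illustration; this is not VTK's actual
implementation, which additionally traverses reference graphs to detect
cycles.)

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a reference-counted object; not VTK's vtkObjectBase.
struct ToyObject
{
  int RefCount = 1;
  static int Destroyed; // counts actual deallocations
  ~ToyObject() { ++Destroyed; }
};
int ToyObject::Destroyed = 0;

// Toy collector: while a deferred-collection push is active, objects
// whose count reaches zero are parked in a pending list instead of
// being collected one at a time.
class ToyCollector
{
public:
  void DeferredCollectionPush() { ++this->Deferred; }
  void DeferredCollectionPop()
  {
    if (--this->Deferred == 0)
    {
      // One batched sweep instead of one collection per Delete().
      for (ToyObject* o : this->Pending)
      {
        delete o;
      }
      this->Pending.clear();
    }
  }
  void UnRegister(ToyObject* o)
  {
    if (--o->RefCount > 0)
    {
      return;
    }
    if (this->Deferred > 0)
    {
      this->Pending.push_back(o); // defer the sweep
    }
    else
    {
      delete o; // immediate collection
    }
  }

private:
  int Deferred = 0;
  std::vector<ToyObject*> Pending;
};
```

The point of the push/pop pair is only to move the per-object collection
work into a single batch at the pop; it does not reduce the total number
of deallocations.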
I don't think that I was very clear in my last email, so I want to
try again. My only evidence against garbage collection is that my
poor man's profiling shows that my program is doing garbage
collection the majority of the time.
Instead of raising the garbage collection issue, the tack I should
have taken is to ask why it takes 47s to delete the output when it
took only 20s to create it and execute the filters. Of course, it
would greatly help my cause if I could produce a simple reproducer
that people could sink their teeth into. So, again, I'll continue
pursuing that...
Best,
Hank
[1] Poor man's profiling means attaching a debugger at regular
intervals and seeing where the work is happening. I have had trouble
getting profilers to run on big software projects, and I find this
approach to be somewhat effective.
[2] As you all know, there is a tradeoff between reusing cached
results (what VTK does by default) and keeping memory low (what I am
doing manually). Of course, VTK does a good job of minimizing the
overhead for reusing cached results by often sharing references
between input and output. Regardless, there is often memory
associated with the input that is not needed in the output. For what
I'm working on, harvesting that memory is worthwhile. Also, I
mitigate the loss of reusing cached results somewhat by keeping a
cache for all I/O (... and I have found that I/O is often the
bottleneck).
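As a rough illustration of the I/O cache mentioned in [2], here is a
hypothetical sketch (IOCache and its members are names I invented for
this example; they are not part of VTK or the code under discussion).
The idea is that if repeated reads are served from memory, giving up
VTK's cached pipeline results costs less, since I/O is the bottleneck:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical read cache keyed by file name: each file is read at
// most once, and later requests are served from memory.
class IOCache
{
public:
  const std::string& Read(const std::string& fileName)
  {
    auto it = this->Cache.find(fileName);
    if (it != this->Cache.end())
    {
      ++this->Hits; // served from memory, no I/O
      return it->second;
    }
    // Stand-in for an actual (expensive) disk read.
    std::string data = "contents-of-" + fileName;
    return this->Cache.emplace(fileName, data).first->second;
  }

  int Hits = 0; // how many reads the cache absorbed

private:
  std::map<std::string, std::string> Cache;
};
```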
On May 13, 2008, at 6:15 AM, Berk Geveci wrote:
> Hi Hank,
>
> Where is the big loop over 20000 items happening? Around the Push/Pop
> or inside them?
>
> -berk
>
> On Mon, May 12, 2008 at 6:35 PM, Hank Childs <childs3 at llnl.gov> wrote:
>>
>> Hello VTK Developers!
>>
>> I am running in serial and am setting up about 20000 pipelines on my
>> serial process for about 20000 chunks of data.
>>
>> The runtime has gotten disproportionately large with the large
>> number of chunks, and I believe that garbage collection is at least
>> partly to blame.
>>
>> For example, if I:
>> 1) call vtkGarbageCollector::DeferredCollectionPush(),
>> 2) execute three filters (filters that find external faces and
>> remove ghost data), and
>> 3) call vtkGarbageCollector::DeferredCollectionPop(),
>>
>> then: the three filters take about 20s total and the
>> DeferredCollectionPop takes about 47s.
>>
>> One conclusion that I drew from the fast execution of the three
>> filters is that iterating through the data is relatively quick.
>> Restated, I ruled out thrashing through memory as the reason the
>> garbage collector is taking 47s.
>>
>> Also, I should disclose that I am managing the execution manually.
>> The best way to describe it would be that I have one instance of
>> filter A, one instance of filter B, and one instance of filter C,
>> and that I route all 20K data sets through filter A to make 20K new
>> data sets, then route those 20K new data sets through B, and so on.
>> Also, I know that the alternative is to call "Update()" 20K times,
>> once for each chunk, but I'd prefer not to go down that route, for
>> reasons I can explain if necessary.
>>
>> So: can anyone point me to some words of wisdom about a way to
>> manage my data objects so that garbage collection is faster?
>>
>> Best regards,
>> Hank
>> _______________________________________________
>> vtk-developers mailing list
>> vtk-developers at vtk.org
>> http://www.vtk.org/mailman/listinfo/vtk-developers
>>