[vtk-developers] Garbage collection slowness

Hank Childs childs3 at llnl.gov
Tue May 13 10:14:29 EDT 2008

Hi All,

I'm clearly not good at giving bug reports.  This was on Linux.

The good news is that I think I have a reproducer.  I instantiate 25K
rectilinear grids (each 3x3x3), send them through a vtkVectorNorm
filter, then take those outputs and send them through another
vtkVectorNorm, and so on, ten times in total.
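For readers without the attachment, the loop described above might look
roughly like this (a hypothetical sketch against the VTK 5-era API; the
names and the grid-filling step are illustrative, and the real code is
in the attached hrc_slow.C):

```cpp
#include <vector>
#include "vtkRectilinearGrid.h"
#include "vtkVectorNorm.h"

// Sketch: build ~25K small grids, then push each generation through a
// vtkVectorNorm stage ten times, deleting the previous generation
// between stages (the step whose time climbs from ~1s to ~6s).
void RunReproducer()
{
  const int numGrids = 25000;
  std::vector<vtkDataSet*> current(numGrids);

  // "Allocating initial": 25K tiny 3x3x3 rectilinear grids.
  for (int i = 0; i < numGrids; ++i)
  {
    vtkRectilinearGrid* grid = vtkRectilinearGrid::New();
    grid->SetDimensions(3, 3, 3);
    // ... fill in coordinate arrays and a 3-component vector array ...
    current[i] = grid;
  }

  for (int pass = 0; pass < 10; ++pass)
  {
    std::vector<vtkDataSet*> next(numGrids);

    // "Calculating norm": run every data set through vtkVectorNorm and
    // keep an independent shallow copy of each output.
    vtkVectorNorm* norm = vtkVectorNorm::New();
    for (int i = 0; i < numGrids; ++i)
    {
      norm->SetInput(current[i]);
      norm->Update();
      next[i] = norm->GetOutput()->NewInstance();
      next[i]->ShallowCopy(norm->GetOutput());
    }
    norm->Delete();

    // "Delete": free the previous generation of data sets.
    for (int i = 0; i < numGrids; ++i)
    {
      current[i]->Delete();
    }
    current = next;
  }
}
```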

When I do this, the time to delete grows from 1s to 6s and then levels
off at about 6s.  The time to calculate the norm also appears to be
increasing dramatically, which I am observing in my real program as
well.  See the last few "Calculating norm" entries ... they have gone
up from about 3-5s to 30s.

Here are the results, using vtkTimerLog:
Allocating initial,  0.920138 seconds
Calculating norm,  5.05535 seconds
Delete,  1.0108 seconds
Calculating norm,  3.96532 seconds
Delete,  1.96076 seconds
Calculating norm,  4.01928 seconds
Delete,  3.83726 seconds
Calculating norm,  7.39869 seconds
Delete,  3.68343 seconds
Calculating norm,  6.48079 seconds
Delete,  5.68573 seconds
Calculating norm,  13.278 seconds
Delete,  5.17787 seconds
Calculating norm,  10.486 seconds
Delete,  6.50975 seconds
Calculating norm,  20.208 seconds
Delete,  5.77193 seconds
Calculating norm,  15.6566 seconds
Delete,  7.14988 seconds
Calculating norm,  29.7289 seconds
Delete,  6.56824 seconds
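For intuition, here is a toy model of why per-object deletion could
slow down as the number of live objects grows.  This is an assumed
mechanism for illustration only; it makes no VTK calls and is not how
vtkGarbageCollector is actually implemented.  If every Delete() did
bookkeeping proportional to the number of objects still alive, deleting
n objects one at a time would cost n*(n+1)/2 work units, while a single
deferred sweep would cost n:

```cpp
#include <cstddef>

// Toy model, not VTK: work units if every delete scans the remaining
// live set (an assumed cost model for illustration).
std::size_t PerDeleteScanCost(std::size_t n)
{
  std::size_t ops = 0;
  for (std::size_t live = n; live > 0; --live)
  {
    ops += live; // each delete touches all objects still alive
  }
  return ops; // n*(n+1)/2
}

// Work units if deletes are deferred and one sweep visits each
// object exactly once.
std::size_t BatchedSweepCost(std::size_t n)
{
  return n; // linear in the number of objects
}
```

For n = 25000 the per-delete model does about 312 million work units
versus 25000 for the batched sweep, which is the kind of gap that would
make batching deletes pay off.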

And the program is attached.  Any words of wisdom from the VTK gurus?


On May 13, 2008, at 6:58 AM, David Cole wrote:

> Would you happen to be measuring these results with a Debug build  
> using the Microsoft compiler? Or is it some other config?
> If Debug/Microsoft, try the Release build... Allocations are  
> tracked by the Microsoft runtime libraries in Debug builds and they  
> have to do some tracking on every delete/free to ensure accurate  
> leak reports.
> If something else, feel free to ignore this email entirely.
> :-)
> David
> On Tue, May 13, 2008 at 9:47 AM, Hank Childs <childs3 at llnl.gov> wrote:
> Hi Berk & Brad,
> Thanks very much for your responses; your interest is much  
> appreciated.
> To answer one of Brad's questions, I am having trouble reproducing
> the problem with a minimal test program, which probably means
> something.  I will continue working on this.
> I got interested in the garbage collector because I was doing "poor  
> man's profiling" (see footnote [1] below for more explanation) and  
> I kept observing that garbage collection was dominating.
> To answer Berk's question, yes, I was unclear.  Here is the general  
> setup:  We are very memory conscious, so we dereference the inputs  
> to a filter after the filter has executed (see footnote [2] below  
> for more explanation).  So:
> for (int i = 0; i < 20000; ++i)
> {
>   filt->SetInput(input[i]);
>   filt->Update();
>   output[i] = filt->GetOutput()->NewInstance();
>   output[i]->ShallowCopy(filt->GetOutput());
> }
> for (int i = 0; i < 20000; ++i)
>   input[i]->Delete();
> I found that the Delete() calls were taking a huge amount of time.   
> So I wrapped them with:
> vtkGarbageCollector::DeferredCollectionPush();
> for (int i = 0; i < 20000; ++i)
>   input[i]->Delete();
> vtkGarbageCollector::DeferredCollectionPop();
> and found that this improved the situation (supporting Brad's claim
> that it should be faster), but it was still taking a long time.  I
> have also seen variation in the run times, which suggests my OS may
> have a role here too, likely because of the many small allocations
> and deallocations.
> I don't think I was very clear in my last email, so I want to try
> again.  My only evidence against garbage collection is that my poor
> man's profiling shows my program spending the majority of its time
> in garbage collection.
> Instead of raising the garbage collection issue, the tack I should
> have taken is to ask why it takes 47s to delete the output when it
> took only 20s to create it, filter execution included.  Of course, I
> would greatly help my cause if I could produce a simple reproducer
> that people could sink their teeth into.  So, again, I'll continue
> pursuing that...
> Best,
> Hank
> [1] Poor man's profiling means attaching a debugger periodically
> and seeing where the work is happening ... I have had trouble
> getting profilers to run on big software projects, and I find this
> approach reasonably effective.
> [2] As you all know, there is a tradeoff between reusing cached  
> results (what VTK does by default) and keeping memory low (what I  
> am doing manually).  Of course, VTK does a good job of minimizing  
> the overhead for reusing cached results by often sharing references  
> between input and output.  Regardless, there is often memory  
> associated with the input that is not needed in the output.  For  
> what I'm working on, harvesting that memory is worthwhile.  Also, I  
> mitigate the loss of reusing cached results somewhat by keeping a  
> cache for all I/O (... and I have found that I/O is often the  
> bottleneck).
> On May 13, 2008, at 6:15 AM, Berk Geveci wrote:
> Hi Hank,
> Where is the big loop over 20000 items happening? Around the Push/Pop
> or inside them?
> -berk
> On Mon, May 12, 2008 at 6:35 PM, Hank Childs <childs3 at llnl.gov> wrote:
>  Hello VTK Developers!
>  I am running in serial and am setting up about 20000 pipelines in
> a single process, one for each of about 20000 chunks of data.
>  The runtime has gotten disproportionately large with the large  
> number of
> chunks and I believe that garbage collection is at least partly to  
> blame.
>  For example, if I:
>  1) call vtkGarbageCollector::DeferredCollectionPush()
>  2) execute three filters (filters that find external faces and  
> remove ghost
> data) and,
>  3) call vtkGarbageCollector::DeferredCollectionPop()
>  then: the three filters take about 20s total and the  
> DeferredCollectionPop
> takes about 47s.
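> In code, the measurement above looks roughly like this (a sketch;
> the RunFilterPass* helpers are hypothetical stand-ins for my actual
> external-faces and ghost-data filter passes, not real VTK calls):
>
> ```cpp
> vtkGarbageCollector::DeferredCollectionPush();
> RunFilterPassA();  // the three filter passes over all 20K chunks
> RunFilterPassB();  // take about 20s total while collection is
> RunFilterPassC();  // deferred ...
> vtkGarbageCollector::DeferredCollectionPop();  // ... and this
>                                                // takes about 47s
> ```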
>  One conclusion I drew from the fast execution of the three filters
> is that iterating through the data is relatively quick.  Restated, I
> ruled out memory thrashing as the reason the garbage collector is
> taking 47s.
>  Also, I should disclose that I am managing the execution  
> manually.  The
> best way to describe it would be that I have one instance of filter  
> A, one
> instance of filter B, and one instance of filter C and that I route  
> all 20K
> data sets through filter A, to make 20K new data sets, then route  
> those 20K
> new data sets through B, and so on.  Also, I know that the  
> alternative is to
> call "Update()" 20K times, one for each chunk, but I'd prefer not  
> to go down
> that route, for reasons I can explain if necessary.
>  So: can anyone point me to some words of wisdom about a way to  
> manage my
> data objects so that garbage collection is faster?
>  Best regards,
>  Hank
> _______________________________________________
>  vtk-developers mailing list
>  vtk-developers at vtk.org
>  http://www.vtk.org/mailman/listinfo/vtk-developers

-------------- next part --------------
A non-text attachment was scrubbed...
Name: hrc_slow.C
Type: application/octet-stream
Size: 2070 bytes
Desc: not available
URL: <http://public.kitware.com/pipermail/vtk-developers/attachments/20080513/92babd93/attachment-0001.obj>
