[vtk-developers] Follow up on "garbage collection" issue

Hank Childs childs3 at llnl.gov
Mon Jun 2 14:20:11 EDT 2008


Hi Everyone,

I sent some emails about three weeks ago stating that I was seeing a
significant slowdown in my VTK-based app when I had a large number
of grids, and I speculated (incorrectly) that it was due to VTK's
garbage collection.

Recall that my test program involved running 15000
vtkRectilinearGrids through 10 filters.  Although each filter was
doing the same amount of work, the time to execute a filter over the
15000 grids grew dramatically from one filter to the next (from 3s
for the first to 52s for the tenth).  I pointed the finger at
garbage collection because it is intimately connected to the real
culprit: heap management.

I ended up instrumenting the malloc/free calls in my program to
observe the allocation/deallocation patterns and then I created a
non-VTK program that did nothing but mallocs & frees, with the same
pattern.  This program ended up exhibiting the same results: fast at
the beginning and very slow at the end.  So this basically absolves
VTK.  Further, it really only came up on Linux, although I should  
point out that Brad King of Kitware did not observe this on his Linux  
box, so it is not all Linux boxes.
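
For anyone who wants to collect similar numbers, one simple way to
observe allocation patterns is to route allocations through counting
wrappers. The sketch below is only an illustration and is not how the
instrumentation in this post was actually done; the names
counted_malloc, counted_free, AllocStats and g_alloc_stats are
hypothetical.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical counters for illustration only. A real tool might also
// record call sites or a histogram of sizes. Not thread-safe.
struct AllocStats {
    unsigned long mallocs = 0;
    unsigned long frees = 0;
    std::size_t bytes_requested = 0;
};

AllocStats g_alloc_stats;

// Wrapper around malloc that counts successful allocations.
void* counted_malloc(std::size_t size) {
    void* p = std::malloc(size);
    if (p != nullptr) {
        ++g_alloc_stats.mallocs;
        g_alloc_stats.bytes_requested += size;
    }
    return p;
}

// Wrapper around free that counts non-null deallocations.
void counted_free(void* p) {
    if (p != nullptr) {
        ++g_alloc_stats.frees;
    }
    std::free(p);
}

// Print the totals gathered so far, e.g. after one filter update.
void report_alloc_stats(const char* label) {
    std::printf("%s: %lu mallocs, %lu frees, %zu bytes requested\n",
                label, g_alloc_stats.mallocs, g_alloc_stats.frees,
                g_alloc_stats.bytes_requested);
}
```

Calling report_alloc_stats before and after a unit of work gives the
number of malloc/free pairs that the work performed.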

The issue appears to be heap fragmentation on Linux (or, more
correctly, on some flavors of Linux).  Once the heap got too
fragmented, performance dropped sharply, especially for the first
few allocations after deallocating a lot of memory.

I downloaded tcmalloc, Google's memory manager, and found that
results got much better.  For my VTK-based app, I found that non-I/O
performance of 55K grids went from 420s to 46s.  For my toy program,  
I found that the time spent for my malloc/free pairs became constant  
per iteration.

If anyone out there wants to assess whether or not they are getting
bitten by this issue, they can:
1) time their app with and without tcmalloc and see if it improves
or
2) run a toy program I am including with this email.  If the toy  
program slows down with the later iterations, then your memory  
manager is not good with heap fragmentation.  Hence, your VTK program  
also might be affected.  But I think you need to have *lots* of grids/ 
actors/vtkObjects for this to really come up for your VTK app.  You  
can run the program by calling "collect_times" which will compile  
"toy.C" in various configurations and produce a report.
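
The attachments did not survive in the archive, so here is a rough
sketch of the kind of test described in option 2. It is not the
attached toy.C: the function name time_passes, the block sizes, and
the allocate-then-free ordering are all assumptions chosen to imitate
a pipeline in which each "filter" allocates its output for a grid
and then releases the previous filter's output.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <vector>

// One "grid" is just a set of heap blocks (assumption for illustration).
struct ToyGrid {
    std::vector<void*> blocks;
};

// Allocate blocks of varying sizes so freed holes rarely fit new requests.
static void allocate_grid(ToyGrid& g, std::size_t blocks_per_grid,
                          unsigned& seed) {
    for (std::size_t i = 0; i < blocks_per_grid; ++i) {
        seed = seed * 1103515245u + 12345u;  // small LCG, reproducible sizes
        std::size_t size = 16 + (seed >> 16) % 4096;
        void* p = std::malloc(size);
        if (p == nullptr) std::abort();
        g.blocks.push_back(p);
    }
}

static void free_grid(ToyGrid& g) {
    for (void* p : g.blocks) std::free(p);
    g.blocks.clear();
}

// Runs num_passes "filters" over num_grids grids and returns the
// wall-clock seconds spent in each pass. Every pass does the same
// amount of work, so growing times point at the heap manager.
std::vector<double> time_passes(std::size_t num_passes,
                                std::size_t num_grids,
                                std::size_t blocks_per_grid) {
    assert(num_grids > 0 && blocks_per_grid > 0);
    std::vector<ToyGrid> current(num_grids), next(num_grids);
    unsigned seed = 1;
    for (ToyGrid& g : current) allocate_grid(g, blocks_per_grid, seed);

    std::vector<double> seconds;
    for (std::size_t pass = 0; pass < num_passes; ++pass) {
        auto start = std::chrono::steady_clock::now();
        for (std::size_t i = 0; i < num_grids; ++i) {
            allocate_grid(next[i], blocks_per_grid, seed);  // new output
            free_grid(current[i]);                          // old input
        }
        current.swap(next);
        auto stop = std::chrono::steady_clock::now();
        seconds.push_back(
            std::chrono::duration<double>(stop - start).count());
    }
    for (ToyGrid& g : current) free_grid(g);
    return seconds;
}
```

Calling time_passes(10, 15000, 30) from main and printing the
returned values gives one time per pass; the block count of 30 is
arbitrary. Building the program once against the default allocator
and once against tcmalloc (for example by linking with -ltcmalloc,
if it is installed) gives the comparison from option 1.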

Finally, I'll mention that I found that there are 319 malloc/free  
pairs for a filter update.  When amplified by a large number of  
filters and iterations, the number of mallocs & frees really added up  
(450000 per filter, 4.5M for 10 filters).  I'm guessing this means
that a VTK app that has lots of grids is in the top 0.0001% in terms of
exercising the heap manager, increasing the likelihood of falling off
the performance cliff.  I'm sure there are ways to reduce the number
of malloc/free pairs per filter update, but I'll leave the
speculating to the VTK gurus.

Thanks to everyone for their advice on this issue!  I personally am  
going to consider this "case closed" and use tcmalloc.

Best regards,
Hank

-------------- next part --------------
A non-text attachment was scrubbed...
Name: collect_times
Type: application/octet-stream
Size: 972 bytes
Desc: not available
URL: <http://public.kitware.com/pipermail/vtk-developers/attachments/20080602/e3da0eb4/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: toy.C
Type: application/octet-stream
Size: 4022 bytes
Desc: not available
URL: <http://public.kitware.com/pipermail/vtk-developers/attachments/20080602/e3da0eb4/attachment-0003.obj>
-------------- next part --------------



