[vtk-developers] Follow up on "garbage collection" issue
Hank Childs
childs3 at llnl.gov
Mon Jun 2 14:20:11 EDT 2008
Hi Everyone,
I sent some emails about three weeks ago stating that I was seeing a
significant slowdown in my VTK-based app when I had a large number
of grids, and I speculated (incorrectly) that it was due to VTK's
garbage collection.
Recall that my test program ran 15000 vtkRectilinearGrids through
10 filters. Although each filter was doing the same amount of work,
the time to execute a filter over the 15000 grids grew dramatically
(from 3s for the first filter to 52s for the tenth). I pointed the
finger at garbage collection because it is intimately connected to
the real culprit: heap management.
I ended up instrumenting the malloc/free calls in my program to
observe the allocation/deallocation patterns and then I created a non-
VTK program that did nothing but mallocs & frees, with the same
pattern. This program ended up exhibiting the same results: fast at
the beginning and very slow at the end. So this basically absolves
VTK. Further, it really only came up on Linux, although I should
point out that Brad King of Kitware did not observe this on his Linux
box, so it is not all Linux boxes.
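
For anyone who wants to record their own allocation pattern, here is a
minimal sketch of one way to count calls. It is not the instrumentation
used above: the names AllocStats, alloc_stats, counted_malloc and
counted_free are hypothetical, and it assumes you can route your
program's allocations through these wrappers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical counters; the names are illustrative only.
struct AllocStats {
    std::size_t mallocs = 0;          // number of counted_malloc calls
    std::size_t frees = 0;            // number of non-null counted_free calls
    std::size_t bytes_requested = 0;  // sum of all requested sizes
};

// Single shared set of counters (not thread-safe; a sketch only).
inline AllocStats& alloc_stats() {
    static AllocStats stats;
    return stats;
}

// Record the request, then forward it to the real allocator.
inline void* counted_malloc(std::size_t size) {
    AllocStats& s = alloc_stats();
    ++s.mallocs;
    s.bytes_requested += size;
    return std::malloc(size);
}

// Record the release, then forward it to the real allocator.
inline void counted_free(void* ptr) {
    if (ptr != nullptr) {
        AllocStats& s = alloc_stats();
        // More frees than mallocs would mean an allocation bypassed the wrapper.
        assert(s.frees < s.mallocs);
        ++s.frees;
    }
    std::free(ptr);
}
```

Reading alloc_stats() before and after a single filter update gives the
number of malloc/free pairs for that update.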
The issue appears to be heap fragmentation on Linux (or, more
correctly, on some flavors of Linux). Once the heap got too
fragmented, performance really dropped, especially for the first
few allocations after deallocating a lot of memory.
I downloaded tcmalloc, Google's memory manager, and found that
results got much better. For my VTK-based app, the non-I/O time
for 55K grids went from 420s to 46s. For my toy program, I found
that the time spent for my malloc/free pairs became constant
per iteration.
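
A per-iteration timing test of this kind can be sketched as follows. This
is not the attached toy.C: the function name time_malloc_free_rounds and
the allocation pattern (block sizes, keeping every 8th block alive to
leave holes in the heap) are assumptions made up for illustration, not
the pattern recorded from the VTK app.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Runs `iterations` rounds of malloc/free and returns the wall-clock
// seconds spent in each round. Hypothetical pattern, for illustration.
std::vector<double> time_malloc_free_rounds(int iterations, int blocks_per_round) {
    assert(iterations > 0 && blocks_per_round > 0);
    std::vector<double> seconds;
    std::vector<void*> survivors;  // blocks kept alive across rounds

    for (int it = 0; it < iterations; ++it) {
        auto start = std::chrono::steady_clock::now();

        // Allocate many small blocks of varying size.
        std::vector<void*> live;
        live.reserve(static_cast<std::size_t>(blocks_per_round));
        for (int i = 0; i < blocks_per_round; ++i) {
            std::size_t size = 16 + static_cast<std::size_t>((i * 37) % 1024);
            void* p = std::malloc(size);
            if (p != nullptr) {
                live.push_back(p);
            }
        }

        // Free most blocks, but keep every 8th one so the freed space
        // is broken up by long-lived allocations.
        for (std::size_t i = 0; i < live.size(); ++i) {
            if (i % 8 == 0) {
                survivors.push_back(live[i]);
            } else {
                std::free(live[i]);
            }
        }

        auto stop = std::chrono::steady_clock::now();
        seconds.push_back(std::chrono::duration<double>(stop - start).count());
    }

    // Release the long-lived blocks before returning.
    for (void* p : survivors) {
        std::free(p);
    }
    return seconds;
}
```

Call it with a large number of rounds and print the returned times. If
the later entries are much larger than the early ones, the memory manager
is struggling; if they stay roughly flat, it is coping. To compare
against tcmalloc, build the same program a second time linked with
-ltcmalloc (assuming the library is installed on your system).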
If anyone out there wants to assess whether or not they are getting
bitten by this issue, they can:
1) time their app with and without tcmalloc and see if it improves
or
2) run a toy program I am including with this email. If the toy
program slows down with the later iterations, then your memory
manager is not good with heap fragmentation. Hence, your VTK program
also might be affected. But I think you need to have *lots* of grids/
actors/vtkObjects for this to really come up for your VTK app. You
can run the program by calling "collect_times" which will compile
"toy.C" in various configurations and produce a report.
Finally, I'll mention that I found that there are 319 malloc/free
pairs for a filter update. When amplified by a large number of
filters and iterations, the number of mallocs & frees really added up
(450000 per filter, 4.5M for 10 filters). I'm guessing this means
that a VTK app that has lots of grids is in the top 0.0001% in terms
of exercising the heap manager, increasing the likelihood of falling
off the performance cliff. I'm sure there are ways to reduce the number
of malloc/free pairs per filter update, but I'll leave the
speculating up to VTK gurus.
Thanks to everyone for their advice on this issue! I personally am
going to consider this "case closed" and use tcmalloc.
Best regards,
Hank
-------------- next part --------------
A non-text attachment was scrubbed...
Name: collect_times
Type: application/octet-stream
Size: 972 bytes
Desc: not available
URL: <http://public.kitware.com/pipermail/vtk-developers/attachments/20080602/e3da0eb4/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: toy.C
Type: application/octet-stream
Size: 4022 bytes
Desc: not available
URL: <http://public.kitware.com/pipermail/vtk-developers/attachments/20080602/e3da0eb4/attachment-0003.obj>