[vtk-developers] Garbage collection slowness

David Cole david.cole at kitware.com
Tue May 13 09:58:38 EDT 2008


Would you happen to be measuring these results with a Debug build using the
Microsoft compiler? Or is it some other config?

If Debug/Microsoft, try the Release build... Allocations are tracked by the
Microsoft runtime libraries in Debug builds and they have to do some
tracking on every delete/free to ensure accurate leak reports.
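
One rough way to check this outside of VTK is to time a burst of small allocations and frees, then compare a Debug build with a Release build. The sketch below is only an illustration under that assumption; `AllocTiming` and `time_alloc_free` are names made up for this example and are not part of VTK or the Microsoft runtime.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Seconds spent in the allocation phase and in the free phase.
struct AllocTiming {
    double alloc_seconds;
    double free_seconds;
};

// Allocates `count` blocks of `bytes` bytes each (bytes must be >= 1),
// then frees them all, timing the two phases separately.
AllocTiming time_alloc_free(std::size_t count, std::size_t bytes) {
    using clock = std::chrono::steady_clock;
    std::vector<char*> blocks;
    blocks.reserve(count);

    const auto t0 = clock::now();
    for (std::size_t i = 0; i < count; ++i) {
        char* p = new char[bytes];
        p[0] = static_cast<char>(i);  // touch the block so it is really used
        blocks.push_back(p);
    }
    const auto t1 = clock::now();
    for (char* p : blocks) {
        delete[] p;
    }
    const auto t2 = clock::now();

    return { std::chrono::duration<double>(t1 - t0).count(),
             std::chrono::duration<double>(t2 - t1).count() };
}
```

Calling `time_alloc_free(20000, 64)` in both configurations and comparing `free_seconds` gives a first hint as to whether the build type matters on a given machine. It says nothing about VTK's own bookkeeping.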

If something else, feel free to ignore this email entirely.


:-)
David


On Tue, May 13, 2008 at 9:47 AM, Hank Childs <childs3 at llnl.gov> wrote:

>
> Hi Berk & Brad,
>
> Thanks very much for your responses; your interest is much appreciated.
>
> To answer one of Brad's questions, I am having trouble reproducing the
> problem with a minimal test program, which probably means something.  I will
> continue working on this.
>
> I got interested in the garbage collector because I was doing "poor man's
> profiling" (see footnote [1] below for more explanation) and I kept
> observing that garbage collection was dominating.
>
> To answer Berk's question, yes, I was unclear.  Here is the general setup:
> We are very memory conscious, so we dereference the inputs to a filter
> after the filter has executed (see footnote [2] below for more explanation).
> So:
>
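> for (int i = 0; i < 20000; i++) {
>   filt->SetInput(input[i]);
>   filt->Update();
>   // Copy the result into a fresh object so the next Update() leaves it alone.
>   output[i] = filt->GetOutput()->NewInstance();
>   output[i]->ShallowCopy(filt->GetOutput());
> }
>
> for (int i = 0; i < 20000; i++)
>   input[i]->Delete();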
>
> I found that the Delete() calls were taking a huge amount of time.  So I
> wrapped them with:
>
> vtkGarbageCollector::DeferredCollectionPush();
> for (int i = 0; i < 20000; i++)
>   input[i]->Delete();
> vtkGarbageCollector::DeferredCollectionPop();
>
> and found that that improved the situation (supporting Brad's claim that
> it should be faster), but it was still taking a long time.  I have also
> found variation in the run times, meaning that my OS may have a role here
> too, likely in terms of lots of small allocations and deallocations.
>
> I don't think that I was very clear in my last email, so I want to try
> again.  My only evidence against garbage collection is that my poor man's
> profiling shows that my program is doing garbage collection the majority of
> the time.
>
> Instead of raising the garbage collection issue, the tack I should have
> taken is to ask why it was taking 47s to delete the output, when it took
> only 20s to create it, filter execution included.  Of course, I would
> greatly help my cause here if I could get a simple reproducer that people
> could sink their teeth into.  So, again, I'll continue pursuing that...
>
> Best,
> Hank
>
> [1] poor man's profiling means connecting with a debugger regularly and
> seeing where the work was happening ... I have had problems getting
> profilers to run on big software projects and I find this to be somewhat
> effective.
>
> [2] As you all know, there is a tradeoff between reusing cached results
> (what VTK does by default) and keeping memory low (what I am doing
> manually).  Of course, VTK does a good job of minimizing the overhead for
> reusing cached results by often sharing references between input and output.
>  Regardless, there is often memory associated with the input that is not
> needed in the output.  For what I'm working on, harvesting that memory is
> worthwhile.  Also, I mitigate the loss of reusing cached results somewhat by
> keeping a cache for all I/O (... and I have found that I/O is often the
> bottleneck).
>
>
> On May 13, 2008, at 6:15 AM, Berk Geveci wrote:
>
> > Hi Hank,
> >
> > Where is the big loop over 20000 items happening? Around the Push/Pop
> > or inside them?
> >
> > -berk
> >
> > On Mon, May 12, 2008 at 6:35 PM, Hank Childs <childs3 at llnl.gov> wrote:
> >
> > >
> > >  Hello VTK Developers!
> > >
> > >  I am running in serial and am setting up about 20000 pipelines on my
> > > serial process for about 20000 chunks of data.
> > >
> > >  The runtime has gotten disproportionately large with the large number
> > > of chunks and I believe that garbage collection is at least partly to
> > > blame.
> > >
> > >  For example, if I:
> > >  1) call vtkGarbageCollector::DeferredCollectionPush()
> > >  2) execute three filters (filters that find external faces and remove
> > > ghost data) and,
> > >  3) call vtkGarbageCollector::DeferredCollectionPop()
> > >
> > >  then: the three filters take about 20s total and the
> > > DeferredCollectionPop takes about 47s.
> > >
> > >  One conclusion that I drew from the fast execution of the three
> > > filters is that iterating through the data is relatively quick.
> > > Restated, I ruled out thrashing through memory as the reason the
> > > garbage collector is taking 47s.
> > >
> > >  Also, I should disclose that I am managing the execution manually.
> > > The best way to describe it would be that I have one instance of
> > > filter A, one instance of filter B, and one instance of filter C and
> > > that I route all 20K data sets through filter A, to make 20K new data
> > > sets, then route those 20K new data sets through B, and so on.  Also,
> > > I know that the alternative is to call "Update()" 20K times, one for
> > > each chunk, but I'd prefer not to go down that route, for reasons I
> > > can explain if necessary.
> > >
> > >  So: can anyone point me to some words of wisdom about a way to manage
> > > my data objects so that garbage collection is faster?
> > >
> > >  Best regards,
> > >  Hank
> > >  _______________________________________________
> > >  vtk-developers mailing list
> > >  vtk-developers at vtk.org
> > >  http://www.vtk.org/mailman/listinfo/vtk-developers
> > >
> > >
>
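
For readers following the quoted exchange, the effect of wrapping the Delete() loop in DeferredCollectionPush()/DeferredCollectionPop() can be sketched with a toy model. This is not VTK's implementation: `ToyCollector`, `DeferredScope`, `OnRelease` and `count_passes` are hypothetical names, and the model assumes that each release outside a deferred scope triggers its own collection pass while releases inside a scope are batched into one pass at the outermost Pop.

```cpp
#include <cstddef>

// Hypothetical stand-in for a collector with deferred collection.
// It only counts collection passes; it does no real memory management.
class ToyCollector {
public:
    void DeferredCollectionPush() { ++depth_; }

    void DeferredCollectionPop() {
        // Leaving the outermost deferred scope flushes everything queued.
        if (--depth_ == 0 && pending_ > 0) {
            ++passes_;
            pending_ = 0;
        }
    }

    // Called once per object release.
    void OnRelease() {
        if (depth_ > 0) {
            ++pending_;  // deferred: just remember that work is queued
        } else {
            ++passes_;   // not deferred: collect immediately
        }
    }

    std::size_t passes() const { return passes_; }

private:
    int depth_ = 0;
    std::size_t pending_ = 0;
    std::size_t passes_ = 0;
};

// RAII guard so that Push and Pop always stay balanced.
class DeferredScope {
public:
    explicit DeferredScope(ToyCollector& gc) : gc_(gc) {
        gc_.DeferredCollectionPush();
    }
    ~DeferredScope() { gc_.DeferredCollectionPop(); }
    DeferredScope(const DeferredScope&) = delete;
    DeferredScope& operator=(const DeferredScope&) = delete;

private:
    ToyCollector& gc_;
};

// Releases `count` objects and returns how many collection passes ran.
std::size_t count_passes(std::size_t count, bool deferred) {
    ToyCollector gc;
    if (deferred) {
        DeferredScope scope(gc);
        for (std::size_t i = 0; i < count; ++i) gc.OnRelease();
    } else {
        for (std::size_t i = 0; i < count; ++i) gc.OnRelease();
    }
    return gc.passes();
}
```

In this model `count_passes(20000, false)` returns 20000 and `count_passes(20000, true)` returns 1. The single batched pass still has to visit everything that was queued, so the model shows why deferral can help but does not explain the remaining 47s.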