[Paraview] 3.98 MPI_Finalize out of order in pvbatch

Kyle Lutz kyle.lutz at kitware.com
Mon Dec 3 10:36:57 EST 2012


Hi Burlen,

On Thu, Nov 29, 2012 at 1:27 PM, Burlen Loring <bloring at lbl.gov> wrote:
> it looks like pvserver is also impacted, hanging after the gui disconnects.
>
>
> On 11/28/2012 12:53 PM, Burlen Loring wrote:
>>
>> Hi All,
>>
>> some parallel tests have been failing for some time on Nautilus.
>> http://open.cdash.org/viewTest.php?onlyfailed&buildid=2684614
>>
>> There are MPI calls made after finalize which cause deadlock issues on SGI
>> MPT. It affects pvbatch for sure. The following snippet shows the bug, and
>> the bug report is here: http://paraview.org/Bug/view.php?id=13690
>>
>>
>> //----------------------------------------------------------------------------
>> bool vtkProcessModule::Finalize()
>> {
>>
>>   ...
>>
>>   vtkProcessModule::GlobalController->Finalize(1); // <-- MPI_Finalize called here

This call shouldn't invoke MPI_Finalize(): the finalizedExternally
argument is 1, and vtkMPIController::Finalize() only finalizes when
that argument is 0:

    if (finalizedExternally == 0)
      {
      MPI_Finalize();
      }

So my guess is that it's being invoked elsewhere.

>>
>>   ...
>>
>> #ifdef PARAVIEW_USE_MPI
>>   if (vtkProcessModule::FinalizeMPI)
>>     {
>>     MPI_Barrier(MPI_COMM_WORLD); // <-- barrier after MPI_Finalize
>>     MPI_Finalize(); // <-- second MPI_Finalize
>>     }
>> #endif

I've made a patch which should prevent this section of code from ever
being executed twice by setting the FinalizeMPI flag to false after
calling MPI_Finalize(). Can you take a look here:
http://review.source.kitware.com/#/t/1808/ and let me know if that
helps the issue?
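
To illustrate the idea of clearing the flag after finalizing, here is a minimal sketch. It is not the actual patch: StubMpiFinalize(), g_finalizeMPI and FinalizeOnce() are hypothetical stand-ins (for MPI_Finalize(), the FinalizeMPI flag and the guarded block) so that the example runs without an MPI installation. In real MPI code, MPI_Finalized() could additionally be queried before finalizing.

```cpp
// Hypothetical stand-in for MPI_Finalize(); it only counts calls so the
// sketch can run without an MPI installation.
static int g_finalizeCalls = 0;
static void StubMpiFinalize() { ++g_finalizeCalls; }

// Hypothetical stand-in for the vtkProcessModule::FinalizeMPI flag.
static bool g_finalizeMPI = true;

// Finalize at most once: the flag is cleared right after the call, so a
// second pass through this code becomes a no-op.
static bool FinalizeOnce()
{
  if (!g_finalizeMPI)
    {
    return false; // already finalized here, or not ours to finalize
    }
  StubMpiFinalize();
  g_finalizeMPI = false;
  return true;
}
```

Calling FinalizeOnce() twice invokes the stub only once; the second call returns false without touching it.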

Otherwise, would you be able to set a breakpoint on MPI_Finalize() and
get a backtrace of where it gets invoked for the second time? That
would be very helpful in tracking down the problem.
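
In gdb this amounts to `break MPI_Finalize` followed by `bt` each time the breakpoint is hit. If a debugger is awkward to attach on the cluster, a similar trace can be collected in code. The sketch below assumes a hypothetical wrapper, TracedFinalize(), that records its call site instead of shutting down MPI; with a real MPI library the same idea could be implemented through the MPI profiling interface, by defining MPI_Finalize() and forwarding to PMPI_Finalize().

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical log of finalize call sites; nothing here talks to MPI.
static std::vector<std::string> g_finalizeSites;

// Record the call site and return the 1-based index of this call.
static std::size_t TracedFinalize(const std::string& site)
{
  g_finalizeSites.push_back(site);
  return g_finalizeSites.size();
}

// Return the call site of the second finalize, or an empty string when
// finalize has been called at most once.
static std::string SecondFinalizeSite()
{
  return g_finalizeSites.size() >= 2 ? g_finalizeSites[1] : std::string();
}
```

Each suspected call site would pass its own label (for example the function name), and SecondFinalizeSite() then names the offending caller.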

Thanks,
Kyle

