[Rtk-users] RTK + Cuda 9 + gcc 6.4 segfaults

Mathis Hoffmann mathis.hoffmann at fau.de
Thu Dec 21 17:30:18 EST 2017


OK, I was too quick to conclude that this is CUDA related.
It seems it is not. The only tests that pass are:
- rtkappsimulatedgeometrytest
- rtkappprojectshepploganphantomtest
- rtkwaveletstest
- rtkargsinfomanagertest

Kind regards
Mathis Hoffmann

On Thu, Dec 21, 2017 at 3:50 PM, Simon Rit 
<simon.rit at creatis.insa-lyon.fr> wrote:
> Hi,
> Thanks for the report. Seems like something is happening in the ITK 
> code. Are any non-CUDA tests failing as well?
> Did you also compile ITK with C++11? BTW, why is it required?
> Best regards,
> Simon
> 
> On Thu, Dec 21, 2017 at 3:20 PM, Mathis Hoffmann 
> <mathis.hoffmann at fau.de> wrote:
>> Hello,
>> 
>> I was able to build RTK using Cuda 9 on an Arch Linux machine with 
>> gcc 6.4 with only small modifications:
>> 
>>  #=========================================================
>> diff --git a/cmake/nvcc-check.cmake b/cmake/nvcc-check.cmake
>> index c625b421..0e5c34e9 100644
>> --- a/cmake/nvcc-check.cmake
>> +++ b/cmake/nvcc-check.cmake
>> @@ -46,6 +46,9 @@ if(CUDA_FOUND)
>>    if((CMAKE_SYSTEM_NAME MATCHES "Linux" AND CMAKE_COMPILER_IS_GNUCC) OR APPLE)
>>      # Compatible gcc can be checked in host_config.h
>>      set(GCC_PATH "")
>> +    if(${CUDA_VERSION} VERSION_GREATER "8.99")
>> +      FIND_GCC(GCC_PATH "6" "4")
>> +    endif()
>>      if(${CUDA_VERSION} VERSION_GREATER "6.99")
>>        FIND_GCC(GCC_PATH "4" "9")
>>      endif()
>> 
>> 
>> In addition, I had to set CMAKE_CXX_FLAGS to -std=gnu++11 manually in 
>> the ccmake dialog (unfortunately, I have not yet been able to integrate 
>> this into the top-level CMakeLists; CMake simply ignored it there).
>> 
>> So far, so good. Unfortunately all Cuda tests fail with the same 
>> segfault. For example, rtkimportcudatest finishes with
>> 
>> ****** Case 1: unsigned short ******
>> 
>> Error per Pixel = 0
>> MSE = 0
>> PSNR = infdB
>> QI = 1
>> 
>> 
>> Test PASSED!
>> 
>> Error per Pixel = 0
>> MSE = 0
>> PSNR = infdB
>> QI = 1
>> 
>> 
>> Test PASSED!
>> 
>> 
>> ****** Case 2: int ******
>> 
>> Error per Pixel = 0
>> MSE = 0
>> PSNR = infdB
>> QI = 1
>> 
>> 
>> Test PASSED!
>> 
>> Error per Pixel = 0
>> MSE = 0
>> PSNR = infdB
>> QI = 1
>> 
>> 
>> Test PASSED!
>> 
>> 
>> ****** Case 3: float ******
>> 
>> Error per Pixel = 0
>> MSE = 0
>> PSNR = infdB
>> QI = 1
>> 
>> 
>> Test PASSED!
>> 
>> Error per Pixel = 0
>> MSE = 0
>> PSNR = infdB
>> QI = 1
>> 
>> 
>> Test PASSED!
>> 
>> 
>> ****** Case 4: double ******
>> 
>> Error per Pixel = 0
>> MSE = 0
>> PSNR = infdB
>> QI = 1
>> 
>> 
>> Test PASSED!
>> 
>> Error per Pixel = 0
>> MSE = 0
>> PSNR = infdB
>> QI = 1
>> 
>> 
>> Test PASSED!
>> Segmentation fault (core dumped)
>> 
>> To further investigate what is going wrong here, I ran valgrind. I 
>> think the most important message is this:
>> 
>> ==30633== Process terminating with default action of signal 11 
>> (SIGSEGV): dumping core
>> ==30633==  Access not within mapped region at address 0x8
>> ==30633==    at 0x12240072: (anonymous 
>> namespace)::ObjectFactoryBasePrivateInitializer::~ObjectFactoryBasePrivateInitializer() 
>> (in /home/mathis/dev/RTK/RTK-bin/bin/libRTK.so)
>> ==30633==    by 0x1402B7B1: __cxa_finalize (in /usr/lib/libc-2.26.so)
>> ==30633==    by 0x121771D3: ??? (in 
>> /home/mathis/dev/RTK/RTK-bin/bin/libRTK.so)
>> ==30633==    by 0x400FB92: _dl_fini (in /usr/lib/ld-2.26.so)
>> ==30633==    by 0x1402B447: __run_exit_handlers (in 
>> /usr/lib/libc-2.26.so)
>> ==30633==    by 0x1402B499: exit (in /usr/lib/libc-2.26.so)
>> ==30633==    by 0x14014F50: (below main) (in /usr/lib/libc-2.26.so)
>> ==30633==  If you believe this happened as a result of a stack
>> ==30633==  overflow in your program's main thread (unlikely but
>> ==30633==  possible), you can try to increase the size of the
>> ==30633==  main thread stack using the --main-stacksize= flag.
>> ==30633==  The main thread stack size used in this run was 8388608.
>> 
>> I've put the full log here: https://pastebin.com/CBjJmUpB
>> 
>> Does anyone have an idea what is going wrong here?
>> 
>> Thanks for any help!
>> Mathis
>> 
>> 
> 
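For reference, regarding the -std=gnu++11 setting mentioned in the quoted message: below is a minimal sketch of how the C++ standard might be requested from a top-level CMakeLists.txt instead of the ccmake dialog. It is not taken from RTK's build files. The project name and the minimum CMake version are assumptions for illustration, and whether the nvcc host compilation picks the setting up has not been checked here. One possible reason for a flag being ignored is that it is assigned before the project() call, but it is not known whether that was the case in this build.

```cmake
# Sketch only: hypothetical top-level CMakeLists.txt, not RTK's actual file.
cmake_minimum_required(VERSION 3.1)  # CMAKE_CXX_STANDARD requires CMake >= 3.1
project(Example CXX)                 # hypothetical project name

# Request C++11 for all targets created after this point.
# With extensions enabled, gcc is passed -std=gnu++11 rather than -std=c++11.
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CXX_EXTENSIONS ON)

# Alternative that mirrors the manual ccmake setting; placed after project()
# (assumption: an assignment made before project() may be dropped).
# set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=gnu++11")
```

The commented-out line appends to the existing flags rather than overwriting them, so anything already set in the cache or the environment is kept.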