[Rtk-users] RTK + Cuda 9 + gcc 6.4 segfaults

Mathis Hoffmann mathis.hoffmann at fau.de
Thu Dec 21 09:20:14 EST 2017


Hello,

I was able to build RTK using Cuda 9 on an Arch Linux machine with gcc 
6.4 with only small modifications:

 #=========================================================
diff --git a/cmake/nvcc-check.cmake b/cmake/nvcc-check.cmake
index c625b421..0e5c34e9 100644
--- a/cmake/nvcc-check.cmake
+++ b/cmake/nvcc-check.cmake
@@ -46,6 +46,9 @@ if(CUDA_FOUND)
   if((CMAKE_SYSTEM_NAME MATCHES "Linux" AND CMAKE_COMPILER_IS_GNUCC) OR APPLE)
     # Compatible gcc can be checked in host_config.h
     set(GCC_PATH "")
+    if(${CUDA_VERSION} VERSION_GREATER "8.99")
+      FIND_GCC(GCC_PATH "6" "4")
+    endif()
     if(${CUDA_VERSION} VERSION_GREATER "6.99")
       FIND_GCC(GCC_PATH "4" "9")
     endif()


In addition, I had to set CMAKE_CXX_FLAGS to -std=gnu++11 manually in 
the ccmake dialog. Unfortunately, I have not managed to integrate this 
into the top-level CMakeLists.txt so far; CMake simply ignored it there.
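
One approach that may work in place of editing CMAKE_CXX_FLAGS by hand, 
assuming CMake 3.1 or newer, is to request the language standard through 
the CMAKE_CXX_STANDARD variables. This is only a sketch: where exactly it 
belongs in RTK's top-level CMakeLists.txt is an assumption, and it has not 
been checked whether the setting also reaches the files that FindCUDA 
hands to nvcc.

```cmake
# Sketch, not tested against RTK: place this before any
# add_library()/add_executable() call, because these variables only
# initialize the properties of targets created after they are set.
# Requires CMake >= 3.1.

set(CMAKE_CXX_STANDARD 11)           # request C++11 for all following targets
set(CMAKE_CXX_STANDARD_REQUIRED ON)  # error out rather than fall back to an older standard
set(CMAKE_CXX_EXTENSIONS ON)         # with GCC, ON selects -std=gnu++11, OFF selects -std=c++11
```

Compared with appending to CMAKE_CXX_FLAGS, these variables are not 
overridden by a value already stored in CMakeCache.txt, which is one 
possible reason a flag set in CMakeLists.txt appears to be ignored.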

So far, so good. Unfortunately all Cuda tests fail with the same 
segfault. For example, rtkimportcudatest finishes with

****** Case 1: unsigned short ******

Error per Pixel = 0
MSE = 0
PSNR = infdB
QI = 1


Test PASSED!

Error per Pixel = 0
MSE = 0
PSNR = infdB
QI = 1


Test PASSED!


****** Case 2: int ******

Error per Pixel = 0
MSE = 0
PSNR = infdB
QI = 1


Test PASSED!

Error per Pixel = 0
MSE = 0
PSNR = infdB
QI = 1


Test PASSED!


****** Case 3: float ******

Error per Pixel = 0
MSE = 0
PSNR = infdB
QI = 1


Test PASSED!

Error per Pixel = 0
MSE = 0
PSNR = infdB
QI = 1


Test PASSED!


****** Case 4: double ******

Error per Pixel = 0
MSE = 0
PSNR = infdB
QI = 1


Test PASSED!

Error per Pixel = 0
MSE = 0
PSNR = infdB
QI = 1


Test PASSED!
Segmentation fault (core dumped)

To further investigate what is going wrong here, I ran valgrind. I 
think the most important message is this:

==30633== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==30633==  Access not within mapped region at address 0x8
==30633==    at 0x12240072: (anonymous namespace)::ObjectFactoryBasePrivateInitializer::~ObjectFactoryBasePrivateInitializer() (in /home/mathis/dev/RTK/RTK-bin/bin/libRTK.so)
==30633==    by 0x1402B7B1: __cxa_finalize (in /usr/lib/libc-2.26.so)
==30633==    by 0x121771D3: ??? (in /home/mathis/dev/RTK/RTK-bin/bin/libRTK.so)
==30633==    by 0x400FB92: _dl_fini (in /usr/lib/ld-2.26.so)
==30633==    by 0x1402B447: __run_exit_handlers (in /usr/lib/libc-2.26.so)
==30633==    by 0x1402B499: exit (in /usr/lib/libc-2.26.so)
==30633==    by 0x14014F50: (below main) (in /usr/lib/libc-2.26.so)
==30633==  If you believe this happened as a result of a stack
==30633==  overflow in your program's main thread (unlikely but
==30633==  possible), you can try to increase the size of the
==30633==  main thread stack using the --main-stacksize= flag.
==30633==  The main thread stack size used in this run was 8388608.

I've put the full log here: https://pastebin.com/CBjJmUpB

Does anyone have an idea what is going wrong here?

Thanks for any help!
Mathis