[Rtk-users] RTK + Cuda 9 + gcc 6.4 segfaults

Simon Rit simon.rit at creatis.insa-lyon.fr
Fri Dec 22 07:44:23 EST 2017


Hi,
I tried to reproduce the bug but I couldn't. See
http://my.cdash.org/buildSummary.php?buildid=1341867
Can you check if there is a difference in config? Are you compiling ITK
with shared libs? If yes, are you sure RTK uses the correct ones (check ldd
failing_binary)?
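For the ldd check, something along these lines might help. It is only a rough sketch: the function name filter_itk_libs and the grep pattern are my own choices for illustration, not part of RTK or ITK. It keeps the ldd lines that mention ITK or RTK, plus any library the loader cannot resolve.

```sh
#!/bin/sh
# Sketch only: filter_itk_libs is a hypothetical helper name.
# Keep the ldd lines that mention ITK or RTK, plus unresolved libraries.
filter_itk_libs() {
    grep -i -E 'itk|rtk|not found'
}

# Run only when a binary path is given as the first argument.
if [ -n "${1:-}" ]; then
    ldd "$1" | filter_itk_libs
fi
```

Saved as a script and called with the path to a failing test binary (rtkimportcudatest, for instance), the output should show which ITK shared libraries get picked up at run time. If the paths belong to a different ITK build than the one RTK was configured against, that mismatch would be the first thing to look at.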
Otherwise, no clue what's the problem here.
Simon

On Thu, Dec 21, 2017 at 11:30 PM, Mathis Hoffmann <mathis.hoffmann at fau.de>
wrote:

> Ok, I was a bit early with the conclusion that this is Cuda related. Seems
> not. The only tests that pass are:
> - rtkappsimulatedgeometrytest
> - rtkappprojectshepploganphantomtest
> - rtkwaveletstest
> - rtkargsinfomanagertest
>
> Kind regards
> Mathis Hoffmann
>
> On Thu, Dec 21, 2017 at 3:50 PM, Simon Rit <simon.rit at creatis.insa-lyon.fr>
> wrote:
>
> Hi,
> Thanks for the report. Seems like something is happening in the ITK code.
> Are other non cuda tests failing?
> Did you also compile ITK with C++11? BTW, why is it required?
> Best regards,
> Simon
>
> On Thu, Dec 21, 2017 at 3:20 PM, Mathis Hoffmann
> <mathis.hoffmann at fau.de> wrote:
>
>> Hello,
>>
>> I was able to build RTK using Cuda 9 on an Arch Linux machine with gcc
>> 6.4 with only small modifications:
>>
>>  #=========================================================
>> diff --git a/cmake/nvcc-check.cmake b/cmake/nvcc-check.cmake
>> index c625b421..0e5c34e9 100644
>> --- a/cmake/nvcc-check.cmake
>> +++ b/cmake/nvcc-check.cmake
>> @@ -46,6 +46,9 @@ if(CUDA_FOUND)
>>    if((CMAKE_SYSTEM_NAME MATCHES "Linux" AND CMAKE_COMPILER_IS_GNUCC) OR APPLE)
>>      # Compatible gcc can be checked in host_config.h
>>      set(GCC_PATH "")
>> +    if(${CUDA_VERSION} VERSION_GREATER "8.99")
>> +      FIND_GCC(GCC_PATH "6" "4")
>> +    endif()
>>      if(${CUDA_VERSION} VERSION_GREATER "6.99")
>>        FIND_GCC(GCC_PATH "4" "9")
>>      endif()
>>
>>
>> In addition, I had to set CMAKE_CXX_FLAGS to -std=gnu++11 manually in the
>> ccmake dialog (unfortunately I have not been able to integrate this into the
>> top-level CMakeLists.txt so far; CMake simply ignored it there).
>>
>> So far, so good. Unfortunately all Cuda tests fail with the same
>> segfault. For example, rtkimportcudatest finishes with
>>
>> ****** Case 1: unsigned short ******
>>
>> Error per Pixel = 0
>> MSE = 0
>> PSNR = infdB
>> QI = 1
>>
>>
>> Test PASSED!
>>
>> Error per Pixel = 0
>> MSE = 0
>> PSNR = infdB
>> QI = 1
>>
>>
>> Test PASSED!
>>
>>
>> ****** Case 2: int ******
>>
>> Error per Pixel = 0
>> MSE = 0
>> PSNR = infdB
>> QI = 1
>>
>>
>> Test PASSED!
>>
>> Error per Pixel = 0
>> MSE = 0
>> PSNR = infdB
>> QI = 1
>>
>>
>> Test PASSED!
>>
>>
>> ****** Case 3: float ******
>>
>> Error per Pixel = 0
>> MSE = 0
>> PSNR = infdB
>> QI = 1
>>
>>
>> Test PASSED!
>>
>> Error per Pixel = 0
>> MSE = 0
>> PSNR = infdB
>> QI = 1
>>
>>
>> Test PASSED!
>>
>>
>> ****** Case 4: double ******
>>
>> Error per Pixel = 0
>> MSE = 0
>> PSNR = infdB
>> QI = 1
>>
>>
>> Test PASSED!
>>
>> Error per Pixel = 0
>> MSE = 0
>> PSNR = infdB
>> QI = 1
>>
>>
>> Test PASSED!
>> Segmentation fault (core dumped)
>>
>> To further investigate what is going wrong here, I ran valgrind. I think
>> the most important message is this:
>>
>> ==30633== Process terminating with default action of signal 11 (SIGSEGV): dumping core
>> ==30633==  Access not within mapped region at address 0x8
>> ==30633==    at 0x12240072: (anonymous namespace)::ObjectFactoryBasePrivateInitializer::~ObjectFactoryBasePrivateInitializer() (in /home/mathis/dev/RTK/RTK-bin/bin/libRTK.so)
>> ==30633==    by 0x1402B7B1: __cxa_finalize (in /usr/lib/libc-2.26.so)
>> ==30633==    by 0x121771D3: ??? (in /home/mathis/dev/RTK/RTK-bin/bin/libRTK.so)
>> ==30633==    by 0x400FB92: _dl_fini (in /usr/lib/ld-2.26.so)
>> ==30633==    by 0x1402B447: __run_exit_handlers (in /usr/lib/libc-2.26.so)
>> ==30633==    by 0x1402B499: exit (in /usr/lib/libc-2.26.so)
>> ==30633==    by 0x14014F50: (below main) (in /usr/lib/libc-2.26.so)
>> ==30633==  If you believe this happened as a result of a stack
>> ==30633==  overflow in your program's main thread (unlikely but
>> ==30633==  possible), you can try to increase the size of the
>> ==30633==  main thread stack using the --main-stacksize= flag.
>> ==30633==  The main thread stack size used in this run was 8388608.
>>
>> I've put the full log here: https://pastebin.com/CBjJmUpB
>>
>> Does anyone have an idea what is going wrong here?
>>
>> Thanks for any help!
>> Mathis
>>
>> _______________________________________________
>> Rtk-users mailing list
>> Rtk-users at public.kitware.com
>> https://public.kitware.com/mailman/listinfo/rtk-users
>>
>>
>