[CMake] About the FindCUDA.cmake module and Separate Compilation

Wed Jul 23 13:45:56 EDT 2014

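The CUDA_NVCC_FLAGS variable is a list, not a string, so each flag has to be
its own list element rather than part of one quoted string.  You also have to
turn CUDA_SEPARABLE_COMPILATION on explicitly; set() with no value unsets the
variable instead of enabling it.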
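
To see the difference between a quoted string and a list, here is a small
script sketch that can be run with "cmake -P". The variable names
(quoted_flags, split_flags, SEPARABLE_DEMO) and the script file name are
made up for illustration; only the flags themselves come from your message.

```cmake
# Run with: cmake -P list_vs_string.cmake   (hypothetical file name)

# Quoted: the whole string is appended as ONE list element, so nvcc
# would receive it as a single argument.
set(quoted_flags "")
list(APPEND quoted_flags " -gencode arch=compute_30,code=sm_30 -rdc=true ")
list(LENGTH quoted_flags quoted_count)

# Unquoted: each whitespace-separated token becomes its own element.
set(split_flags "")
list(APPEND split_flags -gencode arch=compute_30,code=sm_30 -rdc=true)
list(LENGTH split_flags split_count)

message(STATUS "quoted:   ${quoted_count} element(s)")
message(STATUS "unquoted: ${split_count} element(s)")

# set() with no value unsets a normal variable; it does not turn it ON.
set(SEPARABLE_DEMO)
if(NOT DEFINED SEPARABLE_DEMO)
  message(STATUS "SEPARABLE_DEMO is not defined")
endif()
```

The quoted form should report one element and the unquoted form three.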

Try this:

list(APPEND CUDA_NVCC_FLAGS -gencode arch=compute_30,code=sm_30 -rdc=true)
set(CUDA_SEPARABLE_COMPILATION ON)
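
Put together, a minimal CMakeLists.txt might look like the sketch below. The
project name, the target name and main.cu are assumptions for illustration
(a.cu and b.cu are the files from your message), and it assumes a CMake
release that still ships the FindCUDA module.

```cmake
cmake_minimum_required(VERSION 3.5)
project(separable_example CXX)   # hypothetical project name

# Provides CUDA_NVCC_FLAGS, CUDA_SEPARABLE_COMPILATION and cuda_add_executable()
find_package(CUDA REQUIRED)

# One list element per flag and per flag argument, no surrounding quotes
list(APPEND CUDA_NVCC_FLAGS -gencode arch=compute_30,code=sm_30 -rdc=true)

# Set to ON before the target is created so it applies to its sources
set(CUDA_SEPARABLE_COMPILATION ON)

# main.cu is a hypothetical file holding the host code that launches the kernels
cuda_add_executable(separable_example main.cu a.cu b.cu)
```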

On Thu, Jul 17, 2014 at 8:07 AM, Notargiacomo Thibault <gnthibault at gmail.com> wrote:

> Dear all,
>
> I am a user of the CMake build system and its various modules, which
> have been very helpful in the past.
>
> But I have been running into specific issues with the FindCUDA.cmake
> module for about a year now, especially with the separate compilation
> feature, which has never worked for me. Previously I worked around the
> problem by moving the code into a single file, but today I am stuck and
> I have to get this feature working.
>
> What are my files?
> I have:
> ===================
> a.cu:
> __constant__ float Buffer[1024];
> __global__ void kernelA( float a )
> {
>      Buffer[0] = a;
> }
> ===================
> b.cu.h
> extern __constant__ float Buffer[1024];
> ===================
> b.cu
> #include "b.cu.h"
> __global__ void kernelB( float b )
> {
>      Buffer[0] += b;
> }
> ===================
>
> With this configuration, b.cu obviously has to be linked with a.cu so
> that both kernels access the same memory area. This simple feature
> seems to be available only in a separate compilation build, in order
> to avoid a redefinition error. The compiler also needs relocatable
> device code to be enabled, although I don't really understand why.
>
>
> Which specific features of the CMake module am I using?
> Here are the main macros I am using:
>
> =======================================
> list(APPEND CUDA_NVCC_FLAGS " -gencode arch=compute_30,code=sm_30
> -rdc=true ")
> set( CUDA_SEPARABLE_COMPILATION )
> cuda_add_executable(${OUTPUT_NAME} ${sources} ${headers})
> ======================================
>
>
> The error I get:
> If I drop the "-rdc=true" nvcc option for relocatable code, the
> code compiles and links fine, but at runtime it does not work
> as expected, i.e. the value inside the buffer is not shared between
> kernels A and B.
>
> If all the options stated above are set, the code compiles fine, but
> at the link step I get tons of link errors that look like:
>  undefined reference to `__cudaRegisterLinkedBinary[...]'
>
> The problem doesn't seem that hard to solve, as separate compilation
> is explained extensively in the CUDA documentation:
>
> http://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#using-separate-compilation-in-cuda
> But I still have problems getting separate compilation to work
> with FindCUDA.cmake.
>
> Thank you in advance for any help.