[CMake] Possible bug/incompatibility FindCUDA with Visual Studio 2017

Andrea Borsic aborsic at gmail.com
Fri Aug 11 12:02:39 EDT 2017


Thanks for your pointers, problem solved.

I have upgraded to CMake 3.9.1 (I don't think this matters, though) and 
switched to the new CMake CUDA support, as in your point 3. I am 
also using CUDA 9 RC, which supports VS2017 (I had been testing with both 
CUDA 8 and CUDA 9 earlier, but forgot to mention it). Everything now builds.

I am attaching the updated CMakeLists.txt file for the record.

Thanks, Best Regards,


On 8/10/2017 4:28 PM, Robert Maynard wrote:
> So you are going to have two issues.
> 1. The FindCUDA module has not been updated to handle VS2017. The
> issue is that the VCInstallDir variable now returns a different
> relative path to the compiler than it previously did. If you can
> determine the new logic, a patch fixing this behavior would be great.
> 2. It doesn't look like CUDA 8.0 supports VS2017 (
> http://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html#system-requirements
> ).
> 3. You could also look at using the new CMake 3.9 cuda support (
> https://devblogs.nvidia.com/parallelforall/building-cuda-applications-cmake/
> ).
> On Thu, Aug 10, 2017 at 5:49 AM, Andrea Borsic <aborsic at gmail.com> wrote:
>> Hi All,
>> I am working on this platform:
>> Windows 10 64bit
>> Visual Studio 2015 Community Edition
>> Visual Studio 2017 Community Edition
>> CUDA 8.0
>> CMake 3.9
>> I am in the middle of switching from VS2015 to VS2017, but CUDA projects
>> fail to properly compile under VS2017 as the compiler/linker fail to find
>> tools on the path set up by CMake. I believe this is a bug/incompatibility of
>> the CMake FindCUDA module with VS2017.
>> To reproduce the problem I am attaching a tiny project
>> main.cu is a minimal CUDA example from the web
>> CMakeLists.txt is a CMake file that leads to a successful build under VS2015
>> and unsuccessful under VS2017
>> Output VS2015 is the output from building the project under VS2015 (all
>> targets built OK)
>> Output VS2017 is the output from building the project under VS2017 (one
>> target OK, one target fails)
>> I have also noticed that, oddly, under VS2017 the "x64" and "main.dir"
>> directories are created outside the build dir, at the level of the
>> source directory.
>> I thought of reporting this to the list, and any help is welcome,
>> Thank you and Best Regards,
>> Andrea

-------------- next part --------------
#include <stdio.h>

// Nearly minimal CUDA example.
// Compile with:
// nvcc -o example example.cu

#define N 1000

// A function marked __global__
// runs on the GPU but can be called from
// the CPU.
// This function multiplies the elements of an array
// of ints by 2.
// The entire computation can be thought of as running
// with one thread per array element with blockIdx.x
// identifying the thread.
// The comparison i<N is because often it isn't convenient
// to have an exact 1-1 correspondence between threads
// and array elements. Not strictly necessary here.
// Note how we're mixing GPU and CPU code in the same source
// file. An alternative way to use CUDA is to keep
// C/C++ code separate from CUDA code and dynamically
// compile and load the CUDA code at runtime, a little
// like how you compile and load OpenGL shaders from
// C/C++ code.
__global__ void add(int *a, int *b) {
    int i = blockIdx.x;
    if (i<N) {
        b[i] = 2*a[i];
    }
}

int main() {
    // Create int arrays on the CPU.
    // ('h' stands for "host".)
    int ha[N], hb[N];

    // Create corresponding int arrays on the GPU.
    // ('d' stands for "device".)
    int *da, *db;
    cudaMalloc((void **)&da, N*sizeof(int));
    cudaMalloc((void **)&db, N*sizeof(int));

    // Initialise the input data on the CPU.
    for (int i = 0; i<N; ++i) {
        ha[i] = i;
    }

    // Copy input data to array on GPU.
    cudaMemcpy(da, ha, N*sizeof(int), cudaMemcpyHostToDevice);

    // Launch GPU code with N threads, one per
    // array element.
    add<<<N, 1>>>(da, db);

    // Copy output array from GPU back to CPU.
    cudaMemcpy(hb, db, N*sizeof(int), cudaMemcpyDeviceToHost);

    for (int i = 0; i<N; ++i) {
        printf("%d\n", hb[i]);
    }

    // Free up the arrays on the GPU.
    cudaFree(da);
    cudaFree(db);

    return 0;
}
-------------- next part --------------
# minimal example based on https://devblogs.nvidia.com/parallelforall/building-cuda-applications-cmake/

cmake_minimum_required(VERSION 3.9 FATAL_ERROR) # this is required for the new CUDA support module

project (TestProject LANGUAGES CXX CUDA)

add_executable(TestExecutable main.cu)

target_compile_features(TestExecutable PUBLIC cxx_std_11)

set_target_properties(TestExecutable PROPERTIES CUDA_SEPARABLE_COMPILATION ON)

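# CUDA_NVCC_FLAGS is read only by the legacy FindCUDA module; with the native
# CUDA language support the nvcc flags go in CMAKE_CUDA_FLAGS.
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -arch=sm_35")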
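
-------------- next part --------------
For comparison, here is a rough sketch of how the same project might be written with the legacy FindCUDA module discussed earlier in the thread. The CMakeLists.txt that originally failed under VS2017 is not reproduced in this message, so this is an assumption about its general shape, not the actual file. It reuses the target name and the sm_35 architecture from the file above.

```cmake
# Sketch only: legacy FindCUDA style, assumed rather than taken from the thread.
cmake_minimum_required(VERSION 3.9 FATAL_ERROR)

# CUDA is not listed as a language here; FindCUDA drives nvcc itself.
project(TestProject LANGUAGES CXX)

find_package(CUDA REQUIRED)

# FindCUDA reads its nvcc flags from the CUDA_NVCC_FLAGS list.
list(APPEND CUDA_NVCC_FLAGS -arch=sm_35)

# With FindCUDA, separable compilation is a variable rather than a target property.
set(CUDA_SEPARABLE_COMPILATION ON)

# cuda_add_executable replaces add_executable for targets containing .cu sources.
cuda_add_executable(TestExecutable main.cu)
```

The main difference is where the compiler is located: FindCUDA works out the host compiler path itself, which is where the VS2017 problem in point 1 arises, while the native approach lets CMake treat CUDA as a regular language enabled through project().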
