View Issue Details
ID: 0011767
Project: CMake
Category: Modules
View Status: public
Date Submitted: 2011-01-26 14:44
Last Update: 2016-06-10 14:31
Reporter: Florian Rathgeber
Assigned To: Kitware Robot
Priority: normal
Severity: feature
Reproducibility: have not tried
Status: closed
Resolution: moved
Platform:
OS:
OS Version:
Product Version: CMake 2.8.3
Target Version:
Fixed in Version:
Summary: 0011767: Auto-detect CUDA-capable GPUs present and their compute capability in FindCUDA.cmake
Description: The attached patch implements auto-detection of the CUDA-capable GPUs present in the system, and of their compute capability, in FindCUDA.cmake. It also sets the --generate-code flag for nvcc so that code is generated for the detected architecture.

Apply the patch in the root of a FindCUDA.cmake SVN checkout (r1192).
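
As a rough illustration of the flag the description refers to, the sketch below turns a compute capability string into an nvcc --generate-code argument. It is written in Python rather than CMake only to keep the example short and self-contained; the function name generate_code_flag is hypothetical and is not taken from the attached patch.

```python
def generate_code_flag(compute_capability: str) -> str:
    """Build an nvcc --generate-code argument for one detected device.

    `compute_capability` is a "major.minor" string such as "2.0".
    Hypothetical helper for illustration, not the logic of the attached patch.
    """
    major, minor = compute_capability.strip().split(".")
    arch = f"{int(major)}{int(minor)}"
    # arch= names the virtual architecture (compute_XX),
    # code= names the real architecture (sm_XX) to build a binary for.
    return f"--generate-code arch=compute_{arch},code=sm_{arch}"


print(generate_code_flag("2.0"))
```

For a device reporting compute capability 2.0 this yields an argument targeting compute_20 and sm_20, which could then be appended to the nvcc flags.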
Tags: No tags attached.
Attached Files: FindCUDA_check_compute_capability.patch (3,886 bytes) 2011-01-26 14:44

 Relationships

  Notes
(0025096)
Alexey Ozeritsky (reporter)
2011-01-26 15:25

What should happen if the build machine and the target platform have different hardware?
I think this feature should be optional.
(0025099)
James Bigler (developer)
2011-01-26 15:46

Thanks for the suggestion and patch.

I agree that this should be an optional function that users could call to generate the appropriate flags which then could be appended to CUDA_NVCC_FLAGS.

One thing to note: if you have both an sm_1x and an sm_2x device, you can't use the same CUBIN on both unless you generate both sm_1x and sm_2x device code, or at least generate PTX alongside the CUBIN that can be compiled for each device's specific compute capability. See section 1.3.1 of Fermi_Compatibility_Guide.pdf in the CUDA Toolkit for how to specify CUBIN and PTX generation in the code.
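
The point about mixed sm_1x and sm_2x devices can be sketched as follows: one -gencode entry per distinct device for native binaries, plus one PTX entry for the newest architecture so the driver can JIT-compile for later devices. This is a Python sketch under assumptions, not part of FindCUDA.cmake; gencode_flags is a hypothetical helper, and the capability list is made up for the example.

```python
def gencode_flags(capabilities):
    """Return nvcc -gencode flags covering every listed compute capability.

    `capabilities` is an iterable of "major.minor" strings, one per device.
    Hypothetical helper for illustration only.
    """
    # Parse into (major, minor) tuples, drop duplicates, sort ascending.
    archs = sorted({tuple(int(p) for p in cc.split(".")) for cc in capabilities})
    flags = []
    for major, minor in archs:
        # Native binary for this exact architecture.
        flags += ["-gencode", f"arch=compute_{major}{minor},code=sm_{major}{minor}"]
    if archs:
        major, minor = archs[-1]
        # Also embed PTX for the newest architecture, for JIT on later devices.
        flags += ["-gencode", f"arch=compute_{major}{minor},code=compute_{major}{minor}"]
    return flags


# Example: one sm_13 device and two sm_20 devices in the same machine.
print(" ".join(gencode_flags(["1.3", "2.0", "2.0"])))
```

Returning a list rather than one string keeps each flag and its value as separate arguments, which is usually easier to append to an existing flag list.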

I'm also not sure what is gained by this. Either your code requires a specific feature (say sm_11 for 32-bit atomics, sm_12 for 64-bit atomics, and sm_20 for floating-point atomics), or it doesn't, and you should be fine with the default arguments to nvcc. In addition, you could use -arch sm_11, which generates both sm_11 CUBIN and sm_11 PTX; the driver can JIT-compile the PTX for whatever device you are using. Are you trying to avoid this JIT cost?
(0025103)
Florian Rathgeber (reporter)
2011-01-26 18:54

My assumption was that most users will want to have CUBIN code optimized for their respective device. I agree this may not always be useful and hence not a suitable default. Setting the compiler flags optionally is more reasonable.

The patch allows manually checking for the compute capability and whether it's sufficient w.r.t. features used in the project. A possible extension is the option to request a minimum compute capability (much like a required package version).
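
A minimum compute capability check of the kind suggested above could look roughly like the following Python sketch. The names parse_capability and meets_minimum are hypothetical and do not come from the patch; the only assumption is that the detected capability is available as a "major.minor" string.

```python
def parse_capability(text):
    """Turn a "major.minor" string such as "1.3" into an (int, int) tuple."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def meets_minimum(detected, required):
    """True if the detected compute capability is at least the required one.

    Comparing integer tuples avoids the pitfall of comparing the raw
    strings, where "10.0" would sort before "2.0".
    """
    return parse_capability(detected) >= parse_capability(required)


# Example: a project that needs 32-bit atomics would require at least 1.1.
print(meets_minimum("1.0", "1.1"))
print(meets_minimum("2.0", "1.1"))
```

A build script could use such a check to fail early with a clear message instead of a late compiler error.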

Using features such as atomics without specifying the necessary code generation flags (even though the hardware would support them) will produce compiler errors (e.g. identifier "atomicCAS" is undefined) that may be surprising.
(0030223)
David Cole (manager)
2012-08-11 11:09

Sending old, never assigned issues to the backlog.

(The age of the bug, plus the fact that it's never been assigned to anyone means that nobody is actively working on it...)

If an issue you care about is sent to the backlog when you feel it should have been addressed in a different manner, please bring it up on the CMake mailing list for discussion. Sign up for the mailing list here, if you're not already on it: http://www.cmake.org/mailman/listinfo/cmake

It's easy to re-activate a bug here if you can find a CMake developer who has the bandwidth to take it on, and ferry a fix through to our 'next' branch for dashboard testing.
(0041788)
Kitware Robot (administrator)
2016-06-10 14:28

Resolving issue as `moved`.

This issue tracker is no longer used. Further discussion of this issue may take place in the current CMake Issues page linked in the banner at the top of this page.

 Issue History
Date Modified Username Field Change
2011-01-26 14:44 Florian Rathgeber New Issue
2011-01-26 14:44 Florian Rathgeber File Added: FindCUDA_check_compute_capability.patch
2011-01-26 15:25 Alexey Ozeritsky Note Added: 0025096
2011-01-26 15:46 James Bigler Note Added: 0025099
2011-01-26 18:54 Florian Rathgeber Note Added: 0025103
2012-08-11 11:09 David Cole Status new => backlog
2012-08-11 11:09 David Cole Note Added: 0030223
2016-06-10 14:28 Kitware Robot Note Added: 0041788
2016-06-10 14:28 Kitware Robot Status backlog => resolved
2016-06-10 14:28 Kitware Robot Resolution open => moved
2016-06-10 14:28 Kitware Robot Assigned To => Kitware Robot
2016-06-10 14:31 Kitware Robot Status resolved => closed

