MantisBT - CMake
View Issue Details
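ID: 0014238
Project: CMake
Category: Modules
View Status: public
Date Submitted: 2013-06-20 12:41
Last Update: 2016-06-10 14:31
Reporter: Daniel Frenzel
Assigned To: James Bigler
Priority: urgent
Severity: major
Reproducibility: always
Status: closed
Resolution: moved
Platform: Gcc
OS: Linux
Product Version: CMake 2.8.11.1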
 
0014238: FindCUDA.cmake not able to link separate device code because arch=-sm20 is not set. FIXIT!
Description:
The build fails when SET(CUDA_SEPARABLE_COMPILATION ON) is set.
The CMake function that performs the device link is missing an architecture flag such as "-arch=sm_20". The flags I set are discarded there (why?), and sm_20 is not the nvcc default, so nvcc reports an error.
Steps To Reproduce:
Build with "SET(CUDA_SEPARABLE_COMPILATION ON)"
Additional Information:
I wrote a small fix. The patched FindCUDA.cmake is attached; diff it against the stock module to find the changed line.
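A minimal sketch of a project that should exercise the failing device-link step described in this report. The target name mylib and the source files a.cu and b.cu are placeholders (assumptions, not taken from the report), and -arch=sm_20 is only an example value.

```cmake
cmake_minimum_required(VERSION 2.8)
project(separable_repro)

find_package(CUDA REQUIRED)

# Architecture flag set globally. Per the report, this setting does not
# reach the device-link step in CMake 2.8.11.1.
set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} -arch=sm_20)

# Enable relocatable device code and the extra device-link step.
set(CUDA_SEPARABLE_COMPILATION ON)

# a.cu and b.cu are placeholder file names (assumption).
cuda_add_library(mylib a.cu b.cu)
```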
No tags attached.
Attached Files: FindCUDA.cmake (66,944 bytes) 2013-06-20 12:41
https://public.kitware.com/Bug/file/4798/FindCUDA.cmake
Issue History
2013-06-20 12:41  Daniel Frenzel  New Issue
2013-06-20 12:41  Daniel Frenzel  File Added: FindCUDA.cmake
2013-06-20 13:15  Robert Maynard  Note Added: 0033356
2013-06-20 13:26  Robert Maynard  Note Added: 0033357
2013-06-20 13:26  Robert Maynard  Assigned To: => James Bigler
2013-06-20 13:26  Robert Maynard  Status: new => assigned
2013-06-20 14:57  James Bigler    Note Added: 0033359
2013-06-20 15:10  Robert Maynard  Note Added: 0033360
2013-06-20 15:53  Daniel Frenzel  Note Added: 0033361
2013-06-20 16:06  James Bigler    Note Added: 0033362
2013-06-20 17:29  Daniel Frenzel  Note Added: 0033364
2016-06-10 14:29  Kitware Robot   Note Added: 0042302
2016-06-10 14:29  Kitware Robot   Status: assigned => resolved
2016-06-10 14:29  Kitware Robot   Resolution: open => moved
2016-06-10 14:31  Kitware Robot   Status: resolved => closed

Notes
(0033356)
Robert Maynard   
2013-06-20 13:15   
You should set your CUDA_NVCC_FLAGS to "-arch=sm_20" if you need to target the 2.0 streaming multiprocessor architecture.

These options are highly configurable and no default is valid for all use cases. Some people want to support only 1.3, others only 3.0 and greater.
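Since no single architecture suits every project, one option is to expose the choice to whoever configures the build. The following is a sketch only: MY_CUDA_ARCH is a hypothetical cache variable introduced for illustration, and sm_20 is just its example default.

```cmake
cmake_minimum_required(VERSION 2.8)
project(arch_choice_example)

# Find CUDA first so the module defines CUDA_NVCC_FLAGS before it is extended.
find_package(CUDA REQUIRED)

# MY_CUDA_ARCH is a hypothetical, project-defined name (assumption).
set(MY_CUDA_ARCH "sm_20" CACHE STRING "GPU architecture passed to nvcc via -arch")

# CUDA_NVCC_FLAGS is a semicolon-separated list, so append one element.
list(APPEND CUDA_NVCC_FLAGS "-arch=${MY_CUDA_ARCH}")
```

A different target could then be selected at configure time, for example with cmake -DMY_CUDA_ARCH=sm_30 on the command line.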
(0033357)
Robert Maynard   
2013-06-20 13:26   
It does look like CUDA_NVCC_FLAGS are being properly used when enabling CUDA_SEPARABLE_COMPILATION, but forcing sm_20 is not the desired behavior.
(0033359)
James Bigler   
2013-06-20 14:57   
Robert is correct: you can't arbitrarily add -arch sm_20, since you might want something else. In addition, if there is already an -arch sm_20 flag set elsewhere, nvcc will complain.

There is at least one bug: CUDA_NVCC_FLAGS are not added to the nvcc_flags in the CUDA_LINK_SEPARABLE_COMPILATION_OBJECTS function. This was an oversight on my part when implementing this.

Another thing that could be done is to search for an arch flag and warn the user that no sm_20+ flags have been set, but I believe nvcc will tell you this already when you attempt to use the flags for separable compilation without setting an arch flag.
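The arch-flag search mentioned above could be sketched at project level roughly as follows. This is not code from FindCUDA: warn_if_no_arch_flag is a hypothetical helper, and it only checks that some architecture option is present, not that the value is sm_20 or newer.

```cmake
# Hypothetical helper (assumption), not part of FindCUDA.
# Extra per-target options can be passed as arguments.
function(warn_if_no_arch_flag)
  set(has_arch_flag FALSE)
  foreach(flag ${CUDA_NVCC_FLAGS} ${ARGN})
    # Match the short and long spellings of nvcc's architecture options.
    if("${flag}" MATCHES "^(-arch|--gpu-architecture|-gencode|--generate-code)")
      set(has_arch_flag TRUE)
    endif()
  endforeach()
  if(CUDA_SEPARABLE_COMPILATION AND NOT has_arch_flag)
    message(WARNING
      "CUDA_SEPARABLE_COMPILATION is ON but no architecture flag was found; "
      "nvcc needs sm_20 or newer for separable compilation.")
  endif()
endfunction()

# Example call, made after CUDA_NVCC_FLAGS has been set up.
warn_if_no_arch_flag()
```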
(0033360)
Robert Maynard   
2013-06-20 15:10   
James, you are correct: nvcc will complain if you don't set arch to sm_20 or greater with separable compilation enabled.
(0033361)
Daniel Frenzel   
2013-06-20 15:53   
I just wanted to point out where the problem happens; I was more interested in getting my library to build. But James is right: CUDA_NVCC_FLAGS are not added to the nvcc_flags when the intermediate is built. It would be great if this could be fixed soon.

Greetings
(0033362)
James Bigler   
2013-06-20 16:06   
As a workaround, the arguments from the OPTIONS section of cuda_add_library *are* passed along to the separable compilation phase. If you put your -arch sm_20 flag in there, it should work (and this is what I ended up testing during development).

cuda_add_library(target .... OPTIONS -arch sm_20)
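In a complete CMakeLists.txt, the workaround might look like the sketch below. The target name mylib and the source files a.cu and b.cu are placeholders (assumptions), and sm_20 should be replaced with whatever architecture the project actually targets.

```cmake
cmake_minimum_required(VERSION 2.8)
project(separable_workaround)

find_package(CUDA REQUIRED)

# Enable relocatable device code and the extra device-link step.
set(CUDA_SEPARABLE_COMPILATION ON)

# Pass the architecture through OPTIONS, which this thread says is
# forwarded to the separable compilation phase.
# a.cu and b.cu are placeholder file names (assumption).
cuda_add_library(mylib a.cu b.cu OPTIONS -arch sm_20)
```

If the same architecture flag is also present in CUDA_NVCC_FLAGS, nvcc may complain about the duplicate, as noted earlier in the thread, so it is safer to set it in only one place.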
(0033364)
Daniel Frenzel   
2013-06-20 17:29   
I'll try it. Thanks, by the way, for the CMake module itself. I really like it. Without it, everything would be more complicated.
(0042302)
Kitware Robot   
2016-06-10 14:29   
Resolving issue as `moved`.

This issue tracker is no longer used. Further discussion of this issue may take place in the current CMake Issues page linked in the banner at the top of this page.