<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Calibri","sans-serif";
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal">I’ve attached two patches that fix bugs in the CUDA_COMPILE{,_PTX,_FATBIN,_CUBIN} macros from FindCUDA.cmake.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">First bug (fixed by patch #1)<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Commit 7ded655 added generator expressions to CUDA_WRAP_SRCS that read the include directories and compile definitions from the target. This works well when the name passed to CUDA_WRAP_SRCS is an actual target (as in cuda_add_library
and cuda_add_executable). However, the CUDA_COMPILE* macros also use CUDA_WRAP_SRCS, and they pass in a hardcoded name that doesn’t represent a real target. The generator expressions then fail to evaluate, and CMake aborts during generation.<o:p></o:p></p>
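<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal">For reference, a minimal CMakeLists.txt along these lines should exercise the affected code path on a CMake version that contains the commit above. This is only a sketch: the project name, the include directory, and the files kernels.cu and main.cpp are placeholders, not files from the patches.<o:p></o:p></p>
<pre>
cmake_minimum_required(VERSION 2.8.12)
project(cuda_compile_repro)   # placeholder project name

find_package(CUDA REQUIRED)

# Directory-level include path; no target is involved here.
include_directories(${CMAKE_CURRENT_SOURCE_DIR}/include)

# CUDA_COMPILE calls CUDA_WRAP_SRCS with a hardcoded name
# that is not a real target.
cuda_compile(KERNEL_OBJS kernels.cu)

add_executable(app main.cpp ${KERNEL_OBJS})
</pre>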
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I fixed this by teaching CUDA_WRAP_SRCS to check for “PHONY” in its argument list. When CUDA_WRAP_SRCS sees “PHONY”, it queries the appropriate directory properties (INCLUDE_DIRECTORIES and COMPILE_DEFINITIONS) instead of using the generator
expressions. I then modified cuda_compile_base (which is used internally by all the CUDA_COMPILE* macros) to pass PHONY to CUDA_WRAP_SRCS.<o:p></o:p></p>
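<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal">Roughly, the logic looks like the sketch below. This is not the code from the patch: the function name and the variable names are made up for illustration, and only the PHONY keyword and the two property names match what is described above.<o:p></o:p></p>
<pre>
# Hypothetical helper, for illustration only.
function(sketch_get_includes_and_defs cuda_target out_includes out_defs)
  # Look for the PHONY keyword among the extra arguments.
  list(FIND ARGN "PHONY" _phony_idx)
  if(_phony_idx GREATER -1)
    # No real target: read the directory properties at configure time.
    get_directory_property(_includes INCLUDE_DIRECTORIES)
    get_directory_property(_defs COMPILE_DEFINITIONS)
  else()
    # Real target: defer to generate time via generator expressions.
    set(_includes "$&lt;TARGET_PROPERTY:${cuda_target},INCLUDE_DIRECTORIES&gt;")
    set(_defs "$&lt;TARGET_PROPERTY:${cuda_target},COMPILE_DEFINITIONS&gt;")
  endif()
  set(${out_includes} "${_includes}" PARENT_SCOPE)
  set(${out_defs} "${_defs}" PARENT_SCOPE)
endfunction()
</pre>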
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Second bug (fixed by patch #2)<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">In a couple of spots, CUDA_WRAP_SRCS assumes that the passed-in target name is unique – for example, the name of the directory containing the intermediate output is built from the target name. However, the CUDA_COMPILE* macros always pass
the same hardcoded target name. So, if you call the same macro twice in a directory, some of the generated files from the second call will silently overwrite those from the first call.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I fixed this by adding a counter (_cuda_internal_phony_counter) as a directory property. The counter gets incremented every time cuda_compile_base is called, and the value of the counter is appended to the hardcoded target name that gets
passed to CUDA_WRAP_SRCS. This ensures that each call to the macro has its own unique target name.<o:p></o:p></p>
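<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal">The counter scheme can be sketched as follows. Again, this is an illustration and not the patch itself: the property name is the one mentioned above, but the function name and the base name "cuda_compile_" are assumptions.<o:p></o:p></p>
<pre>
# Hypothetical helper, for illustration only.
function(sketch_next_phony_name out_name)
  # An unset directory property reads back as an empty string.
  get_directory_property(_counter _cuda_internal_phony_counter)
  if(NOT _counter)
    set(_counter 0)
  endif()
  math(EXPR _counter "${_counter} + 1")
  # Store the new value so the next call in this directory sees it.
  set_directory_properties(PROPERTIES _cuda_internal_phony_counter ${_counter})
  # "cuda_compile_" is an assumed base name.
  set(${out_name} "cuda_compile_${_counter}" PARENT_SCOPE)
endfunction()

sketch_next_phony_name(_name_a)   # _name_a is cuda_compile_1
sketch_next_phony_name(_name_b)   # _name_b is cuda_compile_2
</pre>
<p class="MsoNormal">Because the counter is stored as a directory property, each directory gets its own sequence of names.<o:p></o:p></p>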
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Thanks!<o:p></o:p></p>
<p class="MsoNormal">Stephen Sorley<o:p></o:p></p>
</div>
</body>
</html>