[CMake] FindCUDA - creating .ptx and .cubin files as final build target

Sat Sep 11 20:52:44 EDT 2010

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

> 
> Alex, thanks for your interest.  There is an option called
> CUDA_BUILD_CUBIN, which builds the cubin along with the OBJ file, but it
> appears to be disabled for PTX compilation.  I'm not exactly sure why,
> and I even wrote it! 
> 
> I'm in the process of adding a new target type to the CUDA_WRAP_SRCS
> macro, which currently supports only OBJ and PTX, so that it can also
> build CUBINs.  In the meantime, you have two options:
> 
> If you want to modify FindCUDA.cmake, you can edit the following lines
> (around line 1049 - depending on your version):
> 
>       set(build_cubin OFF)
>       if ( NOT CUDA_BUILD_EMULATION AND CUDA_BUILD_CUBIN )
>          # comment out this line
>          # if ( NOT compile_to_ptx )
>            set ( build_cubin ON )
>          #  and this line
>          # endif( NOT compile_to_ptx )
>       endif( NOT CUDA_BUILD_EMULATION AND CUDA_BUILD_CUBIN )
> 
> If you can't edit your FindCUDA.cmake script, you can add a new compile
> step that generates the CUBIN from your PTX.
> 
> # Compile the CUDA code to PTX. <my_target> is just a string used to set
> # both the shared library flag <my_target>_EXPORTS and the generated
> # file names' prefixes.
> CUDA_WRAP_SRCS(my_target PTX generated_ptx_files
>   myfile.cu myfile2.cu myfile3.cu)
> 
> # FindCUDA doesn't look for ptxas, but you can do it yourself:
> find_program(CUDA_PTXAS_EXECUTABLE NAMES ptxas
>   PATHS "${CUDA_TOOLKIT_ROOT_DIR}" PATH_SUFFIXES bin)
> 
> # Now set up the build rules to compile the PTX to CUBINs.
> set(generated_cubin_files)
> foreach(ptx_file ${generated_ptx_files})
>  # You can get creative and use things like get_filename_component() to
>  # strip off the ptx from the filename.
>  set(generated_file "${ptx_file}.cubin")
>  add_custom_command(
>    OUTPUT ${generated_file}
>    # The cubin depends on the PTX file it is generated from
>    MAIN_DEPENDENCY "${ptx_file}"
>    # Here's the ptxas command
>    COMMAND "${CUDA_PTXAS_EXECUTABLE}" "${ptx_file}" -o "${generated_file}"
>    COMMENT "Generating ${generated_file}"
>    )
>  list(APPEND generated_cubin_files "${generated_file}")
> endforeach()
> 
> Then make sure that you add your CUDA files, ptx files, and
> generated_cubin_files to your library or executable target.
> 
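To make that last step concrete, here is a minimal sketch of one way to
wire it up. It assumes a CMake release that still ships the FindCUDA
module. The target name "cubins" is hypothetical, and
string(REGEX REPLACE) is used in place of get_filename_component() to
swap the .ptx suffix for .cubin. Instead of adding the generated files
to a library or executable, it attaches them to a custom target that is
part of the default build, which is only one of several possible
arrangements:

```cmake
# Sketch only: "cubins" is a hypothetical target name.
find_package(CUDA REQUIRED)

# Compile the CUDA sources to PTX; the resulting file names are
# returned in generated_ptx_files.
CUDA_WRAP_SRCS(my_target PTX generated_ptx_files
  myfile.cu myfile2.cu myfile3.cu)

# FindCUDA does not locate ptxas, so look for it next to nvcc.
find_program(CUDA_PTXAS_EXECUTABLE NAMES ptxas
  PATHS "${CUDA_TOOLKIT_ROOT_DIR}" PATH_SUFFIXES bin)

set(generated_cubin_files)
foreach(ptx_file ${generated_ptx_files})
  # Replace the trailing ".ptx" with ".cubin".
  string(REGEX REPLACE "\\.ptx$" ".cubin" cubin_file "${ptx_file}")
  add_custom_command(
    OUTPUT "${cubin_file}"
    DEPENDS "${ptx_file}"
    COMMAND "${CUDA_PTXAS_EXECUTABLE}" "${ptx_file}" -o "${cubin_file}"
    COMMENT "Generating ${cubin_file}"
    VERBATIM)
  list(APPEND generated_cubin_files "${cubin_file}")
endforeach()

# Build the PTX and CUBIN files as part of the default "all" target.
add_custom_target(cubins ALL DEPENDS ${generated_cubin_files})
```

Because each cubin rule depends on its PTX file, building "cubins"
also triggers the PTX rules created by CUDA_WRAP_SRCS.
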
Thanks a lot for your suggestions. I really appreciate you taking the
time to answer my question so thoroughly. I took the path of editing my
CMakeLists.txt.

I'm still struggling a bit with some details, but I think I can handle
the rest with some thought on my part. :)
I will share my CMakeLists.txt file with the list once I have
everything nailed down.

> If you have the PTX, why do you need to generate CUBINs at all?  You
> should be able to let the driver compile the CUBIN and be ready to go no
> matter what device you have created.
> 
Compiling .ptx at runtime is an expensive operation, even for simple
code, and it has to be done separately for every GPU (yes, I'm a
multi-GPU user). Compiling once per GPU is not a big problem in itself,
since efficient and readable CUDA device management needs
multithreading anyway, but spending 0.6 seconds to compile a .ptx does
not fit with doing things as fast as they can be done (like 0.5 TFLOP/s
on dual 9800GT). The .cubin files are the first thing I attempt to load.
If by some miraculous chance they are incompatible with the GPU
architecture (say, the cubins were built for compute 1.1, but the GPU is
compute 2.0), then the .ptx is the fallback path. Ideally, the ptx
should _never_ be touched.
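
One way to reduce how often the fallback is hit is to pre-build a cubin
per architecture. The following is only a sketch under several
assumptions: generated_ptx_files and CUDA_PTXAS_EXECUTABLE are already
set as in the quoted snippet, the PTX was generated for the lowest
architecture in the list, and the installed toolkit still accepts the
sm_11 and sm_20 names (the compute 1.1 and 2.0 devices mentioned above).
The variable cubin_archs and the target name cubins_per_arch are
hypothetical:

```cmake
# Hypothetical list of architectures to pre-build cubins for.
set(cubin_archs sm_11 sm_20)

set(per_arch_cubin_files)
foreach(ptx_file ${generated_ptx_files})
  foreach(arch ${cubin_archs})
    # For example: foo.cu.ptx -> foo.cu.sm_11.cubin
    string(REGEX REPLACE "\\.ptx$" ".${arch}.cubin"
      cubin_file "${ptx_file}")
    add_custom_command(
      OUTPUT "${cubin_file}"
      DEPENDS "${ptx_file}"
      # -arch selects the GPU the PTX is assembled for.
      COMMAND "${CUDA_PTXAS_EXECUTABLE}" -arch=${arch}
              "${ptx_file}" -o "${cubin_file}"
      COMMENT "Assembling ${cubin_file}"
      VERBATIM)
    list(APPEND per_arch_cubin_files "${cubin_file}")
  endforeach()
endforeach()

# Build every per-architecture cubin as part of the default build.
add_custom_target(cubins_per_arch ALL DEPENDS ${per_arch_cubin_files})
```

At load time the application could then pick the cubin whose suffix
matches the device's compute capability and keep the .ptx as the last
resort.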

This shouldn't be much of an issue, considering the source is GPLv3, but
most people just want double-click functionality, so I cannot possibly
predict all the GPU configurations that I may encounter. This is the
safest path, and a good exercise towards bullet-proof programming practice.

I might actually try to implement a persistent way to compile a .ptx the
first time the application is run, and just load the customized .cubin
afterward, but I've been so wrapped up in other issues that I
haven't written any new code in the last six months. My priority right
now is getting used to cmake. I was getting very tired of editing
makefiles and Visual Studio project files, and making sure one didn't
break the other.

While I'm at it, if you want to poke fun at me :p , my project is hosted
here:
http://code.google.com/p/electromag-with-cuda/

Alex

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQIcBAEBAgAGBQJMjCRXAAoJEL0kPHNPXJJKIiwQAIzpzEYRoPga3AIRR/1Je5a1
xdSvdDZM90iT9YW1sqIRfNrMXeCHjltb5/ZSfYmRs/gKSyAx9okPiJW9cSu12vWF
YBFuqxGOvOqMb+WQ8Nabi+n39GDalK/bqOfUJ5N0JJPImnA1rVH8xiS78JyCYXbH
ZEjSJT1vGV14fYvZpBOvSnRt6lkEtpX3c3zlIzkA7zRgXXu7o7p0/Mn9v2yPN7DA
ABq2XPcchDsN+RN3rArKRcjjeuf5AKD9coXM6S4gZ7j8PvIneEdBcLxJLMVTHEJo
0GZJhCMkSfYsqeoUTwZZgkRHUKE0qfFQYq3jnA8HP/AHT3eyEkJsJCfEGHCdHApO
879gi6k1C0f+nioQNedQfqczZwCPGBhPZaDjVWpoyjqOA5Al9917LBuR6mUeyL/b
UE3bNY4vlSH3Hs6BYRTYS4h0SRNcOU931/5bBCeoBJ9Hc7s6oIjRx6Bgj+65+bVU
CJ9F6v03k4MniS5yHFi/rrCiFmlnYp/nSPruiGEfs2C5cgPZFX3+OEmnSAAIzb6a
J7shLD1zaCwwq8gIn6P14jEKsnw6S+ExaP9lqOxHUMqRj7xHX6JJjeozPPXvVCcX
L/rpygoBtRPgttklEminHka+X5VBn92EE4ts+OrIj4NdRvGa4ob3JrUW5QLk5/oK
2n4MFn5kz7xW1stDpzfx
=tiy6
-----END PGP SIGNATURE-----