[CMake] Fwd: Save stripped debugging information

Michael Hertling mhertling at online.de
Sun Oct 16 08:40:27 EDT 2011


On 10/14/2011 12:58 AM, Rolf Eike Beer wrote:
> Am Donnerstag, 13. Oktober 2011, 20:09:04 schrieb Michael Hertling:
>> On 10/12/2011 03:40 PM, Rolf Eike Beer wrote:
>>>> On 10/03/2011 09:18 AM, Yuri Timenkov wrote:
>>>>> Hi Michael,
>>>>>
>>>>> On Sun, Oct 2, 2011 at 6:07 PM, Michael Hertling
>>>>> <mhertling at online.de> wrote:
>>>>>> On 10/01/2011 10:07 AM, Yuri Timenkov wrote:
>>>>>>> that's the problem: you know neither the file name nor its
>>>>>>> location, especially in multi-configuration generators.
>>>>>>
>>>>>> You *do* know a debug file's name and location, either because you
>>>>>> must generate it explicitly (objcopy approach) or via the
>>>>>> concerned target's $<TARGET_FILE_DIR:...> generator expression in
>>>>>> custom targets/commands (Visual approach). Otherwise, a debug
>>>>>> file with unknown name and/or location would be rather useless.
>>>>>
>>>>> I'm sorry that you misunderstood me, but I was concerned about
>>>>> install() commands.
>>>>>
>>>>> CMake already does a lot of things to make different platforms and
>>>>> generators work in same way. As it was noted before, install already
>>>>> knows
>>>>> about such subtleties between generators and platforms like export
>>>>> libraries. So why not make it aware of separate debug information?
>>>>
>>>> Because, IMO, the platform/generator independent extraction and/or
>>>> recognition of debug info files is significantly more difficult to
>>>> specify than the handling of DLLs along with their import libraries
>>>> in VS, e.g. AFAICS, the latter works well since the import libraries
>>>> are generated next to their DLLs sharing the same base name, and both
>>>> are covered by INSTALL()'s RUNTIME/ARCHIVE DESTINATION clauses which
>>>> also have a defined meaning on *nix. For VS alone, the installation
>>>> of debug info files would in fact be as easy as the installation of
>>>> an import library, but how would you specify and parameterize this
>>>> task on *nix? With the GNU toolchain, CMake can't know in advance if
>>>> and how debug info files are generated, where they're written to etc.
>>>
>>> I think you got something reverse here.
>>>
>>> To actually get the debug information in a separate file with binutils
>>> you need to run objcopy. [...]
>>
>> No, you can use the full symbol-equipped executable as debug info file:
>>
>> <cite src="man objcopy">
>>
>> Also the --only-keep-debug step is optional. You could instead do this:
>>
>>    1. Link the executable as normal.
>>    2. Copy foo to foo.full
>>    3. Run objcopy --strip-debug foo
>>    4. Run objcopy --add-gnu-debuglink=foo.full foo
>>
>> i.e., the file pointed to by the --add-gnu-debuglink can be the full
>> executable. It does not have to be a file created by the
>> --only-keep-debug switch.
>>
>> </cite>
>>
>> As you can see, the debug info file is *not* generated by objcopy; it's
>> just a copy of the original non-stripped binary. The --only-keep-debug
>> switch is virtually the complement of --strip-debug; essentially, it
>> removes non-debug-related sections - convenient but not necessary.
> 
> Yes it's a copy. So someone has to do this. And to know the location. And this 
> is CMake. So CMake will do the copy, call the objdump and do other fancy 
> things. But it is completely driven by CMake, so CMake knows which files are 
> involved. It's not that at any point CMake is calling a program that would 
> magically create files at places that CMake doesn't know of before, all output 
> filenames will be specified by CMake.

Actually, my intention was to point out that generating debug info
files on *nix is not equivalent to invoking objcopy; there're other
methods, and we do not know which utility Y the toolchain X uses for
this purpose. Apart from that, we can readily agree on the assumption
that debug info files have certain names on *nix which are also known
to CMake if this is believed to ease the further discussion.

> The only thing that may "magically" happen is the generation of the build id. 
> But that will not create a file on its own, and CMake can easily read the 
> build id out of the generated executable. The file named after the build id is 
> later created by cp, objcopy, or CMake. And once again CMake would know about 
> the name.

Indeed, the problem w.r.t. the build ID is not an additional file, but
the fact that (1) it must already be taken into account in the linking
step, (2) it must be provided with a user-supplied value and (3) the
debug info file must be installed to or linked from a certain directory.
If a cross-platform unification is to make sense, you must enhance the
link step as well as the install step. Suppose this CMakeLists.txt excerpt:

ADD_EXECUTABLE(main main.c)
INSTALL(TARGETS main RUNTIME DESTINATION bin DEBUG DESTINATION ...)

Suppose further a *nix user A wants to create and install a debug file
for main with objcopy --only-keep-debug/--add-gnu-debuglink, whereas
user B wants to do this with a build ID. What do you suggest that A
and B should add to their respective CMakeLists.txt to accomplish
their goals? Just conceptually, no implementation details.
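
For reference, a minimal sketch of what user A has to write today,
without any DEBUG DESTINATION support, might look as follows. It
assumes a single-configuration Makefile generator (so main ends up in
CMAKE_CURRENT_BINARY_DIR) and that CMake has found objcopy and set
CMAKE_OBJCOPY; the main.debug name is merely the naming convention
discussed in this thread, not something CMake knows about.

```cmake
cmake_minimum_required(VERSION 2.8.4)
project(main C)

add_executable(main main.c)

# After linking: split the debug info off into main.debug, strip it
# from the binary, and record a debug link pointing at main.debug.
add_custom_command(TARGET main POST_BUILD
    COMMAND ${CMAKE_OBJCOPY} --only-keep-debug
            $<TARGET_FILE:main> $<TARGET_FILE:main>.debug
    COMMAND ${CMAKE_OBJCOPY} --strip-debug $<TARGET_FILE:main>
    COMMAND ${CMAKE_OBJCOPY}
            --add-gnu-debuglink=$<TARGET_FILE:main>.debug
            $<TARGET_FILE:main>
    VERBATIM)

install(TARGETS main RUNTIME DESTINATION bin)

# Assumption: single-configuration generator, so the path is known at
# configure time. The debug file is installed next to the binary.
install(FILES ${CMAKE_CURRENT_BINARY_DIR}/main.debug DESTINATION bin)
```

Note that every line after add_executable() is toolchain-specific, and
that the last install() already breaks with multi-configuration
generators.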

>>> -optionally, and only on installation, check if the file has a build-id
>>> (using readelf, objdump or whatever) [...]
>>
>> objdump -s -j .note.gnu.build-id <binary>
>>
>> BTW, this would make INSTALL() depend on objdump or another tool.
> 
> If you specified to put the debug info in a separate file, yes. Where is the 
> problem? CTest depends on gcov if I want coverage information. It depends on 
> valgrind if I want memleak checks. If I want debug symbols, I would need to 
> have obj* around. Since I can't get it in another way anyway, where is the 
> problem?

What does CTest have to do with it? We are talking about CMake's
INSTALL() command, and on *nix, AFAICS, the strip utility is the only
external tool the INSTALL() command relies on. Unlike obj{copy,dump},
strip is specified by POSIX, albeit intentionally only loosely, and
CMake invokes it without options, i.e. there are no STRIP_FLAGS.
Furthermore, strip's invocation is hard-coded in CMake's C++ code base,
essentially in Source/cmInstallTargetGenerator.cxx, and I wonder whether
it's wise to add dependencies on other toolchain-specific non-POSIX
utilities to the INSTALL() command in this way.

>>> [...] and place a link from
>>> /usr/lib/debug/.build-id (or wherever) to the debug file
>>
>> The fact that a binary contains a build ID does not necessarily mean
>> that it is or has a debug info file; a build ID can also serve other
>> identificatory purposes. So, this check will probably turn out to be
>> insufficient since these cases can't be distinguished automatically.
> 
> The build id indicates where the debug info should go to if it is generated. 
> If I don't specify to install debug information CMake doesn't need to check 
> for it. If I enable this CMake can look and knows to which place to install 
> the files. I still don't see any point where CMake wouldn't know the filenames 
> to put the files to in advance (i.e. before it calls the tools that would 
> actually do this).

Again: The mere presence of a build ID does *not* indicate that the
user intends to generate a debug info file or even wants one to be
installed. See ld's manpage for the --build-id option, and you will
never once read the word "debug". However, you will read the words
"uuid", "sha1" and "md5": GNU's ld may embed a build ID for quite
arbitrary purposes, and its presence must not initiate any actions
by CMake on its own. It's well possible that one wants to generate
a debug info file by objcopy --only-keep-debug/--add-gnu-debuglink
and provides LINK_FLAGS=--build-id=0x... without wanting to have
the file placed in a build-id subdirectory; this information must
be passed to CMake additionally. The question is still how CMake
should handle a build ID for connecting binaries with their debug
info files, not how to learn of the latter's names - as mentioned
above, let's assume that they're known. A possible approach I can
imagine is a DEBUG_BUILD_ID target property, but this would need
to be transformed to "--build-id=0x..." for the linker's command
line and also taken into account by the INSTALL() command. Where
would you suggest that this logic should be implemented?
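
To illustrate what such a property would have to expand to, here is a
sketch of how a project can emulate it by hand today. DEBUG_BUILD_ID
itself does not exist; the BUILD_ID variable, its value and the
lib/debug prefix are made up for illustration. A GNU toolchain driven
through the compiler (hence -Wl,) and a single-configuration generator
are assumed, and the .build-id/xx/yyyy.debug layout is the one GDB
searches below its global debug directory.

```cmake
cmake_minimum_required(VERSION 2.8.4)
project(main C)

add_executable(main main.c)

# Hypothetical user-supplied build ID (an even number of hex digits).
set(BUILD_ID abcdef0123456789)

# Link step: hand the ID to GNU ld through the compiler driver.
set_target_properties(main PROPERTIES
    LINK_FLAGS "-Wl,--build-id=0x${BUILD_ID}")

# Produce the debug info file; objcopy is only one possible method.
add_custom_command(TARGET main POST_BUILD
    COMMAND ${CMAKE_OBJCOPY} --only-keep-debug
            $<TARGET_FILE:main> $<TARGET_FILE:main>.debug
    VERBATIM)

# Install step: the first two hex digits name the subdirectory, the
# remaining digits name the file.
string(SUBSTRING ${BUILD_ID} 0 2 ID_DIR)
string(SUBSTRING ${BUILD_ID} 2 -1 ID_FILE)

install(TARGETS main RUNTIME DESTINATION bin)
install(FILES ${CMAKE_CURRENT_BINARY_DIR}/main.debug
        DESTINATION lib/debug/.build-id/${ID_DIR}
        RENAME ${ID_FILE}.debug)
```

The same value has to reach both the link line and the install rule,
which is exactly the coupling a built-in solution would need to provide
without hard-coding GNU specifics.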

>>> So AFAICT there is no magic knowledge or searching for something
>>> involved
>>> at all.
>>>
>>> Or am I getting something seriously wrong here?
>>
>> To my mind, the discussion is not about debug-link vs. build-ID or
>> objcopy vs. not-objcopy; these are just examples for the diversity
>> of the debug info files' generation and processing. Instead, the
>> discussion is actually about:
>>
>> (1) Should CMake be taught to generate debug info files by itself?
>>     While this is trivial with VS, it is - IMO - hardly feasible on
>>     *nix in a manner that does not restrict the user's possibilities.
>>
>> (2) Should CMake just install debug info files, the latter generated
>>     before by the project? While this would probably be quite easy to
>>     implement for VS, it is - IMO - hardly feasible on *nix without
>>     (1) as CMake can't know the files' names and locations a priori.
>>     Of course, one might agree on a convention that the debug info
>>     file of a target xyz is to be named xyz.debug and placed next
>>     to xyz.
> 
> Where is the problem in this? Doing the installation to the correct location 
> according to the build id could be an (optional) add-on if a build-id is 
> present.

If the build ID is actually intended to connect the binary with the
debug info file: Okay, but again, where should this be implemented
without introducing toolchain-specific code to CMake's C++ code
base? Recall that strip is hard-coded but a POSIX standard, and
it's invoked in a POSIX-compliant way, i.e. without options.

>> However, a possible approach to this topic I could imagine is the
>> following: Add a new target property DEBUG_INFO_FILES[_<CONFIG>],
>> perhaps initialized with $<TARGET_FILE_DIR:target>.pdb in VS and
>> $<TARGET_FILE_DIR:target>.debug on *nix. If this property is set,
>> this means that the denoted debug info files are to be generated,
>> and INSTALL(TARGETS target DEBUG DESTINATION ...) would know where
>> to find them. Regarding the Makefile generators, we would probably
>> need an additional rule variable CMAKE_<LANG>_GENERATE_DEBUG_INFO
>> or the like, but this would not account for an early role of the
>> linker with potential user-supplied parameters. If you have any
>> suggestions how to conceptually realize it, I'd be interested.
> 
> Sounds not too bad.

If we come back to Pawel's objection w.r.t. stripped PDB files in VS,
how should CMake cope with toolchains that create more than one debug
info file per target? VS's /PDBSTRIPPED option is a good example, and
also on *nix, one can generate multiple files with different sections
removed; objcopy --only-keep-debug is just an extreme. How does this
relate to a convention about the files' names for CMake to know, and
how should INSTALL(TARGETS ... DEBUG DESTINATION ...) behave if the
files are not to be installed to the same location? The latter is
quite probable with full and stripped PDB files in VS, IMO.

Regards,

Michael

