MantisBT - CMake
View Issue Details
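ID: 0015735
Project: CMake
Category: CMake
View Status: public
Date Submitted: 2015-09-10 06:50
Last Update: 2016-02-01 09:10
Reporter: Pavel Solodovnikov
Assigned To: Brad King
Priority: high
Severity: major
Reproducibility: always
Status: closed
Resolution: fixed
Product Version: CMake 2.8.11
Target Version: CMake 3.4
Fixed in Version: CMake 3.4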
0015735: Non-ascii characters in POST_BUILD commands truncated by CMake (if Ninja generator used) on Linux
CMake doesn't handle backslash-escaped strings properly in POST_BUILD commands added via "add_custom_command" if a command ends with non-ASCII characters.

Here is a simple example that reproduces the issue, a trivial CMakeLists.txt with one library target:

cmake_minimum_required(VERSION 2.8.11)

project(example)

add_library(ex abc.cpp)

add_custom_command(TARGET ex POST_BUILD COMMAND cmd1 "Пример с пробелами")

Here I want to execute the post-build command "cmd1" and pass it the parameter "Пример с пробелами", which contains Russian (Cyrillic) characters.

If the Ninja generator is specified, CMake generates an ill-formed "build.ninja" file. The relevant part of the generated build.ninja looks like this:

build libex.a: CXX_STATIC_LIBRARY_LINKER CMakeFiles/ex.dir/abc.cpp.o
  POST_BUILD = cd /mnt/sources/work/example && cmd1 Пример\ с\
  PRE_LINK = :
  TARGET_PDB = ex.a.dbg

The last word of the parameter "Пример с пробелами" has been cut off by CMake, back to the preceding escaped space, so the argument string is corrupted and my post-build command cannot run properly.

After debugging CMake a little, I found that the error originates in the function "cmSystemTools::TrimWhitespace" (cmSystemTools.cxx). It does not account for the fact that the individual characters of a std::string have type 'char', which is signed on most platforms.

Hence the following loop conditions are wrong:

  while(start != s.end() && *start <= ' ')
    ++start;

and

  while(*stop <= ' ')
    --stop;

If *stop falls outside the 0-127 (ASCII) range, as the bytes of the Russian letters in my example do, it is negative when 'char' is signed. The trimming condition is then true, and the loop silently chops the last word off my string, stopping only at the backslash that escapes the preceding space.
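
The effect can be reproduced outside CMake. The following is a minimal sketch, not the CMake source: `trimSignedCompare` and `treatedAsWhitespace` are hypothetical names, and the comparison is done on `signed char` explicitly so that the result does not depend on whether plain `char` happens to be signed on a given platform. The input is written as UTF-8 byte escapes for a shortened form of the argument from the report.

```cpp
#include <string>

// Hypothetical stand-in for the trimming loops quoted above (not the CMake
// source). The cast to `signed char` mimics a platform where plain `char`
// is signed, so every byte >= 0x80 compares as a negative number.
static bool treatedAsWhitespace(char c)
{
  return static_cast<signed char>(c) <= ' ';
}

static std::string trimSignedCompare(const std::string& s)
{
  std::string::size_type start = 0;
  while (start < s.size() && treatedAsWhitespace(s[start]))
    ++start;
  if (start == s.size())
    return std::string(); // nothing but "whitespace" in the input

  // s[start] is known not to be trimmed, so this loop stops at or after it.
  std::string::size_type stop = s.size() - 1;
  while (treatedAsWhitespace(s[stop]))
    --stop;
  return s.substr(start, stop - start + 1);
}

// UTF-8 bytes of a command whose last word is a single Cyrillic letter and
// whose spaces inside the argument are backslash-escaped.
static const char kEscapedCommand[] = "cmd1 \xD0\x9F\\ \xD1\x81\\ \xD0\xBF";
```

Passing `kEscapedCommand` to `trimSignedCompare` drops the trailing Cyrillic bytes and the space before them, leaving a string that ends in a dangling backslash, which matches the shape of the POST_BUILD line shown in build.ninja.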

In our projects we use such post-build commands very widely (for example, to copy the produced binaries to a directory that has a Cyrillic name), so this issue is rather critical for us.

I've attached a patch (patch.diff inside the uploaded archive) that resolves this issue. It simply converts *start and *stop explicitly to unsigned char before the comparison.
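
A minimal sketch of that kind of fix, assuming the loops have the shape quoted above. This is not the attached patch itself, and the function name `trimWhitespaceUnsigned` is hypothetical:

```cpp
#include <string>

// Hypothetical illustration of the proposed change (not the attached patch):
// each byte is converted to `unsigned char` before it is compared with ' ',
// so bytes >= 0x80 are seen as 128..255 and are never trimmed.
static std::string trimWhitespaceUnsigned(const std::string& s)
{
  std::string::size_type start = 0;
  while (start < s.size() && static_cast<unsigned char>(s[start]) <= ' ')
    ++start;
  if (start == s.size())
    return std::string(); // the input held only whitespace/control bytes

  // s[start] is known not to be trimmed, so this loop stops at or after it.
  std::string::size_type stop = s.size() - 1;
  while (static_cast<unsigned char>(s[stop]) <= ' ')
    --stop;
  return s.substr(start, stop - start + 1);
}
```

With this version, a command that ends in UTF-8 encoded Cyrillic text keeps its last word, while leading and trailing ASCII whitespace is still removed.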
1) Extract the attached example archive (example.zip).
2) Run `cmake -G "Ninja" .` inside the extracted directory.
3) Examine the generated build.ninja file.
The affected platform is Linux. On Windows, CMake uses double-quote escaping, so the issue does not arise.

The bug is present in CMake releases from 2.8.11 through the latest, 3.3.1 (at least judging from the sources).
No tags attached.
zip example.zip (1,264) 2015-09-10 06:50
https://public.kitware.com/Bug/file/5520/example.zip
Issue History
Date Modified | Username | Field | Change
2015-09-10 06:50 | Pavel Solodovnikov | New Issue |
2015-09-10 06:50 | Pavel Solodovnikov | File Added: example.zip |
2015-09-10 09:20 | Brad King | Note Added: 0039392 |
2015-09-10 09:20 | Brad King | Assigned To | => Brad King
2015-09-10 09:20 | Brad King | Status | new => resolved
2015-09-10 09:20 | Brad King | Resolution | open => fixed
2015-09-10 09:20 | Brad King | Fixed in Version | => CMake 3.4
2015-09-10 09:20 | Brad King | Target Version | => CMake 3.4
2015-09-10 09:46 | Clinton Stimpson | Note Added: 0039393 |
2015-09-10 09:54 | Brad King | Note Added: 0039394 |
2015-09-10 10:07 | Brad King | Note Added: 0039397 |
2015-09-10 11:15 | Clinton Stimpson | Note Added: 0039398 |
2015-09-11 04:53 | Pavel Solodovnikov | Note Added: 0039404 |
2015-09-11 04:59 | Pavel Solodovnikov | Note Edited: 0039404 |
2016-02-01 09:10 | Robert Maynard | Note Added: 0040394 |
2016-02-01 09:10 | Robert Maynard | Status | resolved => closed

Notes
(0039392)
Brad King   
2015-09-10 09:20   
Thanks. It looks like this was broken here:

 cmSystemTools: Generalize TrimWhitespace to all whitespace
 http://www.cmake.org/gitweb?p=cmake.git;a=commitdiff;h=674f918a

I've now fixed it here:

 cmSystemTools: Fix TrimWhitespace for non-ascii strings
 http://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=a0446d93

See the latter commit message for details.

You can work around this by using the VERBATIM option to add_custom_command:

 add_custom_command(TARGET ex POST_BUILD VERBATIM COMMAND cmd1 "Пример с пробелами")

That changes how the quoting works and happens to avoid this bug in TrimWhitespace.
(0039393)
Clinton Stimpson   
2015-09-10 09:46   
The patch looks good, except that isspace can trigger a debug assertion for non-ASCII characters when using the Microsoft runtime. Without looking at the code, I believe the is*() calls elsewhere are preceded by an ASCII check.
(0039394)
Brad King   
2015-09-10 09:54   
Thanks, Clinton. Indeed one can see a similar fix for isspace you made here:

 Encoding: Fix debug asserts parsing command line options with non-ascii chars.
 http://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=b6b493a4
(0039397)
Brad King   
2015-09-10 10:07   
Here is an updated fix:

 cmSystemTools: Factor out a cm_isspace helper
 http://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=87a9061d

 cmSystemTools: Fix TrimWhitespace for non-ascii strings
 http://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=9c4a500f
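
For readers who do not want to open the commits, the idea discussed in the notes above (check that a byte is ASCII before handing it to isspace) might be sketched as follows. This is an illustration under assumptions, not the code from the linked commits: the names `isAsciiSpace` and `trimAsciiWhitespace` are hypothetical.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns true only for ASCII whitespace. The range
// check runs first, so std::isspace never sees a byte >= 0x80, and the
// value passed to it is always representable as unsigned char.
static bool isAsciiSpace(char c)
{
  const unsigned char uc = static_cast<unsigned char>(c);
  return uc < 0x80 && std::isspace(uc) != 0;
}

// Hypothetical trim built on the helper above.
static std::string trimAsciiWhitespace(const std::string& s)
{
  std::string::size_type start = 0;
  while (start < s.size() && isAsciiSpace(s[start]))
    ++start;
  if (start == s.size())
    return std::string(); // the input held only ASCII whitespace

  // s[start] is known not to be whitespace, so this loop stops at or after it.
  std::string::size_type stop = s.size() - 1;
  while (isAsciiSpace(s[stop]))
    --stop;
  return s.substr(start, stop - start + 1);
}
```

Unlike a plain `<= ' '` comparison, this variant trims only what isspace classifies as whitespace, so other control characters are left in place.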
(0039398)
Clinton Stimpson   
2015-09-10 11:15   
Looks good to me.
(0039404)
Pavel Solodovnikov   
2015-09-11 04:53   
(edited on: 2015-09-11 04:59)
Thanks for the quick response. The VERBATIM workaround solves the issue only partially, because its behavior on Windows differs and there are some problems with it.

But I've managed to work around this by appending '/' to the end of the parameter string (in my case it's always a directory name), so that CMake does not trim the name.

(0040394)
Robert Maynard   
2016-02-01 09:10   
Closing resolved issues that have not been updated in more than 4 months.