View Issue Details | ||||||||
ID | Project | Category | View Status | Date Submitted | Last Update | ||||
0013806 | CMake | CMake | public | 2012-12-20 08:52 | 2016-06-10 14:31 | ||||
Reporter | Andreas Mohr | ||||||||
Assigned To | Kitware Robot | ||||||||
Priority | none | Severity | text | Reproducibility | always | ||||
Status | closed | Resolution | moved | ||||||
Platform | PC | OS | Linux | OS Version | RHEL5 | ||||
Product Version | CMake 2.8.10.2 | ||||||||
Target Version | Fixed in Version | ||||||||
Summary | 0013806: list(SORT) produces unavoidable data corruption (likely root cause: improper semi-colon string *payload* handling in CMake) | ||||||||
Description | I just wanted to extend a test case for my work on a new ENVIRONMENT option to add_custom_command() (adding many "interesting" env var key/value tests for escaping of XML, batch, shell, ... specials), and ended up realizing that any escaped semi-colon in string content ends up getting broken, hard, by CMake.

The test provided below produces the following output:

    $ cmake .
    tlist pre-sort:
    hi;there
    cruel;world

    tlist post-sort:
    cruel
    world
    hi
    there

    extra_escaped_list pre-sort:
    hi;there
    cruel;world

    extra_escaped_list post-sort:
    cruel
    world
    hi
    there

    extra_escaped_list_i_mean_it pre-sort:
    hi\\
    there
    cruel\\
    world

    extra_escaped_list_i_mean_it post-sort:
    cruel\\
    hi\\
    there
    world

    -- Configuring done
    -- Generating done

This shows that any SORT of the list causes unsurvivable data corruption, which is a major, INSURMOUNTABLE problem (as the futile attempts at backslash-escaping in the test case below illustrate) when attempting to compile a list of patterns for escape tests towards rather unrelated system components.

This widespread data corruption by CMake core layers should be fixed quickly (we're now at 2.8.x, finally having reached good usability on the very large number of systems that CMake supports, and I really wouldn't have expected such data corruption issues to have remained). I'd deem data corruption bugs to be near the top of the priority scale (with perhaps actual security issues topping it), hence I am assigning priority urgent.

(I did read "[sldev] Semicolons and CMake", https://lists.secondlife.com/pipermail/sldev/2009-April/013502.html, and have to admit I walked away unconvinced.)

This should possibly be handled by introducing a new case to the CMake policy mechanism, to preserve the (reportedly quite important in this case) bug-for-bug compatibility in older code.

If it isn't possible to fix this problem cleanly (or e.g. not in a single evolution step), then one should think of other possibilities to work around the currently unavoidable data corruption. One way might be to introduce a special CMAKE_ESCAPE_* variable which, when inserted, marks content in a special manner to ensure proper handling. Or possibly one could add new cmake_escape_*() function helpers rather than resorting to often unclean global-variable-based handling.

Thanks!
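As a rough illustration of the helper-function idea above, the following script-mode sketch hides semicolons behind a placeholder before elements enter a list and restores them afterwards. The function names encode_semicolons() and decode_semicolons() and the <SEMI> token are hypothetical (they are not part of CMake), and the sketch assumes that the payload never contains the token itself.

```cmake
# Run in script mode: cmake -P semicolon_workaround.cmake
# Hypothetical placeholder; assumed never to occur in real payload.
set(SEMI_TOKEN "<SEMI>")

# Replace every ';' in VALUE with the placeholder so that list commands
# treat the whole value as one opaque element.
function(encode_semicolons out_var value)
  string(REPLACE ";" "${SEMI_TOKEN}" encoded "${value}")
  set(${out_var} "${encoded}" PARENT_SCOPE)
endfunction()

# Restore the original ';' characters in a single element.
function(decode_semicolons out_var value)
  string(REPLACE "${SEMI_TOKEN}" ";" decoded "${value}")
  set(${out_var} "${decoded}" PARENT_SCOPE)
endfunction()

# Build the list from encoded elements only.
set(tlist "")
encode_semicolons(item "hi;there")
list(APPEND tlist "${item}")
encode_semicolons(item "cruel;world")
list(APPEND tlist "${item}")

# The sort now sees two elements that contain no ';' payload.
list(SORT tlist)

# Decode each element only at the point of use.
foreach(elem_ ${tlist})
  decode_semicolons(payload "${elem_}")
  message("${payload}")
endforeach()
```

With the sort applied to the encoded elements, the loop should print cruel;world followed by hi;there. The decoded value must not be stored back into a list, since that would reintroduce the original problem.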
Steps To Reproduce |

    cmake_minimum_required(VERSION 2.8)
    project(list_escape_semicolon_test NONE)

    function(show_list _title _list)
      message("${_title}:")
      foreach(elem_ ${_list})
        message("${elem_}")
      endforeach(elem_ ${_list})
      message("")
    endfunction(show_list _title _list)

    function(process_list _name _list)
      show_list("${_name} pre-sort" "${_list}")
      list(SORT _list)
      show_list("${_name} post-sort" "${_list}")
    endfunction(process_list _name _list)

    function(test_list_sort_escaping)
      set(tlist "")
      list(APPEND tlist "hi\;there")
      list(APPEND tlist "cruel\;world")
      process_list(tlist "${tlist}")

      set(tlist "")
      list(APPEND tlist "hi\\;there")
      list(APPEND tlist "cruel\\;world")
      process_list(extra_escaped_list "${tlist}")

      set(escape_string "\\\\")
      set(tlist "")
      list(APPEND tlist "hi${escape_string};there")
      list(APPEND tlist "cruel${escape_string};world")
      process_list(extra_escaped_list_i_mean_it "${tlist}")
    endfunction(test_list_sort_escaping)

    test_list_sort_escaping()
Additional Information | That's now the second (and unrelated) time in about two weeks that I have stumbled (and fallen) over this; the first time was when reading in a file with semi-colon payload via file(STRINGS) and iterating over the elements of the resulting list. | ||||||||
Tags | No tags attached. | ||||||||
Notes | |
(0031931) Brad King (manager) 2012-12-20 09:23 |
CMake is not a general-purpose data processing language. Semicolons are simply not supported very well in list values (nor are square brackets). Semicolon-separated lists were originally created to handle lists of source files, e.g.

    set(srcs a.c b.c)
    add_executable(a ${srcs})

so the implementation, created in the early days, did not take use cases beyond that into account. Back then CMake was used only for our own projects so we didn't need anything more robust.

Fixing this will require at least the following:

1. Teach list expansion parsing to handle escapes correctly. Replace '\\' with '\' and '\;' with ';' without dividing. There is a partial implementation of this already, but it doesn't quite work right.
2. Teach list construction to generate escapes correctly. Replace '\' with '\\' and ';' with '\;' in values. Even if this is fixed in the C++ list construction cases, there will still be projects that make their own lists via string manipulation, whose behavior would be changed by step 1.
3. I'm not sure what to do about the []-nesting cases.
4. I'm not sure what to do about backward compatibility. At least a policy will be needed, but it could be quite difficult to get right.
(0031938) Brad King (manager) 2012-12-21 08:45 |
Re 0013806:0031931: Having thought about this for a day my conclusion is that this is not worth fixing. The behavior has been this way for over 10 years and projects have gotten by with it and many may now depend on it. It's not even clear what the proper behavior would be if this were fixed because there are ambiguities. For example, does set(x "1;2;3") store a list of three elements, a list of one element containing semicolons, or a non-list string value? (This question is rhetorical.) Of course list behavior could have been implemented more carefully from the beginning but it is too late to change behavior this fundamental to the language. Data processing is not a design goal. I think the most we can do is improve documentation to warn about cases involving ';' inside values and nested inside '[]'. |
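The ambiguity can be seen in a short script-mode sketch that relies only on set(), list(LENGTH) and if(STREQUAL). The variable names are illustrative. A quoted value containing semicolons and a multi-argument set() end up as the same stored string, so an escape-aware fix would have to pick one interpretation and break the other.

```cmake
# Run in script mode: cmake -P list_ambiguity.cmake
set(x "1;2;3")   # one quoted argument that contains semicolons
set(y 1 2 3)     # three separate unquoted arguments

# list(LENGTH) counts semicolon-separated elements in the stored string.
list(LENGTH x x_len)
list(LENGTH y y_len)
message("x: value='${x}' length=${x_len}")
message("y: value='${y}' length=${y_len}")

# Once stored, nothing records how the value was originally written.
if("${x}" STREQUAL "${y}")
  message("x and y are indistinguishable once stored")
endif()
```

Both variables are expected to report a length of 3 and to compare equal, which is the existing behavior that projects may depend on.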
(0042177) Kitware Robot (administrator) 2016-06-10 14:28 |
Resolving issue as `moved`. This issue tracker is no longer used. Further discussion of this issue may take place in the current CMake Issues page linked in the banner at the top of this page. |
Issue History | |||
Date Modified | Username | Field | Change |
2012-12-20 08:52 | Andreas Mohr | New Issue | |
2012-12-20 09:23 | Brad King | Note Added: 0031931 | |
2012-12-21 08:45 | Brad King | Note Added: 0031938 | |
2012-12-21 08:45 | Brad King | Priority | urgent => none |
2012-12-21 08:45 | Brad King | Severity | major => text |
2012-12-21 08:45 | Brad King | Status | new => backlog |
2016-06-10 14:28 | Kitware Robot | Note Added: 0042177 | |
2016-06-10 14:28 | Kitware Robot | Status | backlog => resolved |
2016-06-10 14:28 | Kitware Robot | Resolution | open => moved |
2016-06-10 14:28 | Kitware Robot | Assigned To | => Kitware Robot |
2016-06-10 14:31 | Kitware Robot | Status | resolved => closed |