MantisBT - CMake
View Issue Details
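ID: 0005380
Project: CMake
Category: Documentation
View Status: public
Date Submitted: 2007-07-20 20:53
Last Update: 2016-06-10 14:30
Reporter: Brandon Van Every
Assigned To: Bill Hoffman
Priority: normal
Severity: text
Reproducibility: always
Status: closed
Resolution: moved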
0005380: REGEX ^ and $ do not match on multi-line <input>
# ^ and $ do not match at line boundaries inside a multi-line <input>.
# This makes it impractical to write a non-trivial line-oriented regex in
# CMake script, which in turn forces the user to find or install other
# tools, rather than keeping their scripting logic self-contained in CMake.
#
# ^ and $ work only with respect to the entire <input> to STRING().
# That is to say, an <input> is treated as one line: embedded newlines are
# not recognized as line boundaries, even if the <input> is read from a
# multi-line file. The following code snippet demonstrates that ^ and $
# match only at the beginning and end of a file when the file is read in
# as an <input> string.

FILE(WRITE in.txt
"line0
line1
line2")

FILE(READ in.txt stream)

STRING(REGEX MATCH "^line0" line0_start "${stream}")
STRING(REGEX MATCH "line0$" line0_end "${stream}")
STRING(REGEX MATCH "^line1" line1_start "${stream}")
STRING(REGEX MATCH "line1$" line1_end "${stream}")
STRING(REGEX MATCH "^line2" line2_start "${stream}")
STRING(REGEX MATCH "line2$" line2_end "${stream}")

MESSAGE("${line0_start}")
MESSAGE("${line0_end}")
MESSAGE("${line1_start}")
MESSAGE("${line1_end}")
MESSAGE("${line2_start}")
MESSAGE("${line2_end}")

#Output: only line0_start and line2_end match.
#
#C:\in\cbugs>cmake -P regex.cmake
#line0
#
#
#
#
#line2
#
#C:\in\cbugs>
No tags attached.
Issue History
2007-11-23 13:39 | Brandon Van Every | Note Added: 0009733
2007-11-23 13:39 | Brandon Van Every | Severity: major => text
2007-11-23 13:49 | Brandon Van Every | Category: CMake => Documentation
2016-06-10 14:27 | Kitware Robot | Note Added: 0041377
2016-06-10 14:27 | Kitware Robot | Status: assigned => resolved
2016-06-10 14:27 | Kitware Robot | Resolution: open => moved
2016-06-10 14:30 | Kitware Robot | Status: resolved => closed

Notes
(0008126)
Brandon Van Every   
2007-07-23 09:58   
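A workaround is to match the end of the line explicitly with \r?\n instead of $. In a quoted CMake argument this is written "\r?\n", because CMake converts the \r and \n escapes to the literal characters before the regex is compiled. Matching the carriage return zero or one times is necessary because the Unix end of line is just a linefeed (LF), whereas on Windows it is carriage return + linefeed (CR+LF).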
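The workaround can be sketched as follows. This is an illustration only, not code from the original report: the input is built in memory rather than read from in.txt, the "(^|\n)" prefix for the start of a line is an added assumption that mirrors the end-of-line pattern, and the variable names are reused from the report for convenience.

```cmake
# Sketch of the workaround: match line boundaries explicitly instead of
# relying on ^ and $ alone. Run in script mode with: cmake -P <file>

# Same three-line content as the report, built in memory (assumption:
# LF line endings; the end-of-line pattern below also tolerates CR+LF).
set(stream "line0\nline1\nline2")

# Start of a line: either the start of the string or just after a linefeed.
string(REGEX MATCH "(^|\n)line1" line1_start "${stream}")

# End of a line: an optional CR followed by LF, or the end of the string.
string(REGEX MATCH "line1(\r?\n|$)" line1_end "${stream}")

# The output variables are empty when nothing matched.
if(line1_start)
  message("line1 matched at the start of a line")
endif()
if(line1_end)
  message("line1 matched at the end of a line")
endif()
```

Note that the matched text includes the newline characters consumed by the pattern, so they may need to be stripped if only the line content is wanted.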
(0009733)
Brandon Van Every   
2007-11-23 13:39   
Looking at other regular expression libraries, such as PCRE, it appears that not matching line endings when faced with multi-line input is a valid, if regrettable and archaic, implementation choice. Rather than changing CMake's matching behavior, it should be documented. The documentation should warn that ^ and $ do not match at newlines, only at the beginning and end of the string, so that users of "modern" regular expressions know what's going on.
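Since ^ and $ anchor only to the beginning and end of a string, one way to get per-line anchoring is to hand the regex one line at a time. A minimal sketch, assuming file(STRINGS) and foreach(... IN LISTS ...) are available in the CMake version in use; this approach is not part of the original report, and the file location is chosen here only to keep the example self-contained.

```cmake
# Recreate the report's three-line input next to this script.
file(WRITE "${CMAKE_CURRENT_LIST_DIR}/in.txt" "line0\nline1\nline2")

# file(STRINGS) returns a list with one element per line,
# with the line endings removed.
file(STRINGS "${CMAKE_CURRENT_LIST_DIR}/in.txt" lines)

set(matched_lines "")
foreach(line IN LISTS lines)
  # Each element is its own single-line string, so ^ and $ anchor to it.
  string(REGEX MATCH "^line1$" hit "${line}")
  if(hit)
    list(APPEND matched_lines "${hit}")
    message("matched whole line: ${hit}")
  endif()
endforeach()
```

The trade-off is that a pattern can no longer span several lines, so this suits line-oriented matching only.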

(0041377)
Kitware Robot   
2016-06-10 14:27   
Resolving issue as `moved`.

This issue tracker is no longer used. Further discussion of this issue may take place in the current CMake Issues page linked in the banner at the top of this page.