MantisBT - CMake
View Issue Details
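ID: 0005537
Project: CMake
View Status: public
Date Submitted: 2007-08-19 01:58
Last Update: 2008-10-01 17:05
Reporter: Brandon Van Every
Assigned To: Bill Hoffman
Priority: normal
Severity: crash
Reproducibility: always
Status: closed
Resolution: won't fix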
0005537: REGEX MATCH and MATCHALL can be pathologically slow
STRING(REGEX MATCH ...) and STRING(REGEX MATCHALL ...) are pathologically slow when given regexes of the form "([a-z]+ *)+\r?\n", even for tiny inputs of around 30 characters. This pattern is used to detect a line containing a whitespace-separated list of words, which is extremely important when parsing files. With larger files, the regex can be so slow that CMake appears to hang indefinitely. Even a 3GHz PC with 1GB RAM can be brought to its knees.

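More generally, patterns of the form "([^a]+a*)+a" exhibit the problem. A workaround is to rewrite the nested group "([^a]+a*)+" as "[^a]+(a+[^a]+)*a*", which accepts the same strings but leaves only one way to split them between the repetitions. A .zip file containing a reproducer script and a sample input file is attached.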
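The claim that the rewritten form accepts the same strings can be checked by brute force. The sketch below uses Python's re module as a stand-in, not CMake's own regex engine, and the helper name same_language is hypothetical. It enumerates every string up to a small length over a small alphabet and compares full-match results for the nested and rewritten forms. The second pair of patterns in the usage lines applies the same rewrite to the word-list pattern from the description; that rewritten word-list pattern is an assumption, not something taken from the attached reproducer.

```python
import itertools
import re


def same_language(nested, flat, alphabet, max_len=7):
    """Return True if both patterns accept exactly the same strings
    of length 0..max_len built from the given alphabet."""
    nested_re = re.compile(nested)
    flat_re = re.compile(flat)
    for n in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=n):
            s = "".join(chars)
            # fullmatch requires the whole string to match, so this
            # compares the sets of accepted strings, not search hits.
            if bool(nested_re.fullmatch(s)) != bool(flat_re.fullmatch(s)):
                return False
    return True


# Generic form from the report: nested group vs. rewritten group.
generic_ok = same_language(r"([^a]+a*)+", r"[^a]+(a+[^a]+)*a*", "ab")

# Word-list form, rewritten the same way (assumed rewrite).
words_ok = same_language(
    r"([a-z]+ *)+\r?\n", r"[a-z]+( +[a-z]+)* *\r?\n", "a \n", max_len=6
)
```

The inputs are kept short on purpose: the nested forms are the slow ones, so an exhaustive check over long strings would run into the very problem being reported.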
No tags attached.
slow.zip (1,237) 2007-08-19 01:58
https://public.kitware.com/Bug/file/1114/slow.zip
Issue History
2007-08-19 01:58 | Brandon Van Every | New Issue
2007-08-19 01:58 | Brandon Van Every | File Added: slow.zip
2007-12-17 17:56 | Bill Hoffman | Status: new => assigned
2007-12-17 17:56 | Bill Hoffman | Assigned To: => Bill Hoffman
2007-12-31 04:46 | Brandon Van Every | Note Added: 0010034
2008-10-01 17:05 | Bill Hoffman | Status: assigned => closed
2008-10-01 17:05 | Bill Hoffman | Resolution: open => won't fix

Notes
(0010034) Brandon Van Every, 2007-12-31 04:46
Patterns of the form "a([^x]+)+a", where 'x' is any character other than 'a', also exhibit the problem. The problem appears to be due to the two nested levels of +. It also happens with "( *)+", "( +)*" and "( *)*". It doesn't happen when there is only one level of + or *.
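The difference between one and two levels of + can be observed on any backtracking matcher. The sketch below is a rough illustration using Python's re module as a stand-in for CMake's engine; the helper names timed_search and compare are hypothetical. It runs the nested pattern from this note and a single-level equivalent against an input that has no closing 'a', so every attempt has to fail, and reports how long each search took.

```python
import re
import time


def timed_search(pattern, text):
    """Return (matched, seconds) for a single re.search call."""
    start = time.perf_counter()
    matched = re.search(pattern, text) is not None
    return matched, time.perf_counter() - start


def compare(n):
    """Time the nested and single-level patterns on a failing input."""
    # 'a' followed by n 'b's and no closing 'a': the match must fail,
    # which forces a backtracking engine to try every way of splitting
    # the run of 'b's between the inner and outer +.
    text = "a" + "b" * n
    nested = timed_search(r"a([^x]+)+a", text)   # two levels of +
    single = timed_search(r"a[^x]+a", text)      # one level of +
    return nested, single


if __name__ == "__main__":
    # Keep n small: the nested form is expected to slow down sharply
    # as n grows on a backtracking engine.
    for n in (8, 12, 16):
        nested, single = compare(n)
        print(n, "nested: %.6fs" % nested[1], "single: %.6fs" % single[1])
```

Both patterns give the same match result on every input; only the amount of backtracking differs, which is why dropping the redundant outer + is a safe change for patterns of this shape.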