[cmake-developers] slow regex implementation in RegularExpression

Pau Garcia i Quiles pgquiles at elpauer.org
Wed Nov 16 13:04:31 EST 2011


On Wed, Nov 16, 2011 at 6:44 PM, Alexandru Ciobanu
<alex at rogue-research.com> wrote:

> As can be seen, re2 and the standard regex.h are orders of magnitude
> faster at executing this particular regular expression.
> The difference between PCRE and re2 is also confirmed by this study:
>     http://swtch.com/~rsc/regexp/regexp3.html

re2 does not implement some important PCRE features, such as
backreferences, lookahead, lookbehind and recursion:

http://en.wikipedia.org/wiki/Comparison_of_regular_expression_engines
http://developer.qt.nokia.com/wiki/Regexp_engine_in_Qt5

This is the reason Qt5 has discarded re2 as a replacement for QRegExp
and is considering adding UTF-16 support to PCRE instead.
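
To make those missing features concrete, here is a minimal sketch of a
backreference and a lookahead. It assumes C++11 std::regex with its
default ECMAScript grammar, used here only because it ships with the
compiler; it is not PCRE, and the function names are hypothetical.
Neither pattern can be expressed in re2.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Backreference: \1 must match the same text that group 1 captured,
// so this detects a word repeated twice in a row ("the the").
// Hypothetical helper name, for illustration only.
bool has_doubled_word(const std::string& text) {
    static const std::regex doubled(R"(\b(\w+)\s+\1\b)");
    return std::regex_search(text, doubled);
}

// Lookahead: (?=.*\d) asserts that a digit occurs somewhere ahead
// without consuming input; \w{8,} then requires at least 8 word chars.
// Hypothetical helper name, for illustration only.
bool is_long_with_digit(const std::string& text) {
    static const std::regex rule(R"(^(?=.*\d)\w{8,}$)");
    return std::regex_search(text, rule);
}
```

With these definitions, has_doubled_word("the the cat") holds while
has_doubled_word("the then cat") does not, and
is_long_with_digit("abc12345") holds while "abcdefgh" is rejected.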


> CONCLUSION:
>    - PCRE is not fast enough

Yet these are the syntax and features that most developers would expect to find.


> QUESTION:
>    - is there a reason we shouldn't use the standard regex.h?

In addition to not implementing PCRE (which is what anyone would
expect to find in a regexp engine these days), regex.h is not
available in MSVC. MinGW does provide an implementation on Windows,
though.
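
For reference, a minimal sketch of what using regex.h looks like,
assuming a POSIX system (it will not build with MSVC, for the reason
above). The wrapper name posix_matches is hypothetical; regcomp,
regexec and regfree are the standard POSIX calls. Note that the syntax
is POSIX extended regular expressions, so there is no \d and no
lookahead.

```cpp
#include <cassert>
#include <regex.h>   // POSIX header; not provided by MSVC
#include <string>

// Returns true if `text` matches the POSIX extended regex `pattern`.
// Returns false when the pattern fails to compile.
// Hypothetical wrapper name, for illustration only.
bool posix_matches(const std::string& pattern, const std::string& text) {
    regex_t re;
    // REG_NOSUB: only a yes/no answer is needed, no capture offsets.
    if (regcomp(&re, pattern.c_str(), REG_EXTENDED | REG_NOSUB) != 0) {
        return false;
    }
    const int rc = regexec(&re, text.c_str(), 0, nullptr, 0);
    regfree(&re);  // always release the compiled pattern
    return rc == 0;
}
```

For example, posix_matches("^[a-z]+[0-9]+$", "abc123") holds, while
the same pattern does not match "abc".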

There are two other regexp implementations Qt has considered:

- Boost::Regex. Discarded because Boost does not keep ABI
compatibility, which is important for Qt (but I'd say it's not for
CMake)

- C++11 (std::regex). Not possible if support for old compilers is required.

Alex, have you tried those two?


-- 
Pau Garcia i Quiles
http://www.elpauer.org
(Due to my workload, I may need 10 days to answer)


