[CMake] 1 tricky question, 1 bug report

Sat Mar 14 10:26:34 EDT 2009

Maik Beckmann wrote on Saturday, 14 March 2009 at 00:02:

> However, a look at cmFortranLexer.cxx shows that BEGIN is defined as
>   #define BEGIN yyg->yy_start = 1 + 2 *
> which IMHO means that the lexer can only have one state at a time.  If
> str_dq or str_sq is set, the desired *_fmt state is lost.

A closer look at the source shows that it at least tries to restore the old
start state after the Fortran string literal is closed.  Unterminated strings
break this, because the rule for them jumps to INITIAL instead of restoring
the old state:
{{{
<str_dq,str_sq>\n {
  unput ('\n');
  BEGIN(INITIAL);
  return UNTERMINATED_STRING;
}
}}}
which can be fixed.
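
To illustrate the difference, here is a minimal sketch in plain C, outside of
flex.  The struct, the function names and the free_fmt/fixed_fmt state names
are assumptions made up for this example; they are not taken from
cmFortranLexer.  It only models "one active state plus one remembered state".

```c
#include <assert.h>

/* Hypothetical start states.  The names follow the discussion above,
   but this is plain C and not the generated lexer code. */
enum state { INITIAL, free_fmt, fixed_fmt, str_dq, str_sq };

struct lexer {
  enum state current;  /* the single active start state */
  enum state saved;    /* state to return to once the string ends */
};

/* Entering a string literal: remember the format state first. */
static void enter_string(struct lexer *lx, enum state quote)
{
  assert(quote == str_dq || quote == str_sq);
  lx->saved = lx->current;
  lx->current = quote;
}

/* Leaving a string literal, terminated or not: restore the format state. */
static void leave_string(struct lexer *lx)
{
  assert(lx->current == str_dq || lx->current == str_sq);
  lx->current = lx->saved;
}

/* Models what the quoted rule does on an unterminated string. */
static void leave_string_to_initial(struct lexer *lx)
{
  lx->current = INITIAL;
}
```

With leave_string_to_initial() a lexer that was in fixed_fmt ends up in
INITIAL after an unterminated string; with leave_string() it ends up in
fixed_fmt again.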

> Given that this is the case, lexer states are the wrong tool for the
> problem.

I'm not sure about my statement anymore.  I looked at a copy of "lex & yacc"
(O'Reilly), which contains a section about combining lexers using start
states.  It suggests putting some code that manages the states at the top of
the rules section, so that it runs each time yylex is called, like this:
{{{
..
%%
%{
 /* copied to the top of yylex(): runs each time yylex is entered */
 /* manage the start states here */
%}
...
%%
..
}}}

The book also mentions the disadvantage that only one lexer can be active at
a time.  This would only be a problem if one format is allowed to
$/#include the other format.  I have never tried that, but I will.  This is
the _first_ option.

The _second_ option proposed in the book is to have two distinct Fortran
lexers (and probably two parsers).  This is the cleanest solution, but IMHO
it would be a lot of engineering for this one special problem.

The _third_ option is to make the lexer format-agnostic and handle the
formats in the parser code.  This should work well, since we only have to
make sure that the valid MODULE and USE statements are caught.  Remember, so
far the free-format code processes fixed-format code properly most of the
time.

Best,
 -- Maik