[CMake] option bug ?

Mon Jul 12 11:14:14 EDT 2010

On 07/07/2010 09:44 AM, Michael Wild wrote:
> 
> On 7. Jul, 2010, at 9:32 , Michael Hertling wrote:
> 
>> On 07/03/2010 01:03 AM, Chris Hillery wrote:
>>> There's a slightly nicer work-around: Change project A's CMakeLists to set
>>> PROJB_OPENCV_LINK as a cache variable, i.e., SET(PROJB_OPENCV_LINK NO CACHE
>>> BOOL "doc"). I've tested it locally and it works the way you want it to.
>>>
>>> It seems that CMake divides the world of variables into two classes: cache
>>> variables and non-cache variables. Somewhat unfortunately, the same
>>> function, SET(), is used to specify values for both kinds, and cache
>>> variables "hide" any non-cache variables with the same name. The upshot is
>>> that the same SET() command will do different things depending on what's
>>> currently in the cache.
>>>
>>> Further confusion here comes from the fact that when a variable is declared
>>> as a cache variable (using either option() or set(...CACHE...) ), any
>>> current value that the non-cache variable with the same name has is
>>> discarded. So the first time you run cmake, PROJB_OPENCV_LINK isn't a cache
>>> variable until it gets to processing projb's CMakeLists.txt, hence the
>>> non-cache value you provided gets dropped. The second time, it's already a
>>> cache variable, so project A's CMakeLists actually sets the cache variable,
>>> and therefore projb's CMakeLists sees it as you expect.
>>>
>>> It's definitely confusing, but I'm not totally sure what the right solution
>>> is. It probably would have been cleaner if CMake made the distinction clear
>>> between cache and non-cache variables, but it's far too late to change that
>>> now. Maybe it would be possible to change it such that a cache variable
>>> declaration (option() or set(...CACHE...) ) would allow a current non-cache
>>> variable of the same name to override the declaration's default value, in
>>> the same way that -D on the command-line does.
>>
>> IMO, things aren't sooo bad. ;-)
>>
>> W.r.t. the value of a variable, CMake knows scopes and the cache. A new
>> scope is entered by ADD_SUBDIRECTORY() or a function's invocation. When
>> referring to a variable's value by the "${}" operator you get the value
>> from the current scope. At the start of a CMake run, the variables are
>> initialized with the values from the cache, provided the latter exists
>> and is appropriately populated. The SET() command - that is the actual
>> source of confusion along with OPTION() - basically has four flavours:
>>
>> (1) SET(VAR "xyz") sets the value of VAR in the current scope to "xyz",
>>    i.e. "${VAR}" yields "xyz" until the value of VAR is changed anew.
>> (2) SET(VAR "xyz" PARENT_SCOPE) sets the value of VAR in the parent's
>>    scope to "xyz", but doesn't affect the current scope or the cache.
>> (3) SET(VAR "xyz" CACHE STRING "..." FORCE) sets VAR's value in the
>>    current scope and in the cache to "xyz" regardless of whether
>>    there's already a cached value or VAR is defined in the current
>>    scope.
>> (4) SET(VAR "xyz" CACHE STRING "...") sets VAR's value in the cache
>>    to "xyz" unless there's already a cached value for VAR, and the
>>    latter's value in the current scope is set from the cache if
>>    (a) the SET() writes to the cache, or
>>    (b) VAR is undefined in the current scope, or
>>    (c) the type of VAR in the cache is UNINITIALIZED.
>>
>> While (4a,b) are quite reasonable, (4c) is somewhat strange as it
>> yields different results for apparently equivalent invocations:
>>
>> CMAKE_MINIMUM_REQUIRED(VERSION 2.8 FATAL_ERROR)
>> PROJECT(VARS NONE)
>> MESSAGE("VAR{1,2}[CACHE]: ${VAR1},${VAR2}")
>> SET(VAR1 "abc")
>> SET(VAR2 "abc")
>> MESSAGE("VAR{1,2}[LOCAL]: ${VAR1},${VAR2}")
>> UNSET(VAR2)
>> SET(VAR1 "xyz" CACHE STRING "")
>> SET(VAR2 "xyz" CACHE STRING "")
>> MESSAGE("VAR{1,2}[FINAL]: ${VAR1},${VAR2}")
>>
>> Cmaking from a clean build directory yields, as expected, (4a):
>>
>> VAR{1,2}[CACHE]: ,
>> VAR{1,2}[LOCAL]: abc,abc
>> VAR{1,2}[FINAL]: xyz,xyz
>>
>> Afterwards, "cmake -DVAR1:STRING=pqr -DVAR2:STRING=pqr ." yields:
>>
>> VAR{1,2}[CACHE]: pqr,pqr
>> VAR{1,2}[LOCAL]: abc,abc
>> VAR{1,2}[FINAL]: abc,pqr
>>
>> So, in the end VAR1 is not set from the cache, but VAR2 is, as it's
>> undefined in the current scope at that moment; this demonstrates (4b).
>>
>> Now, "cmake -DVAR1=pqr -DVAR2=pqr ." reveals (4c):
>>
>> VAR{1,2}[CACHE]: pqr,pqr
>> VAR{1,2}[LOCAL]: abc,abc
>> VAR{1,2}[FINAL]: pqr,pqr
>>
>> The parameter "-DVAR1=pqr", i.e. without a type, supplies the cache
>> with "VAR1:UNINITIALIZED=pqr" for VAR1, and the subsequent command
>> SET(VAR1 "xyz" CACHE STRING "") changes VAR1's type to STRING, but
>> does not touch the cached value; nevertheless, the cached value is
>> written to VAR1 in the current scope. I doubt that this behaviour
>> is really intended.
>>
>> To summarize: If none of (4a-c) holds, i.e. an already cached value
>> for VAR with a type other than UNINITIALIZED and VAR defined in the
>> current scope, SET(VAR "xyz" CACHE STRING "...") just does nothing.
>>
>> It is (4a-c) that causes the confusion regarding a variable's value
>> in the cache and in the current scope, and as OPTION(VAR "..." ON)
>> is, AFAIK, much the same as SET(VAR ON CACHE BOOL "..."), the
>> above-mentioned considerations apply accordingly. So, the rule of
>> thumb is to differentiate cleanly between variables to be used with
>> the cache and ordinary scoped variables, or in other words: If one
>> wants a variable to be set to a cached value, one shouldn't use it
>> beforehand, i.e.
>> saying SET(VAR "xyz" CACHE STRING "...") with VAR being undefined at
>> that moment behaves as desired: If there's already a cached value it
>> gets written to VAR, otherwise "xyz" is written to VAR and the cache
>> and, thus, serves as a fallback value. In short: Go the (4b) way. ;)
>>
>> Regards,
>>
>> Michael
>>
>> P.S.: There have been several discussions of this issue, and it
>>      has even been considered to introduce a related policy:
>>
>> <http://www.mail-archive.com/cmake@cmake.org/msg20930.html>
>> <http://www.mail-archive.com/cmake@cmake.org/msg21394.html>
>> <http://public.kitware.com/Bug/view.php?id=9008> - ongoing!
> 
> 
> The short rule from all this is:
> 
> *NEVER* call SET(VAR "xyz" CACHE ...) *AFTER* you called SET(VAR "abc").

Yes, this is the usually recommended procedure to avoid such problems
with the peculiarities of SET(), but it is also counterintuitive, IMO,
that SET(VAR "abc") has an impact on a later SET(VAR "xyz" CACHE ...).
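
As a minimal sketch of the ordering rule quoted above (the project name ORDER and the doc string are made up for illustration, and the version line simply mirrors the example earlier in this thread), the cache declaration comes first and the plain SET() second:

```cmake
CMAKE_MINIMUM_REQUIRED(VERSION 2.8)
PROJECT(ORDER NONE)

# Declare the cache entry first, while VAR is still undefined in this
# scope (the 4b case): ${VAR} then yields the cached value, or "xyz"
# on a first run from a clean build directory.
SET(VAR "xyz" CACHE STRING "Example cache entry")
MESSAGE("after cache declaration: ${VAR}")

# A plain SET() afterwards only shadows the cache entry in the current
# scope; the cached value itself is left alone.
SET(VAR "abc")
MESSAGE("after local override: ${VAR}")
```

Reversing the two SET() calls is what leads into the (4a-c) case distinction, since the scoped "abc" may or may not be replaced by the cached value.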

> Calling SET(VAR "abc") after SET(VAR "xyz" CACHE ...) on the other hand is just fine.

Personally, I'd prefer to strictly distinguish between "cached" and
"scoped" variables, i.e. when saying SET(VAR "xyz" CACHE ...), don't
say SET(VAR "abc") and vice versa, or in other words: SET(VAR "xyz")
means not being interested in a cached value of VAR at all - now and
later and for the entire project - and SET(VAR "xyz" CACHE ...) means
not to "misuse" VAR in the current scope as a writable local variable.
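
A small sketch of that separation, assuming a hypothetical cached setting MYPROJ_MODE and a hypothetical scoped helper variable _mode (neither name comes from the original report): the cached variable is declared once and only read, while all local manipulation happens under a different name.

```cmake
CMAKE_MINIMUM_REQUIRED(VERSION 2.8)
PROJECT(SEPARATE NONE)

# Cached, user-facing setting: declared once and never reassigned with
# a plain SET(). MYPROJ_MODE is a made-up name for illustration.
SET(MYPROJ_MODE "fast" CACHE STRING "Hypothetical user-selectable mode")

# Scoped working variable under a different name; it can be rewritten
# freely without ever shadowing the cache entry above.
STRING(TOUPPER "${MYPROJ_MODE}" _mode)
MESSAGE("cached: ${MYPROJ_MODE}, working copy: ${_mode}")
```

This relies on no enclosing scope having set MYPROJ_MODE with a plain SET() beforehand; otherwise "${MYPROJ_MODE}" might yield that scoped value instead of the cached one.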

If I understand correctly, the whole issue boils down to the fact that
a SET(<var> <val> CACHE <type> <doc>) sometimes writes to the current
scope and sometimes it doesn't, and changing this to make SET() always
write to the current scope would affect SET()'s backward compatibility,
cf. 9a77f65.

With this in mind, I have two spontaneous suggestions:

(1) Introduce a new option to SET(), say "FROM_CACHE", so the command
    SET(VAR "xyz" FROM_CACHE) unconditionally transfers VAR's cached
    value to the current scope; the "xyz" could serve as a fallback
    value if VAR is not yet cached, or alternatively VAR could become
    unset in that case, as with SET(VAR), with "xyz" simply ignored.
    Perhaps "FROM_CACHE" could also be provided for the likewise
    affected OPTION() command.
(2) Introduce a new dereference operator, say "@{}", similar to the
    well-known "${}" but working on the cache instead of the current
    scope. So, SET(VAR @{VAR}) also unconditionally transfers VAR's
    cached value to the current scope, and it may be sufficient to
    restrict "@{}" to the SET() command's realm. OPTION(), however,
    wouldn't benefit in any way.
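
Until something like this exists, the effect of suggestion (1) can probably be approximated with existing commands. The following is only a sketch: SET_FROM_CACHE is a hypothetical helper, not part of CMake, and it assumes that a STRING cache entry is wanted. It drops the scoped value first, so the subsequent SET(... CACHE ...) always ends up in the (4b) situation.

```cmake
CMAKE_MINIMUM_REQUIRED(VERSION 2.8)
PROJECT(FROMCACHE NONE)

# Hypothetical helper, not a CMake built-in. A MACRO is used so that
# UNSET() acts on the caller's scope instead of a function's own scope.
MACRO(SET_FROM_CACHE var fallback)
  # Drop any scoped value so that "${var}" falls through to the cache.
  UNSET(${var})
  # Flavour (4): the fallback is written only if nothing is cached yet.
  SET(${var} "${fallback}" CACHE STRING "Set via SET_FROM_CACHE")
ENDMACRO()

SET(VAR "abc")                 # scoped value that would otherwise win
SET_FROM_CACHE(VAR "xyz")
MESSAGE("VAR: ${VAR}")         # cached value if present, else "xyz"
```

Unlike the proposed FROM_CACHE option, this variant always creates a cache entry when none exists, so it covers only the "fallback" reading of (1).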

However, the documentation of SET() and OPTION() should be improved
with respect to SET(<var> <val> CACHE <type> <doc>) and the related
possibility of <var> not receiving a new value in some cases, and
besides, is there a reason for "cmake -DXYZ:STRING=xyz .." and
"cmake -DXYZ=xyz .." to behave differently?

Regards,

Michael