[CMake] providing library information, what's the cmake way

Sun Dec 5 06:48:49 EST 2010

On 12/01/2010 05:57 PM, Johannes Zarl wrote:
> On 12/01/2010 at 16:06, Michael Hertling <mhertling at online.de> wrote: 
>>> FIND_PACKAGE(XXX COMPONENTS YYY)
>>> ...
>>> ADD_SUBDIRECTORY(subdir)
>>> ...
>>> TARGET_LINK_LIBRARIES(AAA ${XXX_LIBRARIES})
>>> TARGET_LINK_LIBRARIES(BBB ${XXX_LIBRARIES} ${XXX_YYY_LIBRARIES})
>>>
>>> In subdir/CMakeLists.txt:
>>>
>>> FIND_PACKAGE(XXX COMPONENTS ZZZ)
>>> ...
>>> TARGET_LINK_LIBRARIES(subBBB ${XXX_LIBRARIES} ${XXX_ZZZ_LIBRARIES})
>>>
>>> As I mentioned above, I would expect XXX_LIBRARIES to contain only the
>>> base library, and find_package calls to act accumulatively.
>>
>> If I understand correctly, you'd like to have a set of variables like
>> XXX_*_{LIBRARIES,INCLUDE_DIRS,...} for each component instead of one
>> comprehensive set XXX_{LIBRARIES,INCLUDE_DIRS,...} for the entire
>> package including the components; this hasn't been obvious to me.
> 
> Correct.

>> The downside of your approach is that you have to mention component-
>> specific variables at various locations: For each component denoted
>> in the FIND_PACKAGE() invocation, you would usually have to mention
>>
>> - XXX_*_INCLUDE_DIRS in INCLUDE_DIRECTORIES()
>> - XXX_*_LIBRARIES in TARGET_LINK_LIBRARIES()
>> - XXX_*_DEFINITIONS in ADD_DEFINITIONS()
> 
> Which is less work than saving all three values for each target that
> I create.

It's not necessary to save these variables for each target. Typically,
it's only necessary if there's more than one target in the CMakeLists.txt
and two or more of them use the same package with a different set of
components; this does not hold for all cases by far. Besides, when
loading multiple packages for multiple targets, it's not uncommon to
collect libraries in target-specific <target>_LIBRARIES variables, just
as one collects a target's sources in <target>_SOURCES, so the
only remaining overhead is an additional FIND_PACKAGE() call, which
is inexpensive:

FIND_PACKAGE(A)
SET(T1_LIBRARIES ${A_LIBRARIES})
SET(T2_LIBRARIES ${A_LIBRARIES})

FIND_PACKAGE(B COMPONENTS B1)
LIST(APPEND T1_LIBRARIES ${B_LIBRARIES})
FIND_PACKAGE(B COMPONENTS B2)  # <-- That's all.
LIST(APPEND T2_LIBRARIES ${B_LIBRARIES})
...
TARGET_LINK_LIBRARIES(T1 ${T1_LIBRARIES})
TARGET_LINK_LIBRARIES(T2 ${T2_LIBRARIES})

A similar point might be applied to XXX_{INCLUDE_DIRS,DEFINITIONS}.
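
As a minimal sketch of that point, assuming that B_INCLUDE_DIRS and
B_DEFINITIONS are recomputed by each FIND_PACKAGE(B ...) call and that
B_DEFINITIONS holds plain compiler flags like -DFOO (both are
assumptions about the hypothetical package B):

FIND_PACKAGE(B COMPONENTS B1)
SET(T1_INCLUDE_DIRS ${B_INCLUDE_DIRS})
SET(T1_DEFINITIONS ${B_DEFINITIONS})
FIND_PACKAGE(B COMPONENTS B2)
SET(T2_INCLUDE_DIRS ${B_INCLUDE_DIRS})
SET(T2_DEFINITIONS ${B_DEFINITIONS})
# ADD_EXECUTABLE()/ADD_LIBRARY() for T1 and T2 go here.
# Include directories apply to the whole directory, so pass the union.
INCLUDE_DIRECTORIES(${T1_INCLUDE_DIRS} ${T2_INCLUDE_DIRS})
# COMPILE_FLAGS is a string property, so turn each list into a string.
STRING(REPLACE ";" " " T1_FLAGS "${T1_DEFINITIONS}")
STRING(REPLACE ";" " " T2_FLAGS "${T2_DEFINITIONS}")
SET_TARGET_PROPERTIES(T1 PROPERTIES COMPILE_FLAGS "${T1_FLAGS}")
SET_TARGET_PROPERTIES(T2 PROPERTIES COMPILE_FLAGS "${T2_FLAGS}")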

>> in addition to the package's base stuff, i.e. roughly speaking, with n
>> components you need to reference 3+3n variables. 
> 
> These 3+3n variables have to be defined, anyways (the find module needs 
> them internally, if it wants to compose XXX_INCLUDE_DIRS etc.).

Here, it's the user I have in mind, who possibly doesn't want to
specify each set of components requested from a package four times.

> Still, referencing the 3+3n variables are less work than having to
> define _and_ reference 3N variables (with N being the number of targets
> in your project). 

The need to save the results of a FIND_PACKAGE() call may come up for
targets within the same CMakeLists.txt file, but not unconditionally
for each of a project's targets. Additionally, this measure is taken
only if it's really necessary whereas your approach compels the user
to always refer to each component four times. Look at one of the
simplest configurations - one target and one package with two
components - the two approaches manifest like:

FIND_PACKAGE(XXX COMPONENTS YYY ZZZ)
ADD_DEFINITIONS(${XXX_DEFINITIONS}
    ${XXX_YYY_DEFINITIONS} ${XXX_ZZZ_DEFINITIONS})
INCLUDE_DIRECTORIES(${XXX_INCLUDE_DIRS}
    ${XXX_YYY_INCLUDE_DIRS} ${XXX_ZZZ_INCLUDE_DIRS})
ADD_EXECUTABLE(...)
TARGET_LINK_LIBRARIES(... ${XXX_LIBRARIES}
    ${XXX_YYY_LIBRARIES} ${XXX_ZZZ_LIBRARIES})

versus

FIND_PACKAGE(XXX COMPONENTS YYY ZZZ)
ADD_DEFINITIONS(${XXX_DEFINITIONS})
INCLUDE_DIRECTORIES(${XXX_INCLUDE_DIRS})
ADD_EXECUTABLE(...)
TARGET_LINK_LIBRARIES(... ${XXX_LIBRARIES})

It's quite obvious which approach means less work, and having only one
target in a CMakeLists.txt is not a configuration I would call rare.

>> IMO, this thwarts the
>> idea of a multi-component package: Specifying components to adjust the
>> well-known and officially recommended package-related variables like
>> XXX_LIBRARIES. 
> 
> These variables are well-known and officially recommended for component-
> less packages only. Nobody bothered to write recommendations for
> component-packages yet.

${CMAKE_ROOT}/Modules/readme.txt mentions the XXX_YYY_EXECUTABLE and
XXX_YY_{LIBRARY,INCLUDE_DIR} variables explicitly, and IMO, wordings
such as "YYY tool that comes with XXX" or "YY library that is part of
the XXX system" allow the general terms "tool" and "part" to be applied
smoothly to the components of a package. Furthermore, the component-
specific variables like XXX_FIND_COMPONENTS are handled also, so I
do believe that this file is relevant for multi-component packages,
too, including the XXX_{LIBRARIES,INCLUDE_DIRS} variables. However,
variables like XXX_YYY_{LIBRARIES,INCLUDE_DIRS} are not mentioned
at all, i.e. they're completely new and should be well founded.

Anyway, the ${CMAKE_ROOT}/Modules/readme.txt doesn't handle multi-
component packages thoroughly, so I would greatly appreciate any
improvements to decrease ambiguities/uncertainties and to reach a
consensus. For this reason, I regret that we haven't seen further
opinions on this topic, in particular because pretty much every
non-trivial package could make good use of a components-aware
find module or config file.

>>> Regarding FIND_PACKAGE_HANDLE_STANDARD_ARGS: why not simply adding
>>> another command? In fact, let's discuss its interface right now, and
>>> then implement it:
>>>
>>> set(XXX_COMPONENTS "YYY;ZZZ")
>>> FIND_PACKAGE_COMPONENTS_HANDLE_STANDARD_ARGS(
>>>    XXX
>>>    XXX_COMPONENTS
>>>    DEFAULT_MSG
>>>    DEFAULT_SUFFIXES
>>> )
>>>
>>> Assuming DEFAULT_SUFFIXES to be "LIBRARIES;INCLUDE_DIRS;DEFINITIONS",
>>> this would check the following variables:
>>> XXX_LIBRARIES
>>> XXX_INCLUDE_DIRS
>>> XXX_DEFINITIONS
>>> XXX_YYY_LIBRARIES
>>> XXX_YYY_INCLUDE_DIRS
>>> XXX_YYY_DEFINITIONS
>>> XXX_ZZZ_LIBRARIES
>>> XXX_ZZZ_INCLUDE_DIRS
>>> XXX_ZZZ_DEFINITIONS
>>>
>>> Let's say that component YYY has been found, but ZZZ is not, so the
>>> XXX_YYY_* variables are set, as well as the XXX_{LIBRARIES,INCLUDE_DIRS,
>>> DEFINITIONS} variables.
>>>
>>> So regardless of the REQUIRED flag you end up with these variable-values:
>>> XXX_FOUND = TRUE
>>> XXX_YYY_FOUND = TRUE
>>> XXX_ZZZ_FOUND = FALSE (or XXX_ZZZ-NOTFOUND, whatever you like most)
>>>
>>> A fatal error is raised, if REQUIRED is true and XXX_FIND_REQUIRED_ZZZ
>>> is true:
>>>
>>> find_package(XXX REQUIRED YYY)
>>>  -> OK, user can trust that XXX_FOUND and XXX_YYY_FOUND are true.
>>>
>>> find_package(XXX COMPONENTS ZZZ)
>>>  -> OK, user has to check XXX_FOUND and XXX_ZZZ_FOUND
>>> find_package(XXX REQUIRED ZZZ)
>>>  -> FATAL_ERROR, cmake will abort, so XXX_FOUND=true doesn't matter.
>>>
>>> Is this an interface that you could use? Any criticism?
>>
>> What do you do if an unusually named variable is involved to indicate a
>> component's presence/absence? 
> 
> Can you give an example for this?

Grepping FPHSA in the modules' directory reveals how many different
variable names are used to check a package's availability. Moreover,
suppose YYY is an executable having XXX_YYY_EXECUTABLE only, and ZZZ
is a library with XXX_ZZZ_{LIBRARY,INCLUDE_DIR}. What do you mention
in DEFAULT_SUFFIXES? With "EXECUTABLE;LIBRARY;INCLUDE_DIR", YYY will
lack the library and ZZZ will lack the executable, so your approach
with the Cartesian product of components and suffixes to capture all
variables' names will fail. Instead, you must mention the variables
for each component explicitly, and FPCHSA can't have a signature as
simple as presented above.
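
A sketch of what I mean, with hypothetical file names (yyy, xxx_zzz,
xxx/zzz.h) and FPHSA applied once per component; the variable lists
differ per component, which is what a single DEFAULT_SUFFIXES list
cannot express:

INCLUDE(FindPackageHandleStandardArgs)

# YYY is a tool: only an executable to look for.
FIND_PROGRAM(XXX_YYY_EXECUTABLE NAMES yyy)
FIND_PACKAGE_HANDLE_STANDARD_ARGS(XXX_YYY DEFAULT_MSG
    XXX_YYY_EXECUTABLE)

# ZZZ is a library: a library file and a header directory.
FIND_LIBRARY(XXX_ZZZ_LIBRARY NAMES xxx_zzz)
FIND_PATH(XXX_ZZZ_INCLUDE_DIR NAMES xxx/zzz.h)
FIND_PACKAGE_HANDLE_STANDARD_ARGS(XXX_ZZZ DEFAULT_MSG
    XXX_ZZZ_LIBRARY XXX_ZZZ_INCLUDE_DIR)

Afterwards, XXX_YYY_FOUND and XXX_ZZZ_FOUND each reflect only the
variables that are meaningful for the respective component.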

>> IMO, instead of developing a completely new function, one should re-use
>> established code as far as possible, and the measure necessary to apply
>> FPHSA to components is just to promote the "standard args" while taking
>> account of the above-mentioned point:
>>
>> # Handle explicitly requested component YYY:
>> # Set up (or not) XXX_YYY_LIBRARY, XXX_YYY_INCLUDE_DIR, ...
>> SET(XXX_YYY_FIND_REQUIRED ${XXX_FIND_REQUIRED})
>> SET(XXX_YYY_FIND_QUIETLY ${XXX_FIND_QUIETLY})
>> FPHSA(XXX_YYY DEFAULT_MSG XXX_YYY_LIBRARY XXX_YYY_INCLUDE_DIR ...)
>>
>> # Handle non-requested component ZZZ:
>> # Set up (or not) XXX_ZZZ_LIBRARY, XXX_ZZZ_INCLUDE_DIR, ...
>> SET(XXX_ZZZ_FIND_REQUIRED FALSE)
>> SET(XXX_ZZZ_FIND_QUIETLY TRUE)
>> FPHSA(XXX_ZZZ DEFAULT_MSG XXX_ZZZ_LIBRARY XXX_ZZZ_INCLUDE_DIR ...)
> 
> The problem is that FPHSA simply wasn't created with components in mind.
> As worthwhile as code-reuse is, it's not a goal in itself. Rather than
> jump through hoops in order to reuse the existing interface, IMO one 
> should create a good interface for the more complex use-case at hand.

To apply FPHSA to a component and, thus, re-use existing code, I just
need to define two variables; this is not "jumping through hoops".

> Back to my proposed FPCHSA: My initial goal was to provide an interface
> which does as much work as possible for you, maybe at the price of
> restricted variable naming. So let's come up with a better interface:
> 
> I do want to restrict the possible prefix for modules, because I really
> do think it is a good practice (and a practice worth enforcing) to
> require a common package prefix and one prefix for each component):
> 
> FPCHSA( XXX DEFAULT_MSG LIBRARY INCLUDE_DIR DEFINITIONS
>   COMPONENT YYY DEFAULT_MSG LIBRARY INCLUDE_DIR DEFINITIONS
>   COMPONENT ZZZ DEFAULT_MSG INCLUDE_DIR OTHER_VAR )
> 
> This is far noisier than the initial approach, but still much easier to
> read and more compact to write than the FPHSA approach outlined above
> (3 lines vs. 7 lines). Even without the namespace-enforcement it would 
> be readable: 
> 
> FPCHSA( XXX DEFAULT_MSG XXX_LIBRARY XXX_INCLUDE_DIR XXX_DEFINITIONS
>   COMPONENT YYY DEFAULT_MSG XXX_YYY_LIBRARY XXX_YYY_INCLUDE_DIR XXX_YYY_DEFINITIONS
>   COMPONENT ZZZ DEFAULT_MSG XXX_ZZZ_INCLUDE_DIR XXX_ZZZ_OTHER_VAR )
> 
> This signature should even be downwards-compatible with FPHSA.

Suppose you have components YYY and ZZZ with ZZZ depending on YYY. When
handling ZZZ in the find module, there's a certain probability that you
must access the YYY stuff, so one would like to have the XXX_YYY_FOUND
variable already defined at this time. With the FPCHSA, the components'
FOUND variables are defined in one go at the module's end; hence, one
has to check YYY's availability by oneself when it's needed for ZZZ,
and FPCHSA will do the same work once more later. Instead, IMO, it's
convenient to check a component's presence right after the relevant
variables have received their values, and for this purpose, FPHSA
is entirely sufficient.
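
Roughly like this, as a sketch with hypothetical library and header
names, assuming both components have been requested and ZZZ is
useless without YYY:

INCLUDE(FindPackageHandleStandardArgs)

# YYY first; its FOUND variable is defined right away.
FIND_LIBRARY(XXX_YYY_LIBRARY NAMES xxx_yyy)
FIND_PATH(XXX_YYY_INCLUDE_DIR NAMES xxx/yyy.h)
SET(XXX_YYY_FIND_REQUIRED ${XXX_FIND_REQUIRED})
SET(XXX_YYY_FIND_QUIETLY ${XXX_FIND_QUIETLY})
FIND_PACKAGE_HANDLE_STANDARD_ARGS(XXX_YYY DEFAULT_MSG
    XXX_YYY_LIBRARY XXX_YYY_INCLUDE_DIR)

# ZZZ needs YYY, so look for it only if YYY is present.
IF(XXX_YYY_FOUND)
    FIND_LIBRARY(XXX_ZZZ_LIBRARY NAMES xxx_zzz)
    FIND_PATH(XXX_ZZZ_INCLUDE_DIR NAMES xxx/zzz.h)
ENDIF()
SET(XXX_ZZZ_FIND_REQUIRED ${XXX_FIND_REQUIRED})
SET(XXX_ZZZ_FIND_QUIETLY ${XXX_FIND_QUIETLY})
FIND_PACKAGE_HANDLE_STANDARD_ARGS(XXX_ZZZ DEFAULT_MSG
    XXX_ZZZ_LIBRARY XXX_ZZZ_INCLUDE_DIR)

No availability check is done twice here: XXX_YYY_FOUND is already
defined when ZZZ is handled.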

>>>>>> - Interpretation of XXX_FOUND: Config files can't set the value of this
>>>>>> variable, but find modules can, so one should think about what it means
>>>>>> for XXX_FOUND if a component - requested or not - hasn't been found.
>>>>>
>>>>> As I have written above, I don't think that the components should alter
>>>>> the values of the package-wide variables (XXX_LIBRARIES etc.). The same
>>>>> applies to the XXX_FOUND variable. If you search for library XXX and
>>>>> component YYY, XXX is still found even if it lacks the requested
>>>>> component. If you want to know if XXX was found, you use XXX_FOUND,
>>>>> if you want the same for component YYY, you use XXX_YYY_FOUND.
>>>>
>>>> While this interpretation of XXX_FOUND is absolutely right, IMO, the
>>>> intention of XXX_LIBRARIES etc. is different as I said before: These
>>>> variables should comprehend all stuff necessary to use all requested
>>>> components. E.g., suppose YYY needs zlib; where are you going to put
>>>> ZLIB_LIBRARY from the FIND_PACKAGE(ZLIB) invocation? XXX_YYY_LIBRARY
>>>> is reserved for libXXX_YYY.so or the like only. It's XXX_LIBRARIES
>>>> that takes XXX_YYY_LIBRARY along with ZLIB_LIBRARY and the other
>>>> components' libraries and prerequisites, so
>>>
>>> Just like XXX_LIBRARIES for package XXX, you should have XXX_YYY_LIBRARIES
>>> for component YYY. XXX_YYY_LIBRARIES contains both XXX_YYY_LIBRARY and
>>> ZLIB_LIBRARIES. A user of the package XXX should never directly use
>>> XXX_YYY_LIBRARY, just like he or she should never use XXX_LIBRARY.
>>
>> You could handle the components' prerequisites in this manner, but how
>> do you cope with inter-component dependencies? Suppose ZZZ depends on
>> YYY and the user says:
>>
>> FIND_PACKAGE(XXX COMPONENTS ZZZ)
>> TARGET_LINK_LIBRARIES(... ${XXX_ZZZ_LIBRARIES})
>>
>> Where do you put the YYY variables? Do you add them to the ZZZ ones? 
> 
> I'd add them to the ZZZ ones, because they are needed to use ZZZ. In 
> contrast, I don't add them to XXX, because XXX can be used just fine
> without ZZZ.
> 
> My approach just has one rule: "If c1 needs c2, add c2 to c1."
> 
> Thus, inter-package dependencies and inter-component dependencies work
> exactly the same way.
> 
>> If
>> so, what do you do in the case of FIND_PACKAGE(XXX COMPONENTS YYY ZZZ)?
>> Do you still add the YYY variables to ZZZ's ones, or are you willing to
>> differentiate between these two situations? 
> 
> Yes, I still add them. Adding any "magic" at this point would rightly
> confuse any user. These two are fully equivalent:
> 
> find_package( XXX COMPONENTS YYY ZZZ)
> target_link_libraries(foo ${YYY_LIBRARIES})
> target_link_libraries(bar ${ZZZ_LIBRARIES})
> 
> and:
> 
> find_package( XXX COMPONENTS YYY)
> target_link_libraries(foo ${YYY_LIBRARIES})
> find_package( XXX COMPONENTS ZZZ)
> target_link_libraries(bar ${ZZZ_LIBRARIES})
> 
> At this point, I just realised that I was being inconsistent. After all, 
> the components do depend on the whole package. So YYY_LIBRARIES should
> also contain XXX_LIBRARIES.

If I understand correctly, you intend to populate XXX_YYY_LIBRARIES,
e.g., with its own value XXX_YYY_LIBRARY *and all* its prerequisite
components' values, too? If so, you will end up with many repeatedly
mentioned libraries on the link command line unless there're hardly
any inter-component dependencies. Suppose ZZZ depends on YYY, so

FIND_PACKAGE(XXX COMPONENTS YYY ZZZ)
...
TARGET_LINK_LIBRARIES(...
    ${XXX_LIBRARIES} ${XXX_YYY_LIBRARIES} ${XXX_ZZZ_LIBRARIES}
)

will mention libXXX_ZZZ once, libXXX_YYY twice and libXXX three times,
even if you remove duplicates. A third component depending on ZZZ has
four libraries, a fourth will have five etc. I.e., in an inauspicious
configuration, the number of libraries in XXX_[*_]LIBRARIES tends to
grow by O(n^2) w.r.t. the number n of components - for each package.
The penalty of unnecessarily repeated libraries in the link command
is measurable at least, and I'm not sure it won't become perceptible
for larger projects. So, I'd prefer a carefully set up comprehensive
XXX_LIBRARIES variable for the whole package including its components.
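
For illustration, a sketch of how a find module might set up such a
variable so that each library appears exactly once. The _XXX_* helper
variables and the dependency table are hypothetical, the per-component
XXX_*_LIBRARY and XXX_*_FOUND variables are assumed to be set earlier
in the module, and only one level of dependencies is resolved:

# Known components in link order: dependents before prerequisites.
SET(_XXX_ORDER ZZZ YYY)
# Hypothetical dependency table: ZZZ needs YYY.
SET(_XXX_ZZZ_DEPENDS YYY)

# Requested components plus their direct prerequisites.
SET(_XXX_WANTED ${XXX_FIND_COMPONENTS})
FOREACH(c ${XXX_FIND_COMPONENTS})
    LIST(APPEND _XXX_WANTED ${_XXX_${c}_DEPENDS})
ENDFOREACH()

# Walk the fixed order, so every library is added at most once.
SET(XXX_LIBRARIES)
FOREACH(c ${_XXX_ORDER})
    LIST(FIND _XXX_WANTED ${c} _idx)
    IF(NOT _idx EQUAL -1 AND XXX_${c}_FOUND)
        LIST(APPEND XXX_LIBRARIES ${XXX_${c}_LIBRARY})
    ENDIF()
ENDFOREACH()
# The base library comes last since everything depends on it.
LIST(APPEND XXX_LIBRARIES ${XXX_LIBRARY})

With this, FIND_PACKAGE(XXX COMPONENTS ZZZ) and FIND_PACKAGE(XXX
COMPONENTS YYY ZZZ) both yield libXXX_ZZZ, libXXX_YYY and libXXX once
each, provided both components are found.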

>> Is this still suitable with
>> numerous inter-component dependencies? With single XXX_{LIBRARIES,...}
>> variables destined for all components, there are no issues like those.
> 
> You mean if the user doesn't know that ZZZ depends on YYY, he might
> add both to the link libraries, therefore adding unneeded stuff to
> the linker command? I see this as less of a problem than the overlinking
> issue in the other case.

In general, from a user's perspective, I don't want to wonder if ZZZ
depends on YYY so I could possibly drop the latter from the list of
components although I need to use it. Furthermore, while the effect
of unnecessarily mentioned libraries during link phases is mostly
negligible, I'd bear it in mind. Finally, overlinking due to any
overstuffed XXX_LIBRARIES variable can be easily avoided by an
additional FIND_PACKAGE() call which is cheap to have.

In order not to be misunderstood: I see your intention, and I see the
advantages you want to achieve, but I think the disadvantages prevail:

- More work for the user due to component-specific sets of variables.
- Unnecessary repetition of libraries on the link command line.
- Inconvenient centralization of the FOUND variables' setup.

In addition, when looking at FPHSA's current implementation, I have doubts
whether a component-related extension should be added to this somewhat
non-trivial function, in particular since it's already applicable to
components, but of course, feel free to specify and implement an
improved version and bring it up for discussion.

Regards,

Michael

PS: Could you prevent your signature from sometimes appearing midway
    through your posts? This confuses my mail client when replying.