[CMake] Interest in adding to CMake the CMakeGet module to get dependencies

Craig Scott craig.scott at crascit.com
Fri Aug 4 22:18:27 EDT 2017


TLDR: There may be other work already covering similar capabilities (e.g.
hunter and WIP that I'm uncloaking here). This response is mostly about
providing broader discussion.


Paul, it seems you've been busy since you first asked about cget
<https://cmake.org/pipermail/cmake/2016-March/062994.html> on this list
last year. ;)

In the interests of disclosure, this latest email of yours has prompted me
to uncloak some of my own work which may be relevant. This work has been in
development and testing on real world projects for the past 2 years or so.
Without wanting to hijack your thread, let me give a little info about it
and then see my responses to your comments at the end for how/why I think
this may be relevant to your request for comments.

ExternalProject can be used in a different way to allow it to perform
downloads at configure time rather than build time. You can see a
relatively straightforward implementation of this here:

    https://github.com/Crascit/DownloadProject

This has the advantage of leveraging all of the various download methods
ExternalProject already provides, meaning they are already fairly widely
used and tested. Any improvements to ExternalProject also come for free and
documentation for the various methods is also provided (I recently
overhauled the ExternalProject docs and these will appear in the 3.10.0
release, but they can be viewed in the master docs
<https://cmake.org/cmake/help/git-master/module/ExternalProject.html> in
the meantime). ExternalProject also has the advantage that it supports
updating after the initial download in the case of repository sources like
git, svn, etc.
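
As a rough illustration of the configure-time technique (a minimal sketch,
not the actual DownloadProject code), the main project can write out a tiny
helper project containing an ExternalProject_Add() call with its configure,
build, install and test steps disabled, then configure and build that helper
with execute_process() while the main project is still configuring. The
repository URL, tag and directory names below are placeholders.

```cmake
cmake_minimum_required(VERSION 3.5)
project(ConfigureTimeDownload NONE)

# Placeholder locations; any directories under the build tree would do.
set(helper_dir "${CMAKE_BINARY_DIR}/dep-download")
set(dep_src    "${CMAKE_BINARY_DIR}/dep-src")

# Generate a helper project whose only job is to fetch the sources.
file(WRITE "${helper_dir}/CMakeLists.txt" "
cmake_minimum_required(VERSION 3.5)
project(dep-download NONE)
include(ExternalProject)
ExternalProject_Add(dep
    GIT_REPOSITORY    https://example.com/some/dep.git
    GIT_TAG           master
    SOURCE_DIR        \"${dep_src}\"
    CONFIGURE_COMMAND \"\"
    BUILD_COMMAND     \"\"
    INSTALL_COMMAND   \"\"
    TEST_COMMAND      \"\"
)
")

# Configure, then build, the helper project right now (configure time).
execute_process(
    COMMAND "${CMAKE_COMMAND}" -G "${CMAKE_GENERATOR}" .
    WORKING_DIRECTORY "${helper_dir}"
    RESULT_VARIABLE result)
if(result)
  message(FATAL_ERROR "Configuring the download helper failed: ${result}")
endif()

execute_process(
    COMMAND "${CMAKE_COMMAND}" --build .
    WORKING_DIRECTORY "${helper_dir}"
    RESULT_VARIABLE result)
if(result)
  message(FATAL_ERROR "Downloading the dependency failed: ${result}")
endif()
```

Because the helper is an ordinary ExternalProject, re-running CMake re-runs
the helper build, which is where the update behaviour for repository sources
comes from.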

The uncloaking part of my response is that I have been working on a full
dependency download framework over the past year and a half which is built on
top of the above DownloadProject implementation. It supports hierarchical
dependencies across complex project structures and makes it very easy for
projects to pull in other projects, including support for higher level
projects being able to override dependency details of projects lower in the
dependency tree if they want to. I've been incubating this privately in a
real world company environment to iron out the kinks and ensure the
interface that projects interact with supports the right set of features
and that less knowledgeable users can understand it, etc. I'm planning on
putting both DownloadProject and the dependency framework up for review to
merge into CMake within the next few months, all going well. Whereas your
work seems to focus on building and/or re-using an installed/binary result,
my work focuses more on making external projects part of your main build.
This has advantages like using the same compiler settings, toolchain
details, etc. and all of the dependency targets are automatically available
to the rest of the project if the dependency uses CMake as its build
system. I'll postpone further details on it until it is ready for review,
but that should be enough for context for my responses to your comments
further below.
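
To make the "part of your main build" idea concrete, here is a minimal
sketch. It assumes the dependency's sources have already been downloaded at
configure time to a directory I'm calling dep_src, that the dependency uses
CMake, and that it defines a library target named dep. All of those names,
and app/main.cpp, are placeholders.

```cmake
# Assumed location of the already-downloaded dependency sources.
set(dep_src "${CMAKE_BINARY_DIR}/dep-src")

# A source directory outside the current source tree needs an explicit
# binary directory as the second argument.
add_subdirectory("${dep_src}" "${CMAKE_BINARY_DIR}/dep-build")

# The dependency's targets are now ordinary targets of this build, so they
# are compiled with the same toolchain and settings as everything else.
add_executable(app main.cpp)
target_link_libraries(app PRIVATE dep)
```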

There's also hunter <https://github.com/ruslo/hunter>, which you've already
been made aware of and which is closer to your proposal in terms of the set
of problems it tries to solve. Hunter has been around for a while, is
reasonably well known and on the face of it, already seems to do all the
things cmake-get is trying to do. It is also based on ExternalProject. It
may be helpful if you could show how your work differs from what hunter
already provides so that its value to the CMake community is clear.

Hopefully that's not too overwhelming for background/context. See my
responses in the remainder below.



On Sat, Aug 5, 2017 at 5:57 AM, paul via CMake <cmake at cmake.org> wrote:

> Hi,
>
> I have a written a cmake module to get dependencies using the cget protocol
> here:
>
> https://github.com/pfultz2/cmake-get
>
> This is different than `ExternelProject`:
>
> * `ExternelProject` happens only during build, which allows this module to
> work in both in config and script mode.
>

As detailed above, you can use ExternalProject at configure time too and
there's already a relatively simple, generic implementation available
showing how to do it for the download case.



> * In config mode, the user can just do:
>
> cmake_get(<pkg> PREFIX ${CMAKE_CURRENT_BINARY_DIR}/deps)
> find_package(<pkg> HINTS ${CMAKE_CURRENT_BINARY_DIR}/deps)
>
> Since the dependency is available at config time.
>

This is a little naive. What if there is a complex hierarchy of projects
and more than one wants to use a particular dependency, but they both want
to specify different versions or repository details? How would that be
resolved? The "get" functionality looks quite limited, both in terms of the
supported methods and the ability to specify method-specific details.
ExternalProject already has pretty good coverage for this. In the case of
multiple projects wanting different dependency details, you need to decide
how to choose one project's details over another, or if you want to allow
both to be downloaded/built then how to prevent them from resulting in
inconsistencies or clashes (e.g. if there's "A links to B", you really
don't want to have to ask "which B?").
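
One possible way to resolve such conflicts, shown purely as an illustration
and not as a description of how any of the tools mentioned here work, is a
"first declaration wins" policy: the first project to declare details for a
dependency (normally the one highest in the hierarchy) has its details
recorded and later declarations are ignored. The function name
declare_dependency and the DEP_<name>_DETAILS property are hypothetical.

```cmake
# Hypothetical helper: record the details for a dependency the first time
# it is declared and silently ignore any later declarations.
function(declare_dependency name)
  get_property(already_declared GLOBAL PROPERTY "DEP_${name}_DETAILS" SET)
  if(already_declared)
    return()
  endif()
  set_property(GLOBAL PROPERTY "DEP_${name}_DETAILS" "${ARGN}")
endfunction()

# Top level project declares B first, so its details are kept.
declare_dependency(B GIT_REPOSITORY https://example.com/B.git GIT_TAG v2.0)

# A project lower in the tree asks for something else; this is ignored.
declare_dependency(B GIT_REPOSITORY https://example.com/B.git GIT_TAG v1.0)

get_property(details GLOBAL PROPERTY DEP_B_DETAILS)
message(STATUS "Details used for B: ${details}")
```

A global property is used rather than a variable so that the record is
visible regardless of which directory scope makes the declaration.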

The find_package() call would also seem to rely on the dependent project
implementing install(EXPORT) rules for all the things you want to use from
it, or at least providing <pkg>Config.cmake files or their equivalent to
make targets, etc. available. Many projects will do this, but quite a few
won't. You might be able to use find_library(), find_program() and
find_file() in such cases, but these all require the dependency to have
already been built. You generally don't want things building during the
config stage if you can avoid it, as a slow config stage is pretty annoying
as a developer. If I'm understanding your implementation correctly, it
builds the dependency if it hasn't previously, which may make the config
stage very slow when it can't use a cached build from earlier. I believe
this is more or less what hunter does too (Ruslo, feel free to
clarify/correct/call-me-nuts here!). Compare this with the approach my
framework is taking where the build is still deferred to the build stage,
only the download happens at config time. BTW, if you really need the
external project built at config time, this stackoverflow question and
answer <https://stackoverflow.com/q/36084785/1938798> may also be of
interest (mentioned more for the benefit of others than for this discussion
around cmake-get).
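
For reference, the fallback described above might look something like the
following sketch. The package name Foo, the library name foo, the header
foo.h and the imported target name Foo::foo are all placeholders, and I've
used find_path() for the header directory since that is the usual choice for
include paths. Note that the fallback branch only works if the dependency
has already been built and installed under the prefix.

```cmake
set(deps_prefix "${CMAKE_CURRENT_BINARY_DIR}/deps")

# Preferred: the dependency ships a FooConfig.cmake and exports its targets.
find_package(Foo CONFIG QUIET HINTS "${deps_prefix}")

if(NOT Foo_FOUND)
  # Fallback: locate the already-built artifacts by hand.
  find_library(FOO_LIBRARY NAMES foo HINTS "${deps_prefix}/lib")
  find_path(FOO_INCLUDE_DIR NAMES foo.h HINTS "${deps_prefix}/include")

  if(FOO_LIBRARY AND FOO_INCLUDE_DIR)
    # Wrap the results in an imported target so consumers can link to it
    # the same way they would link to an exported target.
    add_library(Foo::foo UNKNOWN IMPORTED)
    set_target_properties(Foo::foo PROPERTIES
        IMPORTED_LOCATION "${FOO_LIBRARY}"
        INTERFACE_INCLUDE_DIRECTORIES "${FOO_INCLUDE_DIR}")
  endif()
endif()
```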



> * A script can be easily written to download the dependencies, which is much
> better approach to deal with toolchain transitivity, which is a problem that
> exists with both `ExternelProject` and config mode.
>

You may want to consider separating out the downloading from the building.
What I've found is that there are situations where all you want is to
download some content at config time but perhaps not build it. Potentially
you could just download and unpack a pre-built tarball directly, so there
may be no need to build at all. Conversely, if you download the sources and
make the external dependency part of your main project's build, then the
whole toolchain, compiler settings, etc. problem largely goes away (if the
dependency project uses CMake and doesn't do things which assume it is the
top level of the build).
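
As a minimal sketch of the download-only case, a pre-built tarball can be
fetched and unpacked at configure time using nothing but built-in commands.
The URL and directory names are placeholders.

```cmake
set(archive    "${CMAKE_BINARY_DIR}/dep-prebuilt.tar.gz")
set(unpack_dir "${CMAKE_BINARY_DIR}/dep-prebuilt")

# Download the archive; STATUS receives a list of (code, message).
file(DOWNLOAD "https://example.com/dep-prebuilt.tar.gz" "${archive}"
     STATUS download_status)
list(GET download_status 0 status_code)
if(NOT status_code EQUAL 0)
  message(FATAL_ERROR "Download failed: ${download_status}")
endif()

# Unpack with CMake's own tar implementation, so no external tool is needed.
file(MAKE_DIRECTORY "${unpack_dir}")
execute_process(
    COMMAND "${CMAKE_COMMAND}" -E tar xzf "${archive}"
    WORKING_DIRECTORY "${unpack_dir}"
    RESULT_VARIABLE result)
if(result)
  message(FATAL_ERROR "Unpacking failed: ${result}")
endif()
```

In real use you would probably also pass EXPECTED_HASH to file(DOWNLOAD) so
that a corrupted or tampered archive is rejected.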

There's also the question of which scripting language to use if you script
something. For cross-platform projects, unix shell scripts aren't always
supported. Cross-platform languages like python come with their own
dependencies (i.e. the python interpreter). You may find that you end up
using cmake as your scripting language, since that's about the only common
one that will work everywhere without requiring any further dependencies.
Now you may find that the commands you would have put in such a cmake
script could have been made part of the main project to begin with (or made
available as a module). Again, this starts pointing back to building the
project as part of the main build, especially when the toolchain and
compiler flag consistency advantages are taken into consideration.
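
To illustrate using cmake itself as the scripting language, here is a small
sketch of a script that could be run with "cmake -P". The file name
get_deps.cmake, the DEPS_DIR variable and the repository URL are all
placeholders.

```cmake
# get_deps.cmake (hypothetical file name)
# Run with:  cmake -DDEPS_DIR=/some/dir -P get_deps.cmake
# Note that -D options must come before -P on the command line.

if(NOT DEFINED DEPS_DIR)
  set(DEPS_DIR "${CMAKE_CURRENT_LIST_DIR}/deps")
endif()

find_program(GIT_EXECUTABLE NAMES git)
if(NOT GIT_EXECUTABLE)
  message(FATAL_ERROR "git was not found on the PATH")
endif()

# Only clone if a previous run has not already done so.
if(NOT EXISTS "${DEPS_DIR}/dep/.git")
  file(MAKE_DIRECTORY "${DEPS_DIR}")
  execute_process(
      COMMAND "${GIT_EXECUTABLE}" clone https://example.com/some/dep.git dep
      WORKING_DIRECTORY "${DEPS_DIR}"
      RESULT_VARIABLE result)
  if(result)
    message(FATAL_ERROR "git clone failed: ${result}")
  endif()
endif()
```

Nothing in the script is specific to script mode, which is the point being
made above: the same commands could just as easily live in the main project
or in a module.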



> * Recipes can be reused for building projects that don't use cmake, or don't
> follow the standard cmake installation flow. Furthermore, an existing cget
> recipe can be used as well.
>

This can be very useful for publicly accessible projects, especially for
popular ones. I see that you've already got a collection of recipes for a
variety of projects. I haven't looked deeply at your implementation, but I
take it that it would support users working with private repos using their
own private recipe stores too? I believe hunter offers similar
functionality to this as well, but I could be wrong. Ruslo?

Even for the approach I'm working on, non-CMake projects can be handled
fairly easily with existing ExternalProject functionality. If I need files
from the dependency at config time (e.g. headers so I can query for version
details), I can use DownloadProject to download the source at configure
time and then point ExternalProject at the downloaded source rather than
specifying a download method and have ExternalProject trigger the relevant
build commands (e.g. a Makefile-based project). Again, this leverages
functionality already provided by CMake rather than reinventing the wheel by
formulating my own logic to manage external build steps.
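
A minimal sketch of that arrangement, assuming the sources were already
downloaded at configure time to a directory I'm calling dep_src and that the
dependency builds with a plain "make" in its source directory (both are
assumptions for illustration):

```cmake
include(ExternalProject)

# Assumed location of the sources downloaded earlier at configure time.
set(dep_src "${CMAKE_BINARY_DIR}/dep-src")

# No download method is given, so ExternalProject uses the existing,
# non-empty SOURCE_DIR as-is and only drives the build step.
ExternalProject_Add(dep_build
    SOURCE_DIR        "${dep_src}"
    CONFIGURE_COMMAND ""
    BUILD_COMMAND     make
    BUILD_IN_SOURCE   1
    INSTALL_COMMAND   ""
)
```

The build itself still happens at build time, so the configure stage only
pays for the download.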



> Is there interest in this project? Of course, the protocol for getting
> dependencies could be tweaked as based on feedback. I am the author of both
> this module and cget so it is possible to update cget to match feedback from
> the larger cmake community.
>

My own experience is that there is quite a bit of interest among CMake
users to have some sort of easy to use mechanism for handling project
dependencies (which is in part why I'm intending to put up a potential
solution soon myself). It starts getting hard when you consider that for a
particular dependency, different projects may want to build it with
different settings, flags, compilers, etc. which means you need to manage
not only "A depends on B", but also "A depends on B built this way". The
approach I'm taking is to make download easy, including specifying
precisely the repository details and download methods of the dependency,
but put the build of that dependency in the hands of the project. What
you've encapsulated as "recipes" can be provided separately either as a
collection of modules or whatever other way ends up being convenient and
used once the dependency has been downloaded. Thus, the download part can
be captured fairly generically and easy-to-use functionality can be provided
to facilitate it, but building is a different animal. From what I understand
of your work, it seems like the two have been more closely coupled in a
similar way to hunter, although the implementations are obviously very
different. Again, apologies to you and Ruslo if I've misunderstood
either one.

So yes, there's definitely interest and to be honest, there isn't going to
be one solution to rule them all. There's a place for the different
approaches (I see hunter and my own work as solving related but different
problems, for example), but my purpose here is to provide broader
information for further consideration and discussion. Hopefully I haven't
poured too much cold water on an otherwise very interesting area!


-- 
Craig Scott
Melbourne, Australia
https://crascit.com