[CMake] Interest in adding to CMake the CMakeGet module to get dependencies

P F pfultz2 at yahoo.com
Sat Aug 5 23:29:25 EDT 2017


> On Aug 4, 2017, at 9:18 PM, Craig Scott <craig.scott at crascit.com> wrote:
> 
> TLDR: There may be other work already covering similar capabilities (e.g. hunter and WIP that I'm uncloaking here). This response is mostly about providing broader discussion.
> 
> 
> Paul it seems you've been busy since you first asked about cget <https://cmake.org/pipermail/cmake/2016-March/062994.html> on this list last year. ;)

Thanks, cget has improved a lot in the last year.

> 
> In the interests of disclosure, this latest email of yours has prompted me to uncloak some of my own work which may be relevant. This work has been in development and testing on real world projects for the past 2 years or so. Without wanting to hijack your thread, let me give a little info about it and then see my responses to your comments at the end for how/why I think this may be relevant to your request for comments.
> 
> ExternalProject can be used in a different way to allow it to perform downloads at configure time rather than build time. You can see a relatively straightforward implementation of this here:
> 
>     https://github.com/Crascit/DownloadProject <https://github.com/Crascit/DownloadProject>
> 

That seems to support the `add_subdirectory` workflow, but how would it work when using the `find_package` workflow? That workflow is useful for package management tools (including those which pull binaries).
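
To make the distinction concrete, here is a rough sketch of the two workflows side by side. The package `foo`, the `foo::foo` target, the `deps/foo` path and the `FOO_FROM_SOURCE` switch are all hypothetical placeholders, not taken from any real project.

```cmake
cmake_minimum_required(VERSION 3.5)
project(app CXX)
add_executable(app main.cpp)

# Hypothetical switch, only here to show both workflows in one file.
option(FOO_FROM_SOURCE "Build foo as part of this project" OFF)

if(FOO_FROM_SOURCE)
  # add_subdirectory workflow: foo's sources become part of this build,
  # so its targets are created by this configure run.
  add_subdirectory(deps/foo)
  target_link_libraries(app PRIVATE foo)
else()
  # find_package workflow: foo was installed beforehand (by the user,
  # a package manager, or a tool such as cget) and is only located here.
  find_package(foo REQUIRED)
  target_link_libraries(app PRIVATE foo::foo)
endif()
```

With the second branch, whoever installs `foo` decides how it is built, and the project only needs it to be findable, for example through `CMAKE_PREFIX_PATH`.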

> This has the advantage of leveraging all of the various download methods ExternalProject already provides, meaning they are already fairly widely used and tested. Any improvements to ExternalProject also come for free and documentation for the various methods is also provided (I recently overhauled the ExternalProject docs and these will appear in the 3.10.0 release, but they can be viewed in the master docs <https://cmake.org/cmake/help/git-master/module/ExternalProject.html> in the meantime). ExternalProject also has the advantage that it supports updating after the initial download in the case of repository sources like git, svn, etc.
> 
> The uncloaking part of my response is that I have been working on a full dependency download framework over the past year and half which is built on top of the above DownloadProject implementation. It supports hierarchical dependencies across complex project structures and makes it very easy for projects to pull in other projects, including support for higher level projects being able to override dependency details of projects lower in the dependency tree if they want to. I've been incubating this privately in a real world company environment to iron out the kinks and ensure the interface that projects interact with supports the right set of features and that less knowledgable users can understand it, etc. I'm planning on putting both DownloadProject and the dependency framework up for review to merge into CMake within the next few months, all going well. Whereas your work seems to focus on building and/or re-using an installed/binary result, my work focuses more on making external projects part of your main build. This has advantages like using the same compiler settings, toolchain details, etc. and all of the dependency targets are automatically available to the rest of the project if the dependency uses CMake as its build system. I'll postpone further details on it until it is ready for review, but that should be enough for context for my responses to your comments further below.
> 
> There's also hunter <https://github.com/ruslo/hunter>, which you've already been made aware of and which is closer to your proposal in terms of the set of problems it tries to solve. Hunter has been around for a while, is reasonably well known and on the face of it, already seems to do all the things cmake-get is trying to do. It is also based on ExternalProject. It may be helpful if you could show how your work differs from what hunter already provides so that its value to the CMake community is clear.
> 
> Hopefully that's not too overwhelming for background/context. See my responses in the remainder below.
> 
> 
> 
> On Sat, Aug 5, 2017 at 5:57 AM, paul via CMake <cmake at cmake.org <mailto:cmake at cmake.org>> wrote:
> Hi,
> 
> I have written a cmake module to get dependencies using the cget protocol
> here:
> 
> https://github.com/pfultz2/cmake-get <https://github.com/pfultz2/cmake-get>
> 
> This is different than `ExternalProject`:
> 
> * `ExternalProject` happens only during build, whereas this module works in
> both config and script mode. 
> 
> As detailed above, you can use ExternalProject at configure time too and there's already a relatively simple, generic implementation available showing how to do it for the download case.
> 
>  
> * In config mode, the user can just do:
> 
> cmake_get(<pkg> PREFIX ${CMAKE_CURRENT_BINARY_DIR}/deps)
> find_package(<pkg> HINTS ${CMAKE_CURRENT_BINARY_DIR}/deps)
> 
> Since the dependency is available at config time.
> 
> This is a little naive. What if there is a complex hierarchy of projects and more than one wants to use a particular dependency, but they both want to specify different versions or repository details? How would that be resolved?

There is no SAT solving, and having a package manager with this would be useful. The focus of this is just to make the dependencies available for a given project.

> The "get" functionality looks quite limited, both in terms of the supported methods and the ability to specify method-specific details. ExternalProject already has pretty good coverage for this. In the case of multiple projects wanting different dependency details, you need to decide how to choose one project's details over another, or if you want to allow both to be downloaded/built then how to prevent them from resulting in inconsistencies or clashes (e.g. if there's "A links to B", you really don't want to have to ask "which B?").
> 
> The find_package() call would also seem to rely on the dependent project implementing install(EXPORT) rules for all the things you want to use from it, or at least providing <pkg>Config.cmake files or their equivalent to make targets, etc. available. Many projects will do this, but quite a few won't. You might be able to use find_library(), find_program() and find_file() in such cases

Yes, in the example I use `find_package`, but if that isn’t available, then you will want to use cmake’s other `find_*` commands, which helps make your library package-manager friendly.
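
As a minimal sketch of that fallback, assuming a hypothetical library `foo` that installs `foo.h` and `libfoo` but may not ship a `fooConfig.cmake` (and, if it does, that the config file defines a `foo::foo` target):

```cmake
# Try the package's own config file first.
find_package(foo CONFIG QUIET)

if(NOT foo_FOUND)
  # No config file: locate the pieces with the other find_* commands.
  find_path(FOO_INCLUDE_DIR NAMES foo.h)
  find_library(FOO_LIBRARY NAMES foo)

  if(NOT FOO_INCLUDE_DIR OR NOT FOO_LIBRARY)
    message(FATAL_ERROR "foo was not found; install it or set CMAKE_PREFIX_PATH")
  endif()

  # Wrap the results in an imported target so the rest of the build
  # can link against foo::foo either way.
  add_library(foo::foo UNKNOWN IMPORTED)
  set_target_properties(foo::foo PROPERTIES
    IMPORTED_LOCATION "${FOO_LIBRARY}"
    INTERFACE_INCLUDE_DIRECTORIES "${FOO_INCLUDE_DIR}")
endif()
```

Because these commands search the prefixes listed in `CMAKE_PREFIX_PATH`, a package manager can install `foo` wherever it likes and just point the build at that prefix.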

> , but these all require the dependency to have already been built. You generally don't want things building during the config stage if you can avoid it, as a slow config stage is pretty annoying as a developer. If I'm understanding your implementation correctly, it builds the dependency if it hasn't previously, which may make the config stage very slow when it can't use a cached build from earlier.

Yes, the config can be slow. The other problem is the toolchain is not transitive. So if the user has a special toolchain, it won’t invoke cmake the same way. This is why using a separate script is preferred.

The purpose of this is to provide a way to get the dependencies when there is no dependency management tool. Perhaps the user doesn’t want to install a dependency management tool, or they don’t want to install each dependency manually. So this can help, but at the same time a user may be using their own dependency management tool, so it is important not to interfere with the `find_*` commands in cmake. That is why in config mode the user can disable getting dependencies with `BUILD_DEPS=Off`.
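
A rough sketch of how that looks from the project side. The `include(CMakeGet)` module name and the `example/foo` package source are assumptions for illustration, and the guard is spelled out explicitly at the call site here regardless of what the module checks internally:

```cmake
# BUILD_DEPS is the switch described above; everything named "foo" is hypothetical.
option(BUILD_DEPS "Get dependencies at configure time" ON)
set(DEPS_PREFIX ${CMAKE_CURRENT_BINARY_DIR}/deps)

if(BUILD_DEPS)
  include(CMakeGet)  # assumed module name
  cmake_get(example/foo PREFIX ${DEPS_PREFIX})
endif()

# This runs either way. With BUILD_DEPS=Off the hint points at an empty
# location, so foo is found wherever the user's own tool installed it.
find_package(foo REQUIRED HINTS ${DEPS_PREFIX})
```

A user with their own dependency management tool would then configure with `-DBUILD_DEPS=Off` and the `find_package` call behaves exactly as it would in a project that never used the module.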

> I believe this is more or less what hunter does too (Ruslo, feel free to clarify/correct/call-me-nuts here!). Compare this with the approach my framework is taking where the build is still deferred to the build stage, only the download happens at config time.

That is because it uses `add_subdirectory`, which is not very package-manager friendly: if someone wants to install the dependencies themselves, it will download and build them again, which may not be what the user wants. This can be made to work, but it requires creating a superproject which contains the project plus its dependencies.

> BTW, if you really need the external project built at config time, this stackoverflow question and answer <https://stackoverflow.com/q/36084785/1938798> may also be of interest (mentioned more for the benefit of others than for this discussion around cmake-get).

I don't see that answer using ExternalProject.

> 
>  
> * A script can be easily written to download the dependencies, which is a much
> better approach to deal with toolchain transitivity, which is a problem that
> exists with both `ExternalProject` and config mode.
> 
> You may want to consider separating out the downloading from the building. What I've found is that there are situations where all you want is to download some content at config time but perhaps not build it. Potentially you could just download and unpack a pre-built tarball directly, so there may be no need to build at all. Conversely, if you download the sources and make the external dependency part of your main project's build, then the whole toolchain, compiler settings, etc. problem largely goes away (if the dependency project uses CMake and doesn't do things which assume it is the top level of the build). 

You can install binaries with `cmake_get(<pkg> PREFIX ${CMAKE_CURRENT_BINARY_DIR}/deps CMAKE_FILE binary)`.

> 
> There's also the matter of if you script something, what scripting language do you use? For cross-platform projects, unix shell scripts aren't always supported. Cross-platform languages like python come with their own dependencies (i.e. the python interpreter). You may find that you end up using cmake as your scripting language, since that's about the only common one that will work everywhere without requiring any further dependencies.

That's basically what cget does: it uses cmake as its scripting language.

> Now you may find that the commands you would have put in such a cmake script could have been made part of the main project to begin with (or made available as a module).

Which is why cget (or cmake-get) can install the dependencies by default by directly invoking `cmake` without any extra scripting. Of course, build scripts shouldn’t always build their dependencies, as that would interfere with a dependency management tool.

> Again, this starts pointing back to building the project as part of the main build, especially when the toolchain and compiler flag consistency advantages are taken into consideration.

A dependency management tool will take care of invoking every package with the same toolchain settings. CMake also makes this easy with toolchain files.
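
For illustration, a toolchain file can be as small as the following. The file name and the cross-compiler names are placeholders, not a recommendation:

```cmake
# toolchain.cmake (hypothetical example for an ARM Linux target)
set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSTEM_PROCESSOR arm)

# Placeholder compiler names; substitute whatever the toolchain provides.
set(CMAKE_C_COMPILER arm-linux-gnueabihf-gcc)
set(CMAKE_CXX_COMPILER arm-linux-gnueabihf-g++)
```

The tool then passes the same `-DCMAKE_TOOLCHAIN_FILE=/path/to/toolchain.cmake` to the `cmake` invocation of every package it builds, so the project and all of its dependencies see identical compiler settings.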

> 
>  
> * Recipes can be reused for building projects that don't use cmake, or don't
> follow the standard cmake installation flow. Furthermore, an existing cget
> recipe can be used as well.
> 
> This can be very useful for publicly accessible projects, especially for popular ones. I see that you've already got a collection of recipes for a variety of projects. I haven't looked deeply at your implementation, but I take it that it would support users working with private repos using their own private recipe stores too?

Setting up a repo with private recipes is easy. It is just a matter of pointing it to the private URL.

> I believe hunter offers similar functionality to this as well, but I could be wrong. Ruslo?
> 
> Even for the approach I'm working on, non-CMake projects can be handled fairly easily with existing ExternalProject functionality. If I need files from the dependency at config time (e.g. headers so I can query for version details), I can use DownloadProject to download the source at configure time and then point ExternalProject at the downloaded source rather than specifying a download method and have ExternalProject trigger the relevant build commands (e.g. a Makefile-based project). Again, this leverages functionality already provided by CMake rather than reinventing the wheel formulating my own logic to manage external build steps.
> 
>  
> Is there interest in this project? Of course, the protocol for getting
> dependencies could be tweaked based on feedback. I am the author of both
> this module and cget so it is possible to update cget to match feedback from
> the larger cmake community.
> 
> My own experience is that there is quite a bit of interest among CMake users to have some sort of easy to use mechanism for handling project dependencies (which is in part why I'm intending to put up a potential solution soon myself). It starts getting hard when you consider that for a particular dependency, different projects may want to build it with different settings, flags, compilers, etc. which means you need to manage not only "A depends on B", but also "A depends on B built this way".

Yes, a SAT solver would be needed to solve these types of problems. For now, cget just uses a first-come, first-served policy, which pushes the issue on to the user to solve.

> The approach I'm taking is to make download easy, including specifying precisely the repository details and download methods of the dependency, but put the build of that dependency in the hands of the project.

Yes, your dependency management is built around the `add_subdirectory` workflow, whereas cget uses the `find_package` workflow.

> What you've encapsulated as "recipes" can be provided separately either as a collection of modules or whatever other way ends up being convenient and used once the dependency has been downloaded.

The recipe encapsulates the URL to download as well.

> Thus, the download part can be captured fairly generically and easy-to-use functionality be provided to facilitate it, but building is a different animal. From what I understand of your work, it seems like the two have been more closely coupled in a similar way to hunter, although the implementations are obviously very different. Again, apologies to yourself and Ruslo if I've misunderstood either one.

The implementation is different, but I wonder if there is room for interoperability between hunter and cget.

> 
> So yes, there's definitely interest and to be honest, there isn't going to be one solution to rule them all.

No, there won’t be one solution, nor will there be a universal package manager for C/C++. This is why it is important to have a dependency management tool that can interoperate with other package managers, and build scripts which are orthogonal to the package manager.

> There's a place for the different approaches (I see hunter and my own work as solving related but different problems, for example), but my purpose here is to provide broader information for further consideration and discussion.

Yes, but when you want to install a dependency, it would be good to have some specification of how to get and install that dependency, very likely built around cmake. Cget provides a simple protocol for this, but it could possibly be tweaked for better flexibility.

> Hopefully I haven't poured too much cold water on an otherwise very interesting area!

Thanks for the reply.

