Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] set ${pkg}_DIR for project dependencies to assist with building dependencies in a common build dir #1

Open
trxcllnt opened this issue May 7, 2021 · 2 comments
Labels
0 - Backlog In queue waiting for assignment feature request New feature or request

Comments

@trxcllnt
Copy link
Contributor

trxcllnt commented May 7, 2021

In node-rapids we're taking advantage of a centralized source and build cache for all our C++ dependencies. We have multiple CMakeLists.txt project files that depend on certain expensive-to-build RAPIDS libraries like libcudf, and we don't want to build those libraries multiple times.

CPM has a utility for telling it to centralize where it downloads dependency sources via CPM_SOURCE_CACHE, but it has no corresponding setting for centralizing dependency build dirs i.e. CPM_BINARY_CACHE.

We can trick CPM (everything else that uses FetchContent) into centralizing build dirs via the FETCHCONTENT_BASE_DIR variable.

However, we later run into issues when a built project's export targets go to look up its own dependencies via FindPackage(). The dependencies aren't installed, and they're not in a location FindPackage() knows to search.

And since FetchContent doesn't create directory names FindPackage recognizes as locations a package could be (i.e. thrust vs. thrust-build), we can't simply do list(APPEND CMAKE_PREFIX_PATH ${FETCHCONTENT_BASE_DIR}).

Our workaround for this is to set ${pkg}_DIR for a project's dependencies before calling CPMFindPackage(). We have this little CMake function to help us:

function(_set_package_dir_if_exists pkg dir)
  if (EXISTS "${CPM_BINARY_CACHE}/${dir}-build")
      set(${pkg}_DIR "${CPM_BINARY_CACHE}/${dir}-build" PARENT_SCOPE)
  endif()
endfunction()
@trxcllnt trxcllnt added feature request New feature or request ? - Needs Triage Need team to review and classify labels May 7, 2021
@robertmaynard robertmaynard added 1 - On Deck To be worked on next and removed ? - Needs Triage Need team to review and classify labels May 7, 2021
@robertmaynard robertmaynard self-assigned this May 10, 2021
robertmaynard added a commit to robertmaynard/rapids-cmake that referenced this issue May 14, 2021
@robertmaynard robertmaynard removed the 1 - On Deck To be worked on next label May 18, 2021
@robertmaynard
Copy link
Contributor

The primary issue holding this back is that this could break consuming projects when a cache is shared between multiple consumers.

For example consider a cache where dependency generates a shared library. Consumer A builds and links to version 1, Consumer B of the cache comes around and builds a new SHA1 version of dependency but with the same version number.
Running A will most likely now break :(

I think that for this to work safely it needs to be explicit on the consuming side to opt into the cache. The dependencies should generate / encode the location but shouldn't set it unless some requested flag is set by the consumer.

@trxcllnt Did I capture the issues correctly?

@trxcllnt
Copy link
Contributor Author

Consumer B of the cache comes around and builds a new SHA1 version of dependency but with the same version number.

Should note for completeness -- this would happen even if the version numbers are different.

To support this, CPM (or maybe CMake's FetchContent?) would need to include the git hash in the path passed to ExternalProject_Add(BINARY_DIR). If two projects declare a dependency on separate hashes/versions of a project, they should be built into separate locations.

For example, here's the CMakeLists.txt automatically generated when CPMFindPackage(arrow) downloads and builds Arrow:

include(ExternalProject)
ExternalProject_Add(arrow-populate
                     "UPDATE_DISCONNECTED" "False" "GIT_REPOSITORY" "https://github.com/apache/arrow.git" "GIT_TAG" "apache-arrow-1.0.1" "GIT_SHALLOW" "TRUE"
                    SOURCE_DIR          "<root>/.cache/cpm/arrow/8e5cd63fd71d7269acd5d779a6767e97ede977f5"
                    BINARY_DIR          "<root>/.cache/build/Release/arrow-build"
                    CONFIGURE_COMMAND   ""
                    BUILD_COMMAND       ""
                    INSTALL_COMMAND     ""
                    TEST_COMMAND        ""
                    USES_TERMINAL_DOWNLOAD  YES
                    USES_TERMINAL_UPDATE    YES
)

In the above case, BINARY_DIR should probably include the git hash as the SOURCE_DIR does.

robertmaynard added a commit to robertmaynard/rapids-cmake that referenced this issue Jul 29, 2021
To properly test `cpm` packages in rapids-cmake we need to fix
two major issues.

 1. A way to have tests that depend on the same CPM package
    not execute at the same time. This is required so we don't
    double checkout a project

 2. A way to checkout and build projects such as RMM that themselves
    use rapids-cmake without issue.

Number rapidsai#1 is solved by the introduction of the `SERIAL` keyword

Number rapidsai#2 is solved for most tests by having the `project_template`
that `.cmake` tests use get `rapids-cmake` via FETCH_CONENT. This
will allow us to verify local rapids-cmake changes against existing
projects that use rapids-cmake
rapids-bot bot pushed a commit that referenced this issue Aug 2, 2021
To properly test `cpm` packages in rapids-cmake we need to fix
two major issues.

 1. A way to have tests that depend on the same CPM package
    not execute at the same time. This is required so we don't
    double checkout a project

 2. A way to checkout and build projects such as RMM that themselves
    use rapids-cmake without issue.

Number #1 is solved by the introduction of the `SERIAL` keyword

Number #2 is solved for most tests by having the `project_template`
that `.cmake` tests use get `rapids-cmake` via FETCH_CONENT. This
will allow us to verify local rapids-cmake changes against existing
projects that use rapids-cmake

Authors:
  - Robert Maynard (https://github.com/robertmaynard)

Approvers: None

URL: #48
@robertmaynard robertmaynard added the 0 - Backlog In queue waiting for assignment label Oct 20, 2021
@robertmaynard robertmaynard removed their assignment Dec 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0 - Backlog In queue waiting for assignment feature request New feature or request
Projects
Status: No status
Development

Successfully merging a pull request may close this issue.

2 participants