Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Rework CMake search path settings #880

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

LecrisUT
Copy link
Collaborator

@LecrisUT LecrisUT commented Aug 29, 2024

  • Add a settings container for the relevant CMAKE_PREFIX_*, *_ROOT, etc. search settings and overrides
    • Naming and where to put. It might make more sense to be under cmake rather than at the top
  • Gate the install path's site-packages by an option
  • Gate the isolated build directory's install path by an option
    I suspect this is related to if site_packages != DIR.parent.parent:?
  • Add logic for entry_points(group="cmake.root"):
    • Use the key as the name for the <Pkg>_ROOT
    • Fail if more than one module tries to define the same <Pkg>
    • Should we support list of roots?
  • Add dict options mirroring the entry-points with higher priority
    This allows to override the entry-points, e.g. setting to "" to delete that entry
  • Update the schema
  • Expand the paths with variables for site-packages or others, e.g. config option could have {platlib}/other_project
    Not sure how to do this one.
  • Documentation
    • Add a new page on all the search logics
    • Encourage the usage of cmake.root over cmake.prefix (cannot deprecate because some consumers might not use find_package, e.g. swig)
  • Tests

Closes #831
Closes #885
Relates to #860 (still keeping it open because it doesn't address the specific issue)

Signed-off-by: Cristian Le <cristian.le@mpsd.mpg.de>
Signed-off-by: Cristian Le <cristian.le@mpsd.mpg.de>
@LecrisUT LecrisUT changed the title feat: Rework CMake prefix calculations feat: Rework CMake search path settings Aug 29, 2024
Signed-off-by: Cristian Le <cristian.le@mpsd.mpg.de>
Copy link

codecov bot commented Aug 29, 2024

Codecov Report

Attention: Patch coverage is 67.14286% with 23 lines in your changes missing coverage. Please review.

Project coverage is 83.62%. Comparing base (1124da7) to head (836c51e).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
src/scikit_build_core/builder/builder.py 53.65% 19 Missing ⚠️
src/scikit_build_core/cmake.py 33.33% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #880      +/-   ##
==========================================
- Coverage   83.90%   83.62%   -0.29%     
==========================================
  Files          74       75       +1     
  Lines        4387     4451      +64     
==========================================
+ Hits         3681     3722      +41     
- Misses        706      729      +23     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: Cristian Le <cristian.le@mpsd.mpg.de>
@LecrisUT LecrisUT marked this pull request as ready for review August 29, 2024 14:57
@LecrisUT LecrisUT requested a review from henryiii August 29, 2024 14:57
This variable is populated from the entry-point `cmake.module` and the option
`search.modules` similar to [`CMAKE_PREFIX_PATH`] section.

[`CMAKE_PREFIX_PATH`]: #cmake-prefix-path
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This link does indeed work, but the CI is reporting as not working 🤷

Comment on lines 10 to 18
if sys.version_info < (3, 10):
# Readers were introduced in 3.10
from importlib_resources import readers
elif sys.version_info < (3, 11):
# In 3.10 readers were implemented in importlib.readers
from importlib import readers
else:
# From 3.11 it is accessed as importlib.resources.readers
from importlib.resources import readers
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is rather tricky since this can be expanded into 3 possibilities:

  • importlib_resources.MultiplexedPath
  • importlib.readers.MultiplexedPath
  • importlib.resources.readers.MultiplexedPath

The latter 2 should work fine as-is, but the former one is tricky because in python 3.9 readers doesn't exist and it shouldn't be creating a MultiplexedPath when going through the resource_reader. So it shouldn't merit adding importlib_resources as a dependency to python<3.10.

Maybe I should make a different compatibility to importlib/readers.py instead and make MultiplexedPath a dummy class?

- Validate the path types of the entry-points
- Update the entry-points values with the settings value
- Flatten the entry-points to a list or add them as a dict

- Print more debug messages
- Fix the settings type to string

Signed-off-by: Cristian Le <cristian.le@mpsd.mpg.de>
Comment on lines +190 to +192
_handle_search_paths(
"cmake.module", self.settings.search.modules, self.config.module_dirs
)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we unify the naming convention across these 3 to the entry-point name? I.e.:

Suggested change
_handle_search_paths(
"cmake.module", self.settings.search.modules, self.config.module_dirs
)
_handle_search_paths(
"cmake.module", self.settings.search.module, self.config.module
)

@henryiii
Copy link
Collaborator

henryiii commented Sep 3, 2024

(Sorry I'm a bit slow reviewing, classes start this week and I'm teaching again)

Comment on lines +311 to +314
search.use-install-prefix = true

# Add the wheel build path to the CMake prefix paths.
search.use-build-prefix = true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not what these paths are. It's just trying to find site-packages, and sometimes the wrong path is reported (I think possibly if there's a user site-packages?), so we also make sure scikit-build-core's parent directory is included. I think this should be just one setting: search.use-site-packages.


# List or dict of CMake module search paths. Dict from is used to override
# another package's entry-point definition. Populates `CMAKE_MODULE_PATH`.
search.modules = ""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think supporting both lists and dicts over complicates it. I'd assume the values are Python modules; if they were paths you could just easily do this from CMake, and we try not to add things that can be done easily from CMake for the most part.

We could possibly replace this with a filter that allowed users to filter out entry points by name (that was my original idea), and everything else could be done from CMake, perhaps?

def _sanitize_path(path: Any) -> list[Path] | None:
if isinstance(path, MultiplexedPath):
# pylint: disable-next=protected-access
return path._paths
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not fond of using a hidden member here. How about using the modern Python's support for directories, and just using the hidden member on older Pythons?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using the modern Python's support for directories

I am not sure I get this part. The MultiplexedPath can point to /rootA/base + /rootB/base and we need the to point to all possible base folders to pass it to CMake. The only support that I know of would be joinpath like support, but that wouldn't help us here. Maybe we could run iterdir without recursion 🤔

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modern Python (3.12) can do:

with importlib.resources.as_file(some_directory) as d:
   # d is a real directory

I believe this combines all the directories into a real directory for the duration of the block.

Copy link
Collaborator Author

@LecrisUT LecrisUT Sep 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we need to pass these as str to build the CMakeCacheInit. When I checked the implementation 1 of MultiplexedPath, whenever it transformed to a pathlib.Path and it was pointing to a directory-like path, it would take the first value of _paths, so that would not help here. E.g. in the case of editable install, it can point to either the python sources or the CMake install files, but not both when we cast it to pathlib.Path.

One thing that would work is to iterdir until we hit pathlib.Path and step back from there, but we would need to do that until all MultiplexedPath are resolved in order to make sure all possible paths are resolved.

Footnotes

  1. https://github.com/python/importlib_resources/blob/d84ca376316016420297fbc310ba181ca7d2864d/importlib_resources/readers.py#L104-L111

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is guaranteed to be a single real path to a real directory inside the context manager. You can get a string to it. I believe it copies the merged directory contents to a temporary directory. After the context manager, the temporary directory is removed. The context manager just has to live for the duration of the build.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I checked and indeed that is the case. I'm trying to think if there are some breakages because of that, but nothing related to the current implementation thus far. It could get wonky if we support editable installs pointing to the build directory, but there would be many other weirdness there.

If we use context for this, how do we keep the context alive for all the entry-points during the build-install-archive? E.g. the rpaths could use CMAKE_INSTALL_RPATH_USE_LINK_PATH and we should keep it resolvable until auditwheel can patch those

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I think we can't avoid using _paths then.


```toml
[tool.scikit-build.search]
prefixes = ["/path/to/prefixA", "/path/to/prefixB"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, these are just paths. You can replace this with the following one line in CMakeLists.txt:

list(APPEND CMAKE_PREFIX_PATH /path/to/prefixA /path/to/prefixB)

We should only add features hard to implement in CMake.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to support this anyway because it could be used via overrides and cli. Similarly for the entry-point having a path as value instead of an importable python reference

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Entry points are supposed to be modules or objects in modules, not arbitrary strings. The syntax is <module>[:<attribute>].

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And you can still do it via overrides and CLI using defines.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I guess deleting the entry-point would also work. I was considering when packaging for Fedora and we would want the entry-point to point to the system path in /usr/local. But I guess there are other issues with that since the entry-points are defined when building the sdist.

What about checking if the entry-point points to a str or pathlib.Path/traversable or list of those. If not use the current logic of loading importlib.resources.files


# Add site-packages to the prefix path for CMake
site_packages = Path(sysconfig.get_path("purelib"))
self.config.prefix_dirs.append(site_packages)
if self.settings.search.use_install_prefix:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said above, that's not what this is. We are trying to add site-packages, and in some cases, that's not where scikit-build-core is installed, so we make sure we always add that too. We should probably be adding platlib too.

Copy link
Collaborator Author

@LecrisUT LecrisUT Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so each isolated build path is unique to the dependent packages being built and not to the call of pip install? I was mostly considering about that situation where you would need the prefixes of dependencies that are in build-system.requires only.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot view any other site-packages besides the isolated one (that's the isolated part!). The wheel is built separately from installing it. This was just a workaround for an issue where "site-packages" wasn't pointing to the place scikit-build-core was installed, I think due to user site-packages for non-isolated installs, but I don't remember exactly. It could also have been the include-system-site-packages = true setting for a (non-isolated) venv. But it's not install vs. build.

Copy link
Collaborator Author

@LecrisUT LecrisUT Sep 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot view any other site-packages besides the isolated one (that's the isolated part!).

Wait, but we saw that breaking in #860/#862 where the installing site-packages were included. I'll run the build for spglib again to confirm that it was indeed that case.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed this is what I got:

$ pip -v install ./spglib --config-settings=logging.level=DEBUG

  2024-09-09 10:41:40,787 - scikit_build_core - DEBUG - SITE_PACKAGES: /home/lecris/CLionProjects/spglib/venv/lib/python3.12/site-packages
  2024-09-09 10:41:40,787 - scikit_build_core - DEBUG - Extra SITE_PACKAGES: /home/lecris/CLionProjects/spglib/venv/lib/python3.12/site-packages
  2024-09-09 10:41:40,798 - scikit_build_core - DEBUG - Default generator: Ninja

    set(CMAKE_PREFIX_PATH [===[/home/lecris/CLionProjects/spglib/venv/lib/python3.12/site-packages;/tmp/pip-build-env-b9zh9nkx/overlay/lib/python3.12/site-packages]===] CACHE PATH "" FORCE)

note the actual value is in CMAKE_PREFIX_PATH due to a minor bug:

self.config.prefix_dirs.append(DIR.parent.parent)
logger.debug("Extra SITE_PACKAGES: {}", site_packages)

In this case we have /tmp/pip-build-env- site-packages which points to where scikit-build-core was installed, which in the issolated build it would be this "build environment". And what I find is that the numpy is also installed in the same paths, and this is mostly what I was thinking of making configurable.

In contrast if I run it without build-isolation:

$ pip install scikit-build-core numpy
$ pip -v install ./spglib --config-settings=logging.level=DEBUG --no-build-isolation

  2024-09-09 10:53:58,703 - scikit_build_core - DEBUG - SITE_PACKAGES: /home/lecris/CLionProjects/spglib/venv/lib/python3.12/site-packages
  2024-09-09 10:53:58,703 - scikit_build_core - DEBUG - FindPython backport activated at /home/lecris/CLionProjects/spglib/venv/lib/python3.12/site-packages/scikit_build_core/resources/find_python
  2024-09-09 10:53:58,706 - scikit_build_core - DEBUG - Default generator: Ninja

    set(CMAKE_PREFIX_PATH [===[/home/lecris/CLionProjects/spglib/venv/lib/python3.12/site-packages]===] CACHE PATH "" FORCE)

Of course site_packages != DIR.parent.parent is not a good check for if it's in the build-isolation because it intersect with scikit-build-core being editable installed, but in that case we know for sure we are running under --no-build-isolation and the other build dependencies are either in site_packages or their own editable path, which we do not have a way to control without them exporting their entry-point, so it technically still does what it "should do" similar to if scikit-build-core is not editable installed but --no-build-isolation is still used.

site_packages != DIR.parent.parent DIR.parent.parent use-build-prefix
scikit-build-core in isolation True /tmp/pip-build-env-*/... "as intended"
pip install scikit-build-core (without isolation) False venv/... ignored
pip install -e scikit-build-core (without isolation) True ~/scikit-build-core/src edge case

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strange, why is Path(sysconfig.get_path("purelib")) reporting the original venv when this is "isolated"? It seems to me it should report the isolated environment and you should not be able to get to the original one at all. Is that path in sys.path?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welp, how would I test that? It is not reported in the debug log, and if I would run it within the venv than yeah it should be included right?

$ source venv/bin/activate && python3 -c "import sys;print(sys.path)"
['', '/usr/lib64/python312.zip', '/usr/lib64/python3.12', '/usr/lib64/python3.12/lib-dynload', '/home/lecris/CLionProjects/spglib/venv/lib64/python3.12/site-packages', '/home/lecris/CLionProjects/spglib/venv/lib/python3.12/site-packages']

Probably when running with isolation it would run in a different created environment in which case it would need to call sys.path within the build steps to show its true value right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be reported in the debug log after #896. :)

(And yes, sys.path inside the build steps is needed)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's an update on it:

  2024-09-11 11:52:04,356 - scikit_build_core - DEBUG - SITE_PACKAGES: /home/lecris/CLionProjects/spglib/venv/lib/python3.12/site-packages
  2024-09-11 11:52:04,357 - scikit_build_core - DEBUG - Extra SITE_PACKAGES: /tmp/pip-build-env-oiycakwm/overlay/lib/python3.12/site-packages
  2024-09-11 11:52:04,357 - scikit_build_core - DEBUG - PATH: ['/home/lecris/CLionProjects/spglib/venv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process', '/tmp/pip-build-env-oiycakwm/site', '/usr/lib64/python312.zip', '/usr/lib64/python3.12', '/usr/lib64/python3.12/lib-dynload', '/tmp/pip-build-env-oiycakwm/overlay/lib/python3.12/site-packages', '/tmp/pip-build-env-oiycakwm/overlay/lib64/python3.12/site-packages', '/tmp/pip-build-env-oiycakwm/normal/lib/python3.12/site-packages', '/tmp/pip-build-env-oiycakwm/normal/lib64/python3.12/site-packages', '/tmp/pip-build-env-oiycakwm/normal/lib/python3.12/site-packages/setuptools/_vendor']

The only path that is in the same venv is from pip/_vendor/pyproject_hooks/_in_process

henryiii added a commit that referenced this pull request Sep 10, 2024
Pulling out this fix from
#880 for inclusion
in a patch release.

---------

Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
Co-authored-by: Cristian Le <cristian.le@mpsd.mpg.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants