Add controls for numpy version #678
Conversation
Updating Dockerfile and Compose build config file for ngen-based worker images (calibration, partitioning, testing, etc.) to utilize ARG sourced from environment config to control the version of numpy that is installed.
Updating example for DMOD deployment environment config with section for numpy, new numpy constraint environmental variable used by worker image builds, safe default value for new variable, and comments explaining.
Updating project requirements.txt and GUI service requirements file (python/gui/dependencies.txt) to constrain numpy version consistent with default used by worker image builds set up in example.env.
docker/main/docker-build.yml
Outdated
```diff
@@ -59,6 +60,7 @@ services:
       context: ./ngen
       target: rocky_ngen_build_testing
       args:
+        NUMPY_CONSTRAINT: ${NUMPY_INSTALL_CONSTRAINT:?Please set global NUMPY_INSTALL_CONSTRAINT in environment config}
```
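For context on the plumbing under discussion: a Compose build arg like the one above only takes effect if the Dockerfile declares a matching `ARG` and uses it. A minimal sketch of the intended flow (the empty default value is an illustrative assumption, not from this PR):

```dockerfile
# Dockerfile side: declare the build arg that Compose passes in.
# An empty default leaves numpy unpinned when no constraint is supplied.
ARG NUMPY_CONSTRAINT=""

# The constraint is appended directly to the requirement specifier, so
# NUMPY_CONSTRAINT=">=1.24,<2.0" yields: pip install "numpy>=1.24,<2.0"
RUN pip install --no-cache-dir "numpy${NUMPY_CONSTRAINT}"
```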
This is an abuse of the args parameter. Do not use this to inject python versions. Stick to setup.py and requirements.txt.
I think we should instead add a `requirements.txt` for the ngen image and put things like this there. I think that will be easiest to maintain and understand in the long run. I think it also results in a simpler solution. Thoughts?
@aaraney, @christophertubbs, there are subtle implications to this in the context of building a Docker image. While there may be others as well, at minimum:

We can, of course, add subtle complexity to the way we design what goes into requirement files, whether we use different ones for different build stages, etc. But if we start going down that road, I think what we end up with is not any better of a solution. Honestly, if allowing configurable control of the version really seems like a problem, what are both of your thoughts on just hard-coding the numpy version constraints into the Dockerfile in the prior places where it was being installed?
If the issue is with a single requirements file... there's nothing keeping us from having a single requirements file. We can also produce new build 'hooks' where extra files may be COPYed in if behavior needs to be altered in different stages. What we have, right now, is already too complicated. We need to start moving in another direction to find an easier-to-follow build pipeline. If not making something configurable makes the build process prohibitively resource intensive, something is wrong and we need to take a step back and evaluate our process(es).

For now, hard code the value since it would be global anyways. If a user needs to color outside the lines and use a different value, they may change it themselves.

Ideally, DMOD needs to get out of the game of setting up compilation rules for another product. We don't set compilation flags or environment variables just to prepare pandas for use. DMOD is a consumer of NextGen and compatible formulations. Instructions and constraints for use need to be provided by them, not developed by us. If the DMOD image creation process is responsible for setting this up, we're putting ourselves in a position of eternal catch-up.
You aren't wrong, but that's out of scope for what is essentially a bug fix.
Just to be clear, the lack of configurability doesn't make the builds less efficient. But artificially tethering together the installation of all the Python packages used throughout the worker images would.
Sure, I can make that change for here, and we can separate off the idea of adding (or not) configurability to numpy versioning for workers.
Providing compatible, consistent, reproducible compute environments (whether these be Docker images/containers or some other mechanism) for model execution is a fundamental feature of DMOD, and that means it has to be concerned with compilation flags and environment details and anything else users would have to deal with to set up an environment if they were just trying to use ngen directly. There is certainly room to discuss which of these DMOD opens up via control/configure mechanisms versus what DMOD just handles automagically behind the scenes. And if/when ngen can add its own automation for installing its dependencies like numpy or begins distributing installation packages, great; DMOD can leverage those instead. Until it does though, for DMOD to be able to run ngen, there's just no way around dealing with this kind of stuff.
I think NOAA-OWP/ngen#856 will solve our issues and obsolete this PR. We will still at some point need to revisit how we handle external bmi module dependency versioning, but that is a whole can of worms that I think we should kick down the road!
@aaraney, no, I think NOAA-OWP/ngen#856 will just cause DMOD image builds to fail for a different reason (the CMake error instead of the unit test failure). Unless ngen actually handles installing numpy itself, we are going to have to address what version of numpy gets installed in the image. As discussed above, I'll make changes so that we hard code this for now to the latest 1.x.
Extra context, there is already a check in
Alright, so after a side channel conversation with @robertbartel, at least IMO, I think we can achieve the goals of this PR using a pip constraints file. In doing so, this lets us control what version or version ranges of certain packages are installed in the image. Likewise, incompatibilities will result in a build failure.
```diff
@@ -290,14 +292,15 @@ RUN if [[ "${NETCDF_CXX_VERSION}" == "latest" ]]; then \
 FROM rocky-base as rocky-ngen-packaged-deps

 ARG ROCKY_NGEN_DEPS_REQUIRED
+ARG NUMPY_CONSTRAINT

 # TODO: later, go back and change all pip3/python3 to just pip/python (but leave for now to limit scope)
 # Note that this includes numpy, which is needed for Python BMI support, regardless of BMI module
 USER root
```
I figure we could move the required python packages to a `requirements.txt` file (so, `pip`, `wheel`, `packaging`, and `numpy`) and then create a `constraints.txt` that we use to enforce the allowed versions of the required and other packages in `constraints.txt`. Thoughts?
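As a concrete sketch of that split (file contents below are illustrative assumptions, not taken from the repo):

```text
# requirements.txt -- the packages the image actually needs installed
pip
wheel
packaging
numpy

# constraints.txt -- allowed versions; a constraints file only pins
# packages that something else asks to install, it installs nothing itself
pip>=23.0,<23.1
numpy>=1.24,<2.0
```

pip would then be invoked as `pip install -r requirements.txt -c constraints.txt`, keeping the "what to install" and "which versions are allowed" concerns in separate files.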
Suggested change:

```dockerfile
USER root
COPY requirements.txt ${WORKDIR}/requirements.txt
COPY constraints.txt ${WORKDIR}/constraints.txt
```
Good find! I think it's a good solution.
It's been a while since I've had to worry about this, but I believe that editing the `constraints.txt` and creating a new hash for the file would cause a new `docker build` to start at the line for the `constraints.txt` file, so I think we can be sure that a change there will just about always be reflected after a build.
I still don't like the idea of coupling all the Python packages together in a requirements.txt, as I don't think that fits with how the Dockerfile is set up. It's totally fair to discuss whether that needs to change - e.g., should we really support building the image both with and without Python support (we don't really need numpy in the without option)? - but it needs to be discussed separately.
But the `PIP_CONSTRAINT` and constraints.txt settings sound fine. Those alone should be sufficient to resolve the immediate issue.
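For reference, pip maps the `PIP_CONSTRAINT` environment variable to its `--constraint` option, so setting it once in the image applies the constraints file to every later `pip install` during the build. A minimal sketch (the path is an illustrative assumption):

```dockerfile
# Bring the constraints file into the image before any pip usage.
COPY constraints.txt /tmp/constraints.txt

# pip reads PIP_CONSTRAINT and behaves as if --constraint /tmp/constraints.txt
# were passed to every subsequent `pip install` in the build.
ENV PIP_CONSTRAINT=/tmp/constraints.txt

# No explicit version needed here; resolution is subject to the constraints file.
RUN pip install --no-cache-dir numpy
```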
@christophertubbs, you raise a good point I hadn't considered: depending on when the constraints.txt file is set up in the image, that will once again have implications on build caching and efficiency. That isn't an absolute blocker, but I want to think on it a little more to see if there's any way to avoid that tradeoff. For the sake of getting the immediate issue fixed, I'm going to go ahead and prepare the changes to go this route, but I'm not sure I want it to be permanent just yet. But that can be spun off to a separate issue and dealt with later.
> I still don't like the idea of coupling all the Python packages together in a requirements.txt, as I don't think that fits with how the Dockerfile is set up.
Yeah, I agree. I'm just talking about the required packages to build ngen with python support. So, `numpy`, `pip`, `wheel`, and `packaging` for now. It just seems like a nice place to keep those baseline required packages rather than inlining them in the `Dockerfile`.
docker/main/ngen/Dockerfile
Outdated
```diff
 # TODO: later, go back and change all pip3/python3 to just pip/python (but leave for now to limit scope)
 # Note that this includes numpy, which is needed for Python BMI support, regardless of BMI module
 USER root
 RUN dnf update -y && dnf install -y ${ROCKY_NGEN_DEPS_REQUIRED} && dnf clean -y all \
     && ln -s $(which python3) $(which python3 | sed 's/python3/python/') \
     && pip install --no-cache-dir "pip>=23.0,<23.1" wheel packaging \
-    && if [ "${NGEN_WITH_PYTHON}" == "ON" ]; then pip install --no-cache-dir numpy; fi
+    && if [ "${NGEN_WITH_PYTHON}" == "ON" ]; then pip install --no-cache-dir numpy${NUMPY_CONSTRAINT}; fi
```
This would become:
```diff
-    && if [ "${NGEN_WITH_PYTHON}" == "ON" ]; then pip install --no-cache-dir numpy${NUMPY_CONSTRAINT}; fi
+    && if [ "${NGEN_WITH_PYTHON}" == "ON" ]; then pip install --no-cache-dir --requirement requirements.txt --constraint constraints.txt; fi
```
Rolling back initial changes in favor of a different approach.
Adding constraints file to ngen (and related) image build to control numpy and pip versions as needed for compatibility, using this instead of doing so directly within the Dockerfile itself in pip commands.
```diff
@@ -0,0 +1,5 @@
+### Constrain pip
+pip>=23.0,<23.1
```
I don't think this is necessary after NOAA-OWP/t-route#804 was merged.
Doesn't have to be addressed here, just putting it on your radar @robertbartel!
This looks good to me! Thanks, @robertbartel!
@christophertubbs, any other concerns before this moves forward? I might either need explicit approval from you or to dismiss your previous review, in order to comply with branch protections (the latter of which I don't want to do until you've had a chance to raise any remaining discussion items).
Changes reengineered using different approach based on previous review discussion.
Rolling back initial changes in favor of a different approach.
~~Add support for environment-config-controlled constraints on the version of numpy installed by the image build process for ngen-related Docker images.~~ (Note: scrapping original idea in favor of a different approach.)

Add support for pip-constraints-file based control over the version of numpy installed by the image build process for ngen-related Docker images.
Slightly related to NOAA-OWP/ngen#837.
Note that this blocks #675, as that revision of the image can't be built for testing due to the problem being fixed here.
Additions

- New environment config variable for numpy constraint, with default value and comments in example.env

Removals

- `ARG` `MIN_NUMPY`, for clarity

Changes

- New Dockerfile `ARG` for numpy constraint, applied appropriately where numpy gets installed by image build
- Setting of aforementioned `ARG` in Compose build yaml configs, for all ngen-based images/services (e.g., model exec, calibration, etc.)
- `COPY` and `ENV` statements in ngen image Dockerfile to bring in new pip constraint and then set `PIP_CONSTRAINT` environment variable so that pip uses it

Testing