Replies: 5 comments 1 reply
-
Can I co-opt this thread to talk about current issues with the conda install (including mamba)? If not, let me know and I'll start a new thread. Both Rich Neale @swrneale and I are still getting 30-minute build times for the conda environments. @wrongkindofdoctor asked for the output, so I'm attaching it here. Note that I'm running on NCAR Linux boxes, and Rich has found the same thing on our supercomputers.
-
Is it possible that the filesystem you are using is the bottleneck?
Building an anaconda environment is very metadata intensive. When we build
environments on our shared BlueArc network filesystem, we can see lengthy
build times similar to this. I encountered a build this week (different
project) that took 4 hours. Building the same environment on a local disk
(i.e. not a network drive) takes about 5 minutes.
Do you have a non network-mounted filesystem you can try?
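
One rough way to check whether the filesystem is the bottleneck is to time a burst of small-file operations in each candidate directory. The sketch below is only an approximation of the metadata load a conda build generates; the function name `time_metadata_ops` and the file count are illustrative choices, not part of any MDTF or conda tooling.

```python
import os
import tempfile
import time


def time_metadata_ops(directory, n_files=200):
    """Return the seconds taken to create, stat, and delete n_files tiny files.

    The files live in a temporary subdirectory of `directory`, which is
    removed again when the measurement finishes.
    """
    start = time.perf_counter()
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        paths = [os.path.join(scratch, f"f{i:05d}") for i in range(n_files)]
        for path in paths:
            # Creating a file touches directory and inode metadata.
            with open(path, "w") as handle:
                handle.write("x")
        for path in paths:
            os.stat(path)
        for path in paths:
            os.remove(path)
    return time.perf_counter() - start
```

Calling `time_metadata_ops` once on a directory under the network mount and once on a local scratch directory gives two numbers to compare. A large ratio between them would point at the filesystem rather than at the conda solver.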
-
I have had some preliminary success building Docker and (finally) Singularity containers with the conda environments and mdtf binary pre-installed, then running them interactively mounted to the inputdata, wkdir, and src/diagnostics directories, as well as the src/default_tests.jsonc file. Assuming that we and the user community have long-term access to the necessary resources to use containers (namely Singularity and/or Docker installed on HPC and/or cloud systems of choice), this could solve the issues of users having to rebuild environments each time they update their local MDTF-diagnostics repos. Instead, they just pull the container from the repository, or build it from the definitions file, mount the container, and run whatever POD(s) they need. Some caveats include:
-
Thanks Jess, looks like you've put a lot of work into this. I have no
experience with containers but our modeling team has put them together to
run CESM tutorials, and I think Andrew Gettelman was
particularly involved so I can learn from his experience.
As to your caveats:
- do users have to build, or can we just pull and run?
- I can see that we do have Singularity installed on our supercomputer,
although I think our team has used Docker before, so we might be able to
use either.
- As MDTF is updated and the conda envs are changed, will the container
also need to be updated and changed, or once it is installed will it be
able to access updates without the user doing much?
Thanks again for looking into this!
On Fri, Sep 24, 2021 at 9:17 AM Jess wrote:
@bitterbark <https://github.com/bitterbark>
I have had some preliminary success building Docker
<https://github.com/wrongkindofdoctor/MDTF-diagnostics/blob/add_docker_image/Dockerfile>
and (finally) Singularity
<https://github.com/wrongkindofdoctor/MDTF-diagnostics/blob/add_docker_image/Singularity>
containers with the conda environments and mdtf binary pre-installed, then
running them interactively mounted to the inputdata, wkdir, and
src/diagnostics directories, as well as the src/default_tests.jsonc file.
Assuming that we and the user community have long-term access to the
necessary resources to use containers (namely Singularity and/or Docker
installed on HPC and/or cloud systems of choice), this could solve the
issues of users having to rebuild environments each time they update their
local MDTF-diagnostics repos. Instead, they just pull the container from
the repository, or build it from the definitions file, mount the container,
and run whatever POD(s) they need.
Some caveats include:
- Users having to learn how to use containers. In my experience,
pulling and running containers is straightforward, while building containers
is a bit trickier (though the fact that we now have build files would make
it easier for users to work from). We (me + other container-savvy
community members) can host tutorials, etc., as needed.
- Lack of access to Docker and/or Singularity on institutional
machines. GFDL won't host Docker due to potential security issues, but it
is pretty painless to install on a personal machine if needed. We do have
access to Singularity in-house and on the cloud platforms we are testing. I
imagine that NCAR and other larger labs would be open to centrally
installing Singularity given a large enough demand (and containers are a
big deal in the scientific community right now, so there's probably some
external encouragement to get them implemented). I can't say the same for
academic institutions, but I would not be surprised if many already host
container software on their HPC systems, and my prior experience suggests
that system admins would be open to installing any if requested (or helping
users install them on their work machines).
- Large containers taking a long time to install. Including anaconda
and all supported POD environments creates a large image. This could be
reduced by providing separate containers for each POD (which will be
necessary in the long run, as the growing number of supported PODs will become
prohibitively expensive to install/run in one go), though this is more
cumbersome to support. As we start the next phase, we may want to add a
requirement for new POD developers to maintain containers for their POD (or
at least provide container definition files).
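
As a rough illustration of the pull-and-run workflow described above, the sketch below assembles a `singularity exec` command with one `--bind` per mounted directory. The helper name, the image filename, and every container-side path are hypothetical placeholders; the actual mount points depend on how the image in the linked definition file is laid out.

```python
def singularity_exec_command(image, binds, command):
    """Build an argv list for `singularity exec` with host:container bind mounts.

    `binds` maps host paths to paths inside the container. Nothing is
    executed here; the list can be handed to subprocess.run when ready.
    """
    argv = ["singularity", "exec"]
    for host_path, container_path in binds.items():
        argv += ["--bind", f"{host_path}:{container_path}"]
    argv.append(image)
    argv += list(command)
    return argv


# Hypothetical layout: the image name, host paths, and container paths
# below are assumptions made for illustration only.
example = singularity_exec_command(
    "mdtf.sif",
    {
        "/scratch/me/inputdata": "/proj/inputdata",
        "/scratch/me/wkdir": "/proj/wkdir",
    },
    ["mdtf", "-f", "/proj/src/default_tests.jsonc"],
)
```

Building the command as a list keeps the bind mounts explicit and avoids shell-quoting problems when host paths contain spaces.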
-
Greetings. Interesting idea. I think containers are adding some more complexity, and that is a risk. It's yet another layer for the user to deal with. I also worry about installing containers and having enough memory available to run diagnostic packages. My two cents.
-
I've been playing around with metapackages to help with the installation. @wrongkindofdoctor's suggestion to embrace `mamba` is a game-changer and we could probably find a pathway to simplifying the installation steps.
I created a single, unifying MDTF base environment that includes Python 3.8, NCL, and R and defined a meta.yaml. This environment takes forever to resolve using the traditional conda solver, which is why Issue #23 is stale. However, it solved in a minute or so with `mamba`.
Once the metapackage is built, the installation of the complete environment would become a one-liner.
This would not, however, eliminate the need for the multi-environment functionality that @tsjackson-noaa wrote. At some point, NCL is no longer going to be updated, so we will need a frozen version of the base environment plus a more actively evolving Python-based environment.
Lots of details to consider still, but presented here as some food for thought.
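
For anyone who wants to reproduce the solver comparison on their own system, here is a minimal sketch that times a dry-run solve with either tool. It assumes `conda` and `mamba` both accept `create --dry-run --name ... --channel ...`. The helper names, the environment name, and the package list are illustrative and are not the actual metapackage definition.

```python
import shutil
import subprocess
import time


def dry_run_command(solver, env_name, packages, channel="conda-forge"):
    """Build a `create --dry-run` argv: solve the environment without installing it."""
    return [solver, "create", "--dry-run", "--name", env_name,
            "--channel", channel, *packages]


def time_solve(solver, env_name, packages):
    """Return the dry-run time in seconds, or None if the solver is not on PATH."""
    if shutil.which(solver) is None:
        return None
    start = time.perf_counter()
    # check=True raises CalledProcessError if the solve itself fails.
    subprocess.run(dry_run_command(solver, env_name, packages), check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start
```

For example, `time_solve("mamba", "mdtf_base_test", ["python=3.8", "ncl", "r-base"])` and the same call with `"conda"` give two timings to compare. The measured time includes fetching the channel index, so it is worth running each solver twice and comparing the second runs.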