Containerise mx bluesky (#187)
* Changes to containerise hyperion and enable deployment to kubernetes and podman
* New build_docker_image.sh script to build an image on the cli
* New run_in_podman.sh script to run the image from podman
* Github CI workflow to build container image on release and manual execution and push to GHCR registry
* Helm charts to deploy container images to kubernetes
* Hyperion now has --version option to report the current version
* The current version is now set automatically on pip install

* Ensure the appVersion is set in the helmchart. Update documentation. Allow production to mount source folders.

* Update deployment instructions, deploy_hyperion.sh for use with k8s

* Fix healthcheck
Add editable dodal to image and helmcharts
Allow the appversion to be specified at deployment
Allow existing helmcharts to be upgraded

* Make the docker image smaller

* Fix version name mangling

* Fix unit tests

* Rationalise dockerfiles, hyperion deployment documentation

* Rename utility scripts, enhance documentation, help

* Change to docker image versioning strategy

* Use opencv-python-headless to avoid dependencies on libGL, desktop etc.
Optimise the dockerfile so it takes less time to build when iterating deployment script
Extract the image version from when the image is built rather than from the parent workspace
Try to make sure the image is from clean rather than dirty workspace

* Remove dev environment from ci

* Reinstate old Dockerfile and rename release one

* Make docs and yaml linters happy

* Make yaml linter even more happy

* Update the runAsUser/runAsGroup to match the new i03-hyperion user

* Add ingress and external DNS

* Changes responding to PR comments:

* Allow service and container ports to be configurable (in yaml at least)
* Minor fixes to documentation
* Integrate running of the deploy_hyperion.py script into deploy_hyperion_to_k8s.sh
* Sanity check for checked out vs image version in deployment
* Minor change to deploy_hyperion.py to be able to get the install folder
* By default the deploy_hyperion_to_k8s.sh will now log into the k8s cluster
* Fix ghcr login case issue

* Fix issues with deployment script

* Additional usage checks for run_hyperion_in_podman.sh

* Fix deploy script when login is enabled
rtuck99 committed Sep 10, 2024
1 parent 08e69ef commit 8c7a0f9
Showing 24 changed files with 1,022 additions and 52 deletions.
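
The commit message mentions a new ``--version`` option whose value is set automatically on pip install. The following is a minimal sketch of how such an option can be wired up with the standard library; it is not taken from the hyperion source. The function names, the ``prog`` value and the ``"mx-bluesky"`` distribution name are assumptions made for illustration.

```python
import argparse
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str = "mx-bluesky") -> str:
    """Return the installed version of a distribution, or "unknown" if absent.

    The default distribution name is an assumption made for this sketch.
    """
    try:
        return version(distribution)
    except PackageNotFoundError:
        return "unknown"


def build_parser(current_version: str) -> argparse.ArgumentParser:
    """Build a parser whose --version flag prints the version and exits."""
    parser = argparse.ArgumentParser(prog="hyperion")
    parser.add_argument(
        "--version",
        action="version",
        version=f"%(prog)s {current_version}",
    )
    return parser
```

Calling ``build_parser(installed_version()).parse_args(["--version"])`` prints the program name and version, then exits with status 0. Reading the version from package metadata means it only has to be set once, at install time.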
12 changes: 12 additions & 0 deletions .dockerignore
@@ -0,0 +1,12 @@
# List of folders and files to be excluded from the docker image
.devcontainer
.github
.pytest_cache
.ruff_cache
**/__pycache__/

# virtualenv stuff - this gets built by the docker script
.venv
activate

tmp
4 changes: 0 additions & 4 deletions .github/workflows/ci.yml
@@ -22,10 +22,6 @@ jobs:
       matrix:
         runs-on: ["ubuntu-latest"] # can add windows-latest, macos-latest
         python-version: ["3.11"] # , "3.12"] # add 3.12 when p4p #145 is fixed
-        include:
-          # Include one that runs in the dev environment
-          - runs-on: "ubuntu-latest"
-            python-version: "dev"
       fail-fast: false
     uses: ./.github/workflows/_test.yml
     with:
47 changes: 47 additions & 0 deletions .github/workflows/publish_docker_image.yml
@@ -0,0 +1,47 @@
name: Publish Docker Image
on:
  release:
    types: [published]
  # Allow the workflow to be triggered manually
  workflow_dispatch:
env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}
jobs:
  build_and_push_image:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - name: Checkout
        # v4.1.7
        uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332
      - name: Log in to GHCR
        # v3.3.0
        uses: docker/login-action@9780b0c442fbb1117ed29e0efdff1e18412f7567
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata (tags, labels) for Docker
        id: meta
        # v5.5.1
        uses: docker/metadata-action@8e5442c4ef9f78752691e2d8f8d19755c6f78e81
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
      - name: Build and push Docker image
        # The id lets the attestation step below resolve steps.push.outputs.digest
        id: push
        # v6.5.0
        uses: docker/build-push-action@5176d81f87c23d6fc96624dfdbcd9f3830bbe445
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
      - name: Generate artifact attestation
        # v1.4.0
        uses: actions/attest-build-provenance@210c1913531870065f03ce1f9440dd87bc0938cd
        with:
          subject-name: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME}}
          subject-digest: ${{ steps.push.outputs.digest }}
          push-to-registry: true
1 change: 1 addition & 0 deletions .pre-commit-config.yaml
@@ -6,6 +6,7 @@ repos:
args: ["--maxkb=500"]
- id: check-yaml
args: ["--allow-multiple-documents"]
exclude: ^helmchart/
- id: check-merge-conflict
- id: end-of-file-fixer
- id: no-commit-to-branch
38 changes: 38 additions & 0 deletions Dockerfile.release
@@ -0,0 +1,38 @@
FROM python:3.11 AS build

RUN pip install setuptools_scm

# Copy the pyproject.toml and install dependencies for better caching when developing
# & rerunning deployment scripts
COPY pyproject.toml /app/hyperion/
WORKDIR "/app/hyperion"
RUN mkdir -p src/mx_bluesky

# This enables us to cache the pip install without needing _version.py
# see https://setuptools-scm.readthedocs.io/en/latest/usage/
RUN SETUPTOOLS_SCM_PRETEND_VERSION_FOR_MX_BLUESKY=1.0.0 pip install \
--no-cache-dir --no-compile -e .

# Check out and install dodal locally with no dependencies as this may be a different version to what
# is referred to in the setup.cfg, but we don't care as it will be overridden by bind mounts in the
# running container
RUN mkdir ../dodal && \
git clone https://github.com/DiamondLightSource/dodal.git ../dodal && \
pip install --no-cache-dir --no-compile --no-deps -e ../dodal

#
# Everything above this line should be in the image cache unless pyproject.toml changes
#
ADD .git /app/hyperion/.git
# Restore the repository at the current commit instead of copying, to exclude uncommitted changes
# This is so that if you build a developer image from this dockerfile then _version.py will not
# append the dirty workdir hash, which causes complications during deployments that mount from a clean folder.
RUN git restore .

# Regenerate _version.py with the correct version - this should run quickly since we already have our dependencies
RUN rm src/mx_bluesky/_version.py
RUN pip install --no-cache-dir --no-compile -e .

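# Exec form, so that arguments given at run time (e.g. --version) are passed to the script;
# the shell form ignores them
ENTRYPOINT ["/app/hyperion/utility_scripts/docker/entrypoint.sh"]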

EXPOSE 5005
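
The Dockerfile comments above explain that the image is restored from the commit so that ``_version.py`` does not pick up a dirty-workdir suffix. As a rough illustration of what "dirty" looks like in a version string, here is a small sketch. It assumes setuptools_scm's default ``node-and-date`` local scheme, which adds a ``dYYYYMMDD`` segment to the local part when there are uncommitted changes; the function name is hypothetical and is not part of this commit.

```python
import re

# Under the assumed default local scheme, a dirty workdir contributes a
# segment of the form dYYYYMMDD to the part of the version after "+".
_DIRTY_SEGMENT = re.compile(r"d\d{8}")


def is_dirty_version(version: str) -> bool:
    """Return True if the local part of the version has a dirty-date segment."""
    _, _, local = version.partition("+")
    return any(_DIRTY_SEGMENT.fullmatch(segment) for segment in local.split("."))
```

For example, ``is_dirty_version("1.0.1.dev3+g8c7a0f9.d20240910")`` is true, while ``is_dirty_version("1.0.1.dev3+g8c7a0f9")`` is false. A check like this could be used to refuse to publish an image built from a workspace with uncommitted changes.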
109 changes: 109 additions & 0 deletions docs/developer/hyperion/deploying-hyperion.rst
@@ -0,0 +1,109 @@
Building a deployable Docker image
==================================

Release builds of container images should be built by the GitHub CI on release; ad-hoc builds can be performed by
manually invoking the Publish Docker Image workflow.

Development builds of container images can be made by running the ``utility_scripts/build_docker_image.sh`` script.
By default it will both build and push the image unless you specify ``--no-build`` or ``--no-push``. To push an image
you will first need to create a GH personal access token and then log in with podman as described below.

Pushing the docker image
------------------------

Obtaining a GitHub access token
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You will need a GitHub personal access token (classic), not the new fine-grained token.
It needs the permission scopes detailed in the `ghcr documentation <https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry#authenticating-with-a-personal-access-token-classic>`_.

Building the image
~~~~~~~~~~~~~~~~~~

If building a test image, the image should be pushed to your personal GH account. First log in to ``ghcr.io`` with podman:

::

    cat <mysecretfile> | podman login ghcr.io --username <your gh login> --password-stdin

where ``mysecretfile`` contains your personal access token.

Then run the ``build_docker_image.sh`` script.

Troubleshooting
~~~~~~~~~~~~~~~

If ``podman build .`` fails with the error message ``io: read/write on closed pipe``, you may be running out of disk
space. Try setting the ``TMPDIR`` environment variable to a directory with more free space.

See https://github.com/containers/podman/issues/22342

Building image on ubuntu
~~~~~~~~~~~~~~~~~~~~~~~~

If you run into issues such as

::

    potentially insufficient UIDs or GIDs available in user namespace (requested 0:42 for /etc/gshadow): Check /etc/subuid and /etc/subgid: lchown /etc/gshadow: invalid argument

then try the following:

* Ensure ``newuidmap`` is installed

::

    sudo apt-get install uidmap

* Add appropriate entries to ``/etc/subuid`` and ``/etc/subgid`` e.g.

::

    # subuid/subgid file
    myuser:10000000:65536

* Kill any existing podman processes and retry

For further information, see https://github.com/containers/podman/issues/2542


Deploying to kubernetes
-----------------------

Once the docker image is built, it can be deployed to kubernetes using the ``deploy_hyperion_to_k8s.sh`` script.

Production deployment
~~~~~~~~~~~~~~~~~~~~~

Create and deploy the helm release:

::

    ./utility_scripts/deploy/deploy_hyperion_to_k8s.sh --beamline=<beamline> --checkout-to-prod hyperion

This will run the ``deploy_hyperion.py`` script to deploy the latest hyperion to ``/dls_sw``.
You will be prompted to log into the beamline cluster, then it will create a helm release "hyperion".
The source folders will be mounted as bind mounts to allow the pod to pick up changes in production.
For production these are expected to be in the normal place defined in ``values.yaml``.

Development deployment
~~~~~~~~~~~~~~~~~~~~~~

From a development ``hyperion`` workspace, either with a release image or using a development image built with the
script above, you can install a dev deployment to the cluster you are currently logged into with ``kubectl``:

::

    ./utility_scripts/deploy/deploy_hyperion_to_k8s.sh --dev --beamline=<beamline> --repository=<your image repo> hyperion-test


The dev deployment bind-mounts the current ``hyperion`` workspace and ``../dodal`` into the container so that you can
run against your own development code. **Clusters do not allow bind mounts from arbitrary directories so
your workspace will have to be in a permitted directory such as your home directory.**

By default the script will log into the ``argus`` cluster. To deploy to an alternate cluster, log in to it,
select the namespace with ``kubectl config set-context --current --namespace=<NAMESPACE>``, and then specify
``--no-login`` when running the script.

Note that the deployment script is intended to be run from a checkout of the git repository whose version matches
the image being deployed.

After a successful install, ``helm list`` should show details of the installed release.
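
The commit message mentions a sanity check of the checked-out version against the image version at deployment. The sketch below shows one way such a check could look. It is not the logic of ``deploy_hyperion_to_k8s.sh``; the function name and the normalisation (stripping whitespace and a leading ``v``) are assumptions made for illustration.

```python
def check_versions_match(checkout_version: str, image_version: str) -> None:
    """Raise ValueError if the checked-out and image versions differ.

    Both inputs are normalised by stripping whitespace and a leading "v",
    which is an assumption made for this sketch.
    """

    def normalise(raw: str) -> str:
        return raw.strip().removeprefix("v")

    checkout = normalise(checkout_version)
    image = normalise(image_version)
    if checkout != image:
        raise ValueError(
            f"Checked-out version {checkout!r} does not match image version {image!r}"
        )
```

Failing early here avoids a deployment in which the bind-mounted source tree and the dependencies baked into the image come from different versions.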
1 change: 1 addition & 0 deletions docs/developer/hyperion/index.rst
@@ -14,6 +14,7 @@ Documentation is split into four categories, and each is also accessible from li

reference/param-hierarchy
reference/readme
deploying-hyperion

+++

2 changes: 1 addition & 1 deletion docs/user/how-to/run-container.rst
@@ -10,6 +10,6 @@ Starting the container

To pull the container from github container registry and run::

-    $ docker run ghcr.io/DiamondLightSource/mx-bluesky:main --version
+    $ docker run ghcr.io/diamondlightsource/mx-bluesky:main --version

To get a released version, use a numbered release instead of ``main``.
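
The change above lowercases the organisation name because container image repository names must be lowercase, even though the GitHub organisation itself is written in mixed case. A minimal sketch of building a valid reference from a GitHub ``owner/repo`` string follows; the function name is hypothetical and is not part of this commit.

```python
def ghcr_image_ref(repository: str, tag: str, registry: str = "ghcr.io") -> str:
    """Build an image reference, lowercasing the repository part.

    `repository` is a GitHub "owner/repo" string, which may contain upper case.
    """
    return f"{registry}/{repository.lower()}:{tag}"
```

For example, ``ghcr_image_ref("DiamondLightSource/mx-bluesky", "main")`` gives ``ghcr.io/diamondlightsource/mx-bluesky:main``.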
6 changes: 6 additions & 0 deletions helmchart/Chart.yaml
@@ -0,0 +1,6 @@
apiVersion: v2
name: hyperion
description: Hyperion server
type: application
# version of the chart
version: 0.0.1
111 changes: 111 additions & 0 deletions helmchart/templates/deployment.yaml
@@ -0,0 +1,111 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hyperion-deployment
spec:
  selector:
    matchLabels:
      app: hyperion
  replicas: 1
  template:
    metadata:
      labels:
        app: hyperion
    spec:
      securityContext:
        # gda2
        runAsUser: {{ .Values.hyperion.runAsUser }}
        runAsGroup: {{ .Values.hyperion.runAsGroup }}
        supplementalGroups: {{ .Values.hyperion.supplementalGroups }}
      volumes:
        - name: dls-sw-bl
          hostPath:
            path: "/dls_sw/{{ .Values.hyperion.beamline }}"
            type: Directory
        - name: dls-sw-apps
          hostPath:
            path: "/dls_sw/apps"
            type: Directory
        - name: dls-sw-dasc
          hostPath:
            path: "/dls_sw/dasc"
            type: Directory
        # Bind some source folders for easier debugging
        - name: src
          hostPath:
            path: "{{ .Values.hyperion.projectDir }}/src"
            type: Directory
        - name: tests
          hostPath:
            path: "{{ .Values.hyperion.projectDir }}/tests"
            type: Directory
        - name: utility-scripts
          hostPath:
            path: "{{ .Values.hyperion.projectDir }}/utility_scripts"
            type: Directory
        - name: dodal
          hostPath:
            path: "{{ .Values.dodal.projectDir | clean }}"
            type: Directory
        {{- if .Values.hyperion.dev }}
        - name: devlogs
          hostPath:
            path: "{{ .Values.hyperion.projectDir }}/tmp"
            type: Directory
        {{- end }}
      containers:
        - name: hyperion
          image: {{ .Values.hyperion.imageRepository}}/hyperion:{{ .Values.hyperion.appVersion }}
          resources:
            limits:
              cpu: "1"
              memory: "1Gi"
          ports:
            - name: hyperion-api
              containerPort: {{ .Values.hyperion.containerPort }}
              protocol: TCP
          env:
            - name: HYPERION_LOG_DIR
              value: {{ .Values.hyperion.logDir }}
            - name: BEAMLINE
              value: "{{ .Values.hyperion.beamline }}"
            {{- if not .Values.hyperion.dev }}
            - name: ZOCALO_GO_USER
              value: "gda2"
            - name: ZOCALO_GO_HOSTNAME
              value: "{{ .Values.hyperion.beamline }}-control"
            - name: ZOCALO_CONFIG
              value: "/dls_sw/apps/zocalo/live/configuration.yaml"
            - name: ISPYB_CONFIG_PATH
              value: "/dls_sw/dasc/mariadb/credentials/ispyb-hyperion-{{ .Values.hyperion.beamline }}.cfg"
          args: [ "--external-callbacks" ]
          {{- end }}
          readinessProbe:
            exec:
              command: [ "/app/hyperion/utility_scripts/docker/healthcheck.sh" ]
            periodSeconds: 5
          volumeMounts:
            - mountPath: "/dls_sw/{{ .Values.hyperion.beamline }}"
              name: dls-sw-bl
              readOnly: true
              mountPropagation: HostToContainer
            - mountPath: "/dls_sw/apps"
              name: dls-sw-apps
              readOnly: true
              mountPropagation: HostToContainer
            - mountPath: "/dls_sw/dasc"
              name: dls-sw-dasc
              readOnly: true
              mountPropagation: HostToContainer
            - mountPath: "/app/hyperion/src"
              name: src
            - mountPath: "/app/hyperion/tests"
              name: tests
            - mountPath: "/app/hyperion/utility_scripts"
              name: utility-scripts
            - mountPath: "/app/dodal"
              name: dodal
            {{- if .Values.hyperion.dev }}
            - mountPath: "/app/hyperion/tmp"
              name: devlogs
            {{ end }}
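
The ``dodal`` volume above pipes its path through the template function ``clean``, which normalises a path by collapsing ``..`` and redundant separators. This is presumably there because the dodal directory may be given relative to the hyperion workspace (the documentation refers to ``../dodal``). The sketch below shows roughly the same normalisation in Python using ``posixpath.normpath``; the function name and example paths are assumptions, and edge cases such as a leading double slash are handled differently by the two functions.

```python
import posixpath


def clean_host_path(path: str) -> str:
    """Collapse '..', '.' and repeated or trailing slashes in a POSIX path."""
    return posixpath.normpath(path)
```

For example, ``clean_host_path("/home/user/hyperion/../dodal")`` gives ``/home/user/dodal``, which is the form a ``hostPath`` volume of ``type: Directory`` would then be checked against.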
23 changes: 23 additions & 0 deletions helmchart/templates/ingress.yaml
@@ -0,0 +1,23 @@
{{- if not .Values.hyperion.dev }}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hyperion-ingress
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - {{ .Values.hyperion.externalHostname }}
  rules:
    - host: {{ .Values.hyperion.externalHostname }}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: hyperion-svc # this must match the name of the service you want to target
                port:
                  number: {{ .Values.hyperion.containerPort }}
{{- end }}
