Materials that accompany the talk MLOps with GitHub Actions & Kubernetes
Below is a collection of GitHub Actions that we are curating or building to facilitate machine learning workflows:
- Action: ChatOps From Pull Requests: Listens to ChatOps commands in PRs and emits variables that downstream Actions can branch on.
Argo allows you to orchestrate machine learning pipelines that run on Kubernetes.
- Action: Submit Argo Workflows on GKE - leverages the gcloud CLI to authenticate to your GKE cluster and submit Argo workflows.
- Action: Submit Argo Workflows on K8s (Cloud agnostic) - requires that you supply a kubeconfig file to authenticate to your Kubernetes cluster.
- Action: Fetch runs from Weights & Biases - W&B is an experiment tracking and logging system for machine learning, and is free for open source projects.
- Action: Publish Container To The GitHub Package Registry. See this doc for more information on the GitHub Package Registry.
- Action: Publish Container To a Generic Registry
- Action: Compile, deploy and run a Kubeflow pipeline. This action allows you to instantiate Kubeflow pipelines directly from GitHub.
See this demo explaining this project and more background on what MLOps is and why it is needed.
Code review for machine learning often involves deciding whether to merge or deploy code when critical information about model performance and statistics is not readily available. This is due to the friction of including logging and statistics from model training runs in pull requests. For example, consider this excerpt from a real pull request concerning a machine learning project:
In an ideal world, the participants in the above code review should be provided with all of the context necessary to evaluate the PR, including:
- Model performance metrics and statistics
- Comparison with baselines and other models on a holdout dataset
- Verification that the metrics and statistics correspond to the code changed in the PR, by tying the results to a commit SHA.
- Data versioning
- etc.
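As a minimal sketch of the SHA check in the list above, the following Python keeps only the training runs recorded against the pull request's head commit and renders them as a Markdown table that could be posted as a PR comment. The record fields (`run_id`, `sha`, `accuracy`) and both function names are assumptions made for illustration; they are not the schema of Weights & Biases or of any Action listed above.

```python
def runs_for_commit(runs, sha):
    """Keep only the runs whose recorded commit SHA matches the given SHA."""
    return [run for run in runs if run["sha"] == sha]


def format_comparison(runs, baseline_accuracy):
    """Render a Markdown table comparing each run's accuracy with a baseline."""
    lines = [
        "| run_id | accuracy | delta vs. baseline |",
        "| --- | --- | --- |",
    ]
    for run in runs:
        delta = run["accuracy"] - baseline_accuracy
        lines.append(
            f"| {run['run_id']} | {run['accuracy']:.3f} | {delta:+.3f} |"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder records for illustration only; these are not real results.
    example_runs = [
        {"run_id": "run-a", "sha": "abc123", "accuracy": 0.90},
        {"run_id": "run-b", "sha": "def456", "accuracy": 0.80},
    ]
    matching = runs_for_commit(example_runs, "abc123")
    print(format_comparison(matching, baseline_accuracy=0.85))
```

Filtering on the SHA before formatting means a reviewer only sees numbers produced by the code under review, not by an earlier commit on the same branch.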
GitHub Actions allow you to combine pre-built CI/CD tools, or build your own, into a workflow that enables MLOps from GitHub. The example below composes the following Actions into a useful pipeline:
ChatOps → Deploy Argo ML Workflows → Weights & Biases Experiment Tracking → Deploy Model
View the demo pull request here. What is shown above is only the tip of the iceberg!
- .github/workflows/
  - chatops.yaml: This workflow file handles two scenarios: (1) executing a full model run with the command `/run-full-test`, and (2) deploying a model with the ChatOps command `/deploy <run_id>`. Note that you do not need to use ChatOps for your workflow; this was just the author's preferred way of triggering items. You can use one of the many other events that can trigger Actions. Furthermore, these ChatOps commands use a pre-defined action, machine-learning-apps/actions-chatops@master, that performs an Action by authenticating another GitHub app. The steps taken in this workflow trigger either the workflow defined in `ml-cicd.yaml` or the one defined in `deploy.yaml`.
  - ml-cicd.yaml: This workflow is triggered by the ChatOps command `/run-full-test` from events that occur in the `chatops.yaml` file. It executes the full training run of the model.
  - deploy.yaml: This workflow is triggered by the ChatOps command `/deploy <run_id>`. It fetches the model artifacts associated with the `<run_id>` from the experiment tracking system (Weights & Biases in this case) and deploys the model using Google Cloud Functions.
  - repo-dispatch.yaml: This workflow is triggered at the end of the Argo Workflow created in the step `Submit Argo Deployment` in `ml-cicd.yaml`. The terminal nodes of the Argo workflow create a repository dispatch event, which triggers this workflow.
  - see-payload.yaml & see_token.yaml: These files were used for debugging and can be safely ignored.
- /action_files: This is a collection of shell scripts and Python files that are run at various steps in the workflow files mentioned above.
- /src: These are the files that define the pre-processing and training of the model. They are copied into the appropriate Docker container images when the workflow is triggered.
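Two pieces of the flow described above can be sketched in Python: routing a ChatOps comment to the workflow it should trigger, and building the kind of request that creates a repository dispatch event. This is an illustrative sketch, not the implementation of machine-learning-apps/actions-chatops or of the Argo workflow in this repo. The function names, the returned dictionary shape, the `event_type` value, and the payload fields are all assumptions. The endpoint used is GitHub's REST API route for creating a repository dispatch event.

```python
import json
import urllib.request


def parse_chatops_command(comment_body):
    """Return the workflow to trigger for the first recognized command.

    Recognizes `/run-full-test` and `/deploy <run_id>` on a line of their own.
    Returns None when the comment contains no recognized command.
    """
    for line in comment_body.splitlines():
        parts = line.strip().split()
        if parts == ["/run-full-test"]:
            return {"workflow": "ml-cicd.yaml", "run_id": None}
        if len(parts) == 2 and parts[0] == "/deploy":
            return {"workflow": "deploy.yaml", "run_id": parts[1]}
    return None


def build_dispatch_request(owner, repo, token, event_type, client_payload):
    """Build (but do not send) a request that creates a repository dispatch event.

    `event_type` and `client_payload` are chosen by the caller; the values used
    with this function in examples are hypothetical.
    """
    url = f"https://api.github.com/repos/{owner}/{repo}/dispatches"
    body = json.dumps(
        {"event_type": event_type, "client_payload": client_payload}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )
```

Passing the returned request to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the payload easy to inspect before a real token is involved.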
The example in this repo is end-to-end and requires familiarity with Kubernetes and GitHub Actions to fully understand. When starting out, we recommend automating one part of your workflow, such as deploying models. As you learn more about the syntax of GitHub Actions you can increase the scope of your workflow as appropriate.
We also encourage you to build and share GitHub Actions that accommodate other tools.
For any questions, please open an issue in this repo.