Version bump 0.1.0 → 0.1.1

- cleaned up `README.md`
spillai committed Oct 9, 2023
1 parent 63d8481 commit 4e6adaa
Showing 3 changed files with 89 additions and 90 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/ci.yml
@@ -12,11 +12,11 @@ jobs:
runs-on: ${{ matrix.os }}
timeout-minutes: 20
strategy:
max-parallel: 3
max-parallel: 2
fail-fast: true
matrix:
os: ["ubuntu-latest"]
python-version: ["3.8", "3.9", "3.10"]
python-version: ["3.7", "3.8", "3.9", "3.10"]
defaults:
run:
shell: bash -el {0}
171 changes: 85 additions & 86 deletions README.md
@@ -16,66 +16,35 @@

</p>

📦 **`agi-pack`** allows you to define your Docker images using a simple YAML format, and then generate them on-the-fly using Jinja2 templates. It's a simple tool that aims to simplify the process of building Docker images for ML.



## Features ✨

- **Simple Configuration**: Define your Docker images using a straightforward YAML format.
- **Dynamic Generation**: Use the power of Jinja2 templating to create Dockerfiles on-the-fly.
- **Sequential and Multi-stage Builds**: Define re-usable and production-ready `base` images and build dependent images for `dev`, `prod`, `test`.
- **Extensible**: Easily extend and adapt to more complex scenarios.



## Installation 📦

```bash
pip install agi-pack
```

For shell completion, you can install them via:
```bash
agi-pack --install-completion <bash|zsh|fish|powershell|pwsh>
```

## Quickstart 🛠

1. Create a simple YAML configuration file called `agibuild.yaml`. You can use `agi-pack init` to generate a sample configuration file.

```bash
agi-pack init
```

2. Edit `agibuild.yaml` to define your custom system and python packages.

```yaml
images:
  base-sklearn:
    image: <repo>/agi:latest-base-sklearn
    base: debian:buster-slim
    system:
    - wget
    - build-essential
    python: 3.8.10
    pip:
    - scikit-learn
```

Let's break this down:
- `base-sklearn`: name of the target you want to build. Usually, these could be variants like `base-*`, `dev-*`, `prod-*`, `test-*` etc.
- `base`: base image to build from.
- `system`: system packages to install via `apt-get install`.
- `python`: specific python version to install via `miniconda`.
- `pip`: python packages to install via `pip install`.
3. Generate the Dockerfile using `agi-pack generate`.

```bash
$ agi-pack generate -c agibuild.yaml
📦 base-sklearn
└── 🎉 Successfully generated Dockerfile (target=base-sklearn, filename=Dockerfile).
└── `docker build -f Dockerfile --target base-sklearn .`
```
That's it! Use the generated `Dockerfile` to run `docker build` and build the image directly.
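Under the hood, the YAML-to-Dockerfile step is essentially a Jinja2 render over the parsed config. Here's a minimal sketch of that idea, assuming `pyyaml` and `jinja2` are installed -- the template and helper function below are illustrative, not agi-pack's actual implementation:

```python
# Minimal sketch of the idea behind `agi-pack generate`: parse the YAML
# config and render one Dockerfile stage per target with Jinja2.
# The template and field handling are illustrative, not agi-pack's real code.
import yaml
from jinja2 import Template

STAGE_TEMPLATE = Template(
    "FROM {{ base }} AS {{ target }}\n"
    "{% if system %}"
    "RUN apt-get update && apt-get install -y {{ system | join(' ') }}\n"
    "{% endif %}"
    "{% if pip %}"
    "RUN pip install {{ pip | join(' ') }}\n"
    "{% endif %}"
)

def render_dockerfile(config_yaml: str) -> str:
    """Render a single multi-stage Dockerfile from an agibuild.yaml-style config."""
    config = yaml.safe_load(config_yaml)
    stages = []
    for target, spec in config["images"].items():
        stages.append(
            STAGE_TEMPLATE.render(
                target=target,
                base=spec["base"],
                system=spec.get("system", []),
                pip=spec.get("pip", []),
            )
        )
    return "\n".join(stages)

config = """
images:
  base-sklearn:
    base: debian:buster-slim
    system: [wget, build-essential]
    pip: [scikit-learn]
"""
print(render_dockerfile(config))
```

Each image target becomes one `FROM ... AS <target>` stage, which is what makes the single-Dockerfile, multi-target `docker build --target` workflow possible.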

## Goals 🎯

- **Simplicity**: Make it easy to define and build docker images for ML.
- **Best-practices**: Bring best-practices to building docker images for ML -- good base images, multi-stage builds, minimal image sizes, etc.
- **Modular, Re-usable, Composable**: Define `base`, `dev` and `prod` targets with multi-stage builds, and re-use them wherever possible.
- **Extensible**: Make the YAML / DSL extensible to support the ML ecosystem, as more libraries, drivers, HW vendors, come into the market.
- **Vendor-agnostic**: `agi-pack` is not intended to be built for any specific vendor -- I need this tool for internal purposes, but I decided to build it in the open and keep it simple.

## Rationale 🤔

Docker has become the standard for building and managing isolated environments for ML. However, anyone who has gone down this rabbit-hole knows how broken ML development is, especially when you need to experiment and re-configure your environments constantly. Production is another nightmare -- large (`10GB+`) docker images bloated with model weights that are `~5-10GB` in size, 10+ minute docker build times, and sloppy package management, to name just a few.

**What makes Dockerfiles painful?** If you've ever tried to roll your own Dockerfiles with all the best-practices while fully understanding their internals, you'll still find yourself building, re-building, and re-building these images across a whole host of use-cases. Building Dockerfile(s) for `dev`, `prod`, and `test` turns into a nightmare once you add the complexity of hardware targets (CPUs, GPUs, TPUs, etc.), drivers, python versions, virtual environments, and build and runtime dependencies.

**`agi-pack`** aims to simplify this by allowing developers to define Dockerfiles in a concise YAML format and then generate them based on your environment needs (python version, system packages, conda/pip dependencies, GPU drivers, etc.).

For example, you should be able to easily configure your `dev` environment for local development, and have a separate `prod` environment where you'll only need the runtime dependencies avoiding any bloat.

`agi-pack` also hopes to standardize the base images, so that we can truly build on the shoulders of giants.
## More Complex Example 📚

Now imagine you want to build a more complex image that has multiple stages: a `base` image with all the basic dependencies, and a `dev` image with additional build-time dependencies.
```yaml
images:
  base-cpu:
    name: agi
    base: debian:buster-slim
    system:
    - wget
    python: 3.8.10
    pip:
    - scikit-learn
    run:
    - echo "Hello, world!"
  dev-cpu:
    base: base-cpu
    system:
    - build-essential
```
Once you've defined this `agibuild.yaml`, running `agi-pack generate` will generate the following output:

```bash
$ agi-pack generate -c agibuild.yaml
📦 base-cpu
└── 🎉 Successfully generated Dockerfile (target=base-cpu, filename=Dockerfile).
└── `docker build -f Dockerfile --target base-cpu .`
📦 dev-cpu
└── 🎉 Successfully generated Dockerfile (target=dev-cpu, filename=Dockerfile).
└── `docker build -f Dockerfile --target dev-cpu .`
```

As you can see, `agi-pack` will generate a **single** Dockerfile containing a build stage for each of the targets defined in the YAML file. You can then build the individual images from the same Dockerfile using docker targets: `docker build -f Dockerfile --target <target> .` where `<target>` is the name of the image target you want to build.

Here's the corresponding [`Dockerfile`](./examples/generated/Dockerfile-multistage-example) that was generated.
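As a rough illustration of its shape (simplified -- the linked file is the actual output, which also handles the miniconda python setup), the multi-stage structure looks something like:

```dockerfile
# Simplified illustration only -- see the linked generated Dockerfile
# for the real output.
FROM debian:buster-slim AS base-cpu
RUN apt-get update && apt-get install -y wget
RUN echo "Hello, world!"

FROM base-cpu AS dev-cpu
RUN apt-get update && apt-get install -y build-essential
```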
## Why the name? 🤷‍♂️

`agi-pack` is very much intended to be tongue-in-cheek -- we are soon going to be living in a world full of quasi-AGI agents orchestrated via ML containers. At the very least, `agi-pack` should provide the building blocks for us to build a more modular, re-usable, and distribution-friendly container format for "AGI".

## Inspiration and Attribution 🌟

> **TL;DR** `agi-pack` was inspired by a combination of [Replicate's `cog`](https://github.com/replicate/cog), [Baseten's `truss`](https://github.com/basetenlabs/truss/), [skaffold](https://skaffold.dev/), and [Docker Compose Services](https://docs.docker.com/compose/compose-file/05-services/). I wanted a standalone project without any added cruft/dependencies of vendors and services.

📦 **`agi-pack`** is simply a weekend project I hacked together, that started with a conversation with [ChatGPT / GPT-4](#chatgpt-prompt).

🚨 **Disclaimer:** More than **75%** of this initial implementation was generated by GPT-4 and [Github Co-Pilot](https://github.com/features/copilot).

### ChatGPT Prompt

> **Prompt:** I'm building a Dockerfile generator and builder to simplify machine learning infrastructure. I'd like for the Dockerfile to be dynamically generated (using Jinja templates) with the following parametrizations:

```yaml
# Sample YAML file
images:
  base-gpu:
    base: nvidia/cuda:11.8.0-base-ubuntu22.04
    system:
    - gnupg2
    - build-essential
    - git
    python: 3.8.10
    pip:
    - torch==2.0.1
```

> I'd like for this yaml file to generate a Dockerfile via `agi-pack generate -c <name>.yaml`. You are an expert in Docker and Python programming, how would I implement this builder in Python. Use Jinja2 templating and miniconda python environments wherever possible. I'd like an elegant and concise implementation that I can share on PyPI.
## Contributing 🤝
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -4,11 +4,11 @@ build-backend = "setuptools.build_meta"

[project]
name = "agi-pack"
version = "0.1.0"
version = "0.1.1"
description = "Dockerfile generator for AGI -- nothing more, nothing less."
license = {file = "LICENSE"}
readme = "README.md"
requires-python = ">=3.7.10"
requires-python = ">=3.7"
classifiers = [
"Development Status :: 4 - Beta",
"Programming Language :: Python",
