From 4e6adaaeb97e2a2c0e91c7c4cf7be77ac7e335bb Mon Sep 17 00:00:00 2001
From: Sudeep Pillai
Date: Sun, 8 Oct 2023 20:53:01 -0700
Subject: [PATCH] =?UTF-8?q?Version=20bump=20`0.1.0`=20=E2=86=92=20`0.1.1`?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- cleaned up `README.md`
---
 .github/workflows/ci.yml |   4 +-
 README.md                | 171 +++++++++++++++++++--------------------
 pyproject.toml           |   4 +-
 3 files changed, 89 insertions(+), 90 deletions(-)

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index f085813..d4e80a0 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -12,11 +12,11 @@ jobs:
     runs-on: ${{ matrix.os }}
     timeout-minutes: 20
     strategy:
-      max-parallel: 3
+      max-parallel: 2
       fail-fast: true
       matrix:
         os: ["ubuntu-latest"]
-        python-version: ["3.8", "3.9", "3.10"]
+        python-version: ["3.7", "3.8", "3.9", "3.10"]
     defaults:
       run:
         shell: bash -el {0}
diff --git a/README.md b/README.md
index bec075c..65269b0 100644
--- a/README.md
+++ b/README.md
@@ -16,44 +16,14 @@

-**agi-pack** is simply a weekend project I hacked together, that started with a conversation with ChatGPT / GPT-4. See the [inspiration](#inspiration-and-attribution-🌟) section below for more details on the ChatGPT prompts used.
+📦 **`agi-pack`** allows you to define your Docker images using a simple YAML format, and then generate them on-the-fly using Jinja2 templates. It's a simple tool that aims to simplify the process of building Docker images for ML.
 
-🚨 **Disclaimer:** More than 90% of this codebase was generated by GPT-4 and [Github Co-Pilot](https://github.com/features/copilot).
-
-## Rationale 🤔
-
-Docker has become the standard for building and managing isolated environments for ML. However, any one who has gone down this rabbit-hole knows how broken ML development is, especially when you need to experiment and re-configure your environments constantly. Production is another nightmare -- large docker images (`10GB+`), bloated docker images with model weights that are `~5-10GB` in size, 10+ minute long docker build times, sloppy package management to name just a few.
-
-**What makes Dockerfiles painful?** If you've ever tried to roll your own Dockerfiles with all the best-practices while fully understanding their internals, you'll still find yourself building, and re-building, and re-building these images across a whole host of use-cases. Having to build Dockerfile(s) for `dev`, `prod`, and `test` all turn out to be a nightmare when you add the complexity of hardware targets (CPUs, GPUs, TPUs etc), drivers, python, virtual environments, build and runtime dependencies.
-
-**agi-pack** aims to simplify this by allowing developers to define Dockerfiles in a concise YAML format and then generate them based on your environment needs (i.e. python version, system packages, conda/pip dependencies, GPU drivers etc).
-
-For example, you should be able to easily configure your `dev` environment for local development, and have a separate `prod` environment where you'll only need the runtime dependencies avoiding any bloat.
-
-`agi-pack` hopes to also standardize the base images, so that we can really build on top of giants.
-
-## Features ✨
-
-- **Simple Configuration**: Define your Docker images using a straightforward YAML format.
-- **Dynamic Generation**: Use the power of Jinja2 templating to create Dockerfiles on-the-fly.
-- **Sequential and Multi-stage Builds**: Define re-usable and production-ready `base` images and build dependent images for `dev`, `prod`, `test`.
-- **Extensible**: Easily extend and adapt to more complex scenarios.
-
-## Goals 🎯
-
-- **Simplicity**: Make it easy to define and build docker images for ML.
-- **Modular, Re-usable, Composable**: Ability to define good `base`, `dev` and `prod` images for ML, and re-use them wherever possible.
-- **Best Practices**: Support best practices for building docker images for ML -- good base images, multi-stage builds, minimal image sizes, etc.
-- **Ecosystem-driven**: Make the YAML / DSL extensible to support the ML ecosystem, as more libraries, drivers, HW vendors, come into the market.
-- **Vendor-agnostic**: `agi-pack` is not intended to be built for any specific vendor (including us/where I work). There was clearly a need for this tool internally, so I decided to build it in the open and keep it simple.
-
-## Why the name? 🤷‍♂️
-`agi-pack` is very much intended to be tongue-in-cheek -- we are soon going to be living in a world full of quasi-AGI agents orchestrated via ML containers. At the very least, `agi-pack` should provide the building blocks for us to build a more modular, re-usable, and distribution-friendly container format for "AGI".
+🚨 **Disclaimer:** More than **75%** of this initial implementation was generated by GPT-4 and [Github Co-Pilot](https://github.com/features/copilot). See the [attribution](#inspiration-and-attribution-🌟) section below for more details.
 
 ## Installation 📦
 
 ```bash
-pip install git+hhttps://github.com/spillai/agi-pack.git
+pip install agi-pack
 ```
 
 For shell completion, you can install them via:
@@ -61,21 +31,20 @@ For shell completion, you can install them via:
 agi-pack --install-completion
 ```
 
-## Usage 🛠
+## Quickstart 🛠
 
-1. Create a simple YAML configuration file called `agibuild.yaml` via `agi-pack init`:
+1. Create a simple YAML configuration file called `agibuild.yaml`. You can use `agi-pack init` to generate a sample configuration file.
 
     ```bash
     agi-pack init
    ```
 
-2. Edit `agibuild.yaml` to define your custom system and python packages
+2. Edit `agibuild.yaml` to define your custom system and python packages.
 
     ```yaml
     images:
       base-sklearn:
-        image: /agi:latest-base-sklearn
-        base: python:3.8.10-slim
+        base: debian:buster-slim
         system:
         - wget
         - build-essential
@@ -86,6 +55,13 @@ agi-pack --install-completion
         - scikit-learn
     ```
 
+    Let's break this down:
+    - `base-sklearn`: name of the target you want to build. Usually, these could be variants like `base-*`, `dev-*`, `prod-*`, `test-*`, etc.
+    - `base`: base image to build from.
+    - `system`: system packages to install via `apt-get install`.
+    - `python`: specific python version to install via `miniconda`.
+    - `pip`: python packages to install via `pip install`.
+
 3. Generate the Dockerfile using `agi-pack generate`
 
     ```bash
@@ -101,75 +77,98 @@ agi-pack --install-completion
         └── `docker build -f Dockerfile --target base-sklearn .`
     ```
 
-That's it! You can now build the generated Dockerfile using `docker build` to build the image directly.
+That's it! Use the generated `Dockerfile` to run `docker build` and build the image directly.
+
+## Goals 🎯
+
+- **Simplicity**: Make it easy to define and build docker images for ML.
+- **Best-practices**: Bring best-practices to building docker images for ML -- good base images, multi-stage builds, minimal image sizes, etc.
+- **Modular, Re-usable, Composable**: Define `base`, `dev` and `prod` targets with multi-stage builds, and re-use them wherever possible.
+- **Extensible**: Make the YAML / DSL extensible to support the ML ecosystem, as more libraries, drivers, and HW vendors come into the market.
+- **Vendor-agnostic**: `agi-pack` is not intended to be built for any specific vendor -- I need this tool for internal purposes, but I decided to build it in the open and keep it simple.
+
+## Rationale 🤔
+
+Docker has become the standard for building and managing isolated environments for ML. However, anyone who has gone down this rabbit-hole knows how broken ML development is, especially when you need to experiment and re-configure your environments constantly. Production is another nightmare -- large docker images (`10GB+`), bloated docker images with model weights that are `~5-10GB` in size, 10+ minute-long docker build times, and sloppy package management, to name just a few.
+
+**What makes Dockerfiles painful?** If you've ever tried to roll your own Dockerfiles with all the best-practices while fully understanding their internals, you'll still find yourself building, and re-building, and re-building these images across a whole host of use-cases. Building Dockerfile(s) for `dev`, `prod`, and `test` turns out to be a nightmare when you add the complexity of hardware targets (CPUs, GPUs, TPUs, etc.), drivers, python, virtual environments, and build and runtime dependencies.
+
+**agi-pack** aims to simplify this by allowing developers to define Dockerfiles in a concise YAML format and then generate them based on your environment needs (e.g. python version, system packages, conda/pip dependencies, GPU drivers, etc.).
+
+For example, you should be able to easily configure your `dev` environment for local development, and have a separate `prod` environment where you'll only need the runtime dependencies, avoiding any bloat.
+
+`agi-pack` hopes to also standardize the base images, so that we can really build on top of giants.
 
 ## More Complex Example 📚
 
 Now imagine you want to build a more complex image that has multiple stages, and you want to build a `base` image that has all the basic dependencies, and a `dev` image that has additional build-time dependencies.
 
-    ```yaml
-    images:
-      base-cpu:
-        name: agi
-        base: debian:buster-slim
-        system:
-        - wget
-        python: 3.8.10
-        pip:
-        - scikit-learn
-        run:
-        - echo "Hello, world!"
+```yaml
+images:
+  base-cpu:
+    name: agi
+    base: debian:buster-slim
+    system:
+    - wget
+    python: 3.8.10
+    pip:
+    - scikit-learn
+    run:
+    - echo "Hello, world!"
 
-      dev-cpu:
-        base: base-cpu
-        system:
-        - build-essential
-    ```
+  dev-cpu:
+    base: base-cpu
+    system:
+    - build-essential
+```
 
 Once you've defined this `agibuild.yaml`, running `agi-pack generate` will generate the following output:
 
-    You should see the following output:
-
-    ```bash
-    $ agi-pack generate -c agibuild.yaml
-    📦 base-cpu
-    └── 🎉 Successfully generated Dockerfile (target=base-cpu, filename=Dockerfile).
-        └── `docker build -f Dockerfile --target base-cpu .`
-    📦 dev-cpu
-    └── 🎉 Successfully generated Dockerfile (target=dev-cpu, filename=Dockerfile).
-        └── `docker build -f Dockerfile --target dev-cpu .`
-    ```
+```bash
+$ agi-pack generate -c agibuild.yaml
+📦 base-cpu
+└── 🎉 Successfully generated Dockerfile (target=base-cpu, filename=Dockerfile).
+    └── `docker build -f Dockerfile --target base-cpu .`
+📦 dev-cpu
+└── 🎉 Successfully generated Dockerfile (target=dev-cpu, filename=Dockerfile).
+    └── `docker build -f Dockerfile --target dev-cpu .`
+```
 
-As you can see, `agi-pack` will generate a **single** Dockerfile for each of the images defined in the YAML file. You can then build the individual images from the same Dockerfile using docker targets: `docker build -f Dockerfile --target <target> .` where `<target>` is the name of the image target you want to build.
+As you can see, `agi-pack` will generate a **single** Dockerfile for all of the targets defined in the YAML file. You can then build the individual images from the same Dockerfile using docker targets: `docker build -f Dockerfile --target <target> .` where `<target>` is the name of the image target you want to build.
 
 Here's the corresponding [`Dockerfile`](./examples/generated/Dockerfile-multistage-example) that was generated.
 
+## Why the name? 🤷‍♂️
+`agi-pack` is very much intended to be tongue-in-cheek -- we are soon going to be living in a world full of quasi-AGI agents orchestrated via ML containers. At the very least, `agi-pack` should provide the building blocks for us to build a more modular, re-usable, and distribution-friendly container format for "AGI".
+
 ## Inspiration and Attribution 🌟
 
-```
-Prompt: I'm building a Dockerfile generator and builder to simplify machine learning infrastructure. I'd like for the Dockerfile to be dynamically generated (using Jinja templates) with the following parametrizations:
-```
+> **TL;DR** `agi-pack` was inspired by a combination of [Replicate's `cog`](https://github.com/replicate/cog), [Baseten's `truss`](https://github.com/basetenlabs/truss/), [skaffold](https://skaffold.dev/), and [Docker Compose Services](https://docs.docker.com/compose/compose-file/05-services/). I wanted a standalone project without any added cruft/dependencies of vendors and services.
 
-    ```
+📦 **agi-pack** is simply a weekend project I hacked together that started with a conversation with [ChatGPT / GPT-4](#chatgpt-prompt).
 
-    # Sample YAML file
-    images:
-      base-gpu:
-        base: nvidia/cuda:11.8.0-base-ubuntu22.04
-        system:
-        - gnupg2
-        - build-essential
-        - git
-        python: 3.8.10
-        pip:
-        - torch==2.0.1
+🚨 **Disclaimer:** More than **75%** of this initial implementation was generated by GPT-4 and [Github Co-Pilot](https://github.com/features/copilot).
 
-    I'd like for this yaml file to generate a Dockerfile via `agi-pack generate -c .yaml`.
+### ChatGPT Prompt
+---
 
-    You are an expert in Docker and Python programming, how would I implement this builder in Python. Use Jinja2 templating and miniconda python environments wherever possible. I'd like an elegant and concise implementation that I can share on PyPI.
-    ```
+> **Prompt:** I'm building a Dockerfile generator and builder to simplify machine learning infrastructure. I'd like for the Dockerfile to be dynamically generated (using Jinja templates) with the following parametrizations:
 
-TL;DR `agi-pack` was inspired by a combination of [Replicate's `cog`](https://github.com/replicate/cog), [Baseten's `truss`](https://github.com/basetenlabs/truss/), [skaffold](https://skaffold.dev/), and [Docker Compose Services](https://docs.docker.com/compose/compose-file/05-services/). I wanted a standalone project without any added cruft/dependencies of vendors and services.
+```yaml
+# Sample YAML file
+images:
+  base-gpu:
+    base: nvidia/cuda:11.8.0-base-ubuntu22.04
+    system:
+    - gnupg2
+    - build-essential
+    - git
+    python: 3.8.10
+    pip:
+    - torch==2.0.1
+```
+> I'd like for this yaml file to generate a Dockerfile via `agi-pack generate -c .yaml`. You are an expert in Docker and Python programming, how would I implement this builder in Python. Use Jinja2 templating and miniconda python environments wherever possible. I'd like an elegant and concise implementation that I can share on PyPI.
 
 ## Contributing 🤝
diff --git a/pyproject.toml b/pyproject.toml
index fe5a71b..68d3ad8 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,11 +4,11 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "agi-pack"
-version = "0.1.0"
+version = "0.1.1"
 description = "Dockerfile generator for AGI -- nothing more, nothing less."
 license = {file = "LICENSE"}
 readme = "README.md"
-requires-python = ">=3.7.10"
+requires-python = ">=3.7"
 classifiers = [
     "Development Status :: 4 - Beta",
     "Programming Language :: Python",