Getting started with a serverless endpoint on RunPod by creating a custom worker
- Features
- Setup
- Local testing
- Build and release your Docker image
- Use your Docker image on RunPod serverless
- Interact with your RunPod endpoint
- Where to go from here?
- Acknowledgments
- Credits
This project provides a set of starting points for creating your worker (= Docker image) to create a custom serverless endpoint on RunPod:
- Simple start script that makes sure to start the handler and whatever you need your worker to have, so that it can do its work (like starting ComfyUI)
- Basic handler, that you can extend with the business logic that you need for your use case
- GitHub dev workflow during the development of your Docker image
- GitHub release workflow for publishing a new version of your Docker image in a fully automated way
- Foundation for unit tests to get started with Test-driven development (TDD)
- Clone the repo to your computer
- Create a virtual environment:
python -m venv venv
- Activate the virtual environment:
- Windows:
.\venv\Scripts\activate
- Mac / Linux:
source ./venv/bin/activate
- Install the dependencies:
pip install -r requirements.txt
- Make sure to have Docker installed on your computer if you want to build the image locally
Execute python src/rp_handler.py, which will then output something like this:
--- Starting Serverless Worker | Version 1.3.7 ---
INFO | Using test_input.json as job input.
DEBUG | Retrieved local job: {'input': {'greeting': 'world'}, 'id': 'local_test'}
INFO | local_test | Started
DEBUG | local_test | Handler output: Hello world
DEBUG | local_test | run_job return: {'output': 'Hello world'}
INFO | Job local_test completed successfully.
INFO | Job result: {'output': 'Hello world'}
INFO | Local testing complete, exiting.
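The log above suggests a handler that reads greeting from the job input and returns a greeting string. The following is a minimal sketch of such a handler, not the actual contents of src/rp_handler.py. The input validation and the error dictionary are assumptions added for illustration.

```python
def handler(job):
    """Build a greeting from the job input (illustrative sketch only)."""
    job_input = job.get("input", {})
    greeting = job_input.get("greeting")
    if not isinstance(greeting, str) or not greeting:
        # Assumption: failures are reported as a dict with an "error" key.
        return {"error": "Please provide a non-empty 'greeting' string."}
    return f"Hello {greeting}"


# In a worker, the handler would typically be registered with the RunPod SDK:
#   import runpod
#   runpod.serverless.start({"handler": handler})

if __name__ == "__main__":
    # Mirrors the local job shown in the log output above.
    print(handler({"id": "local_test", "input": {"greeting": "world"}}))
```

Keeping the handler a plain function of the job dictionary makes it easy to call directly in tests, without starting the serverless runtime.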
Run all tests: python -m unittest discover, which will then output something like this:
...
----------------------------------------------------------------------
Ran 3 tests in 0.000s
OK
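As a rough idea of what a test in this style could look like, here is a self-contained sketch using unittest. The handler defined here is a stand-in so the example runs on its own; the test names and cases are hypothetical and not taken from the repo's test suite.

```python
import unittest


def handler(job):
    """Stand-in handler for this example; the real one lives in src/rp_handler.py."""
    return f"Hello {job['input']['greeting']}"


class TestHandler(unittest.TestCase):
    def test_returns_greeting(self):
        job = {"id": "test", "input": {"greeting": "world"}}
        self.assertEqual(handler(job), "Hello world")

    def test_missing_greeting_raises(self):
        # The stand-in handler does no validation, so a missing key raises KeyError.
        with self.assertRaises(KeyError):
            handler({"id": "test", "input": {}})


def run_suite():
    """Run the test case programmatically and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestHandler)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

When such a file is placed where python -m unittest discover can find it, the test case is picked up automatically and run_suite is not needed.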
We included a docker-compose.yml, which makes it easy to run the Docker image locally: docker-compose up
This only works on Linux-based systems, because we only create an image for Linux, which is what RunPod requires. On Mac or Windows, you have to follow the steps to build the image manually. Make sure to build the image with the dev tag, as this is the tag used in the docker-compose.yml.
To use your Docker image on RunPod, it must exist in a Docker image registry. We are using Docker Hub for this, but feel free to choose whatever you want.
The repo contains two workflows that publish the image to Docker Hub using GitHub Actions: dev.yml and release.yml.
This process is highly opinionated and you should adapt it to what you are used to.
If you want to use these workflows, you have to add these secrets to your repository:
Configuration Variable | Description | Example Value |
---|---|---|
DOCKERHUB_USERNAME | Your Docker Hub username. | your-username |
DOCKERHUB_TOKEN | Your Docker Hub token for authentication. | your-token |
DOCKERHUB_REPO | The repository on Docker Hub where the image will be pushed. | timpietruskyblibla |
DOCKERHUB_IMG | The name of the image to be pushed to Docker Hub. | runpod-worker-helloworld |
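To make the role of these values more concrete, the sketch below checks that all four are present and combines DOCKERHUB_REPO and DOCKERHUB_IMG into an image reference. How the workflows actually combine them is an assumption here, inferred from the example values and from the docker commands used elsewhere in this README. The function name image_reference is hypothetical.

```python
REQUIRED = (
    "DOCKERHUB_USERNAME",
    "DOCKERHUB_TOKEN",
    "DOCKERHUB_REPO",
    "DOCKERHUB_IMG",
)


def image_reference(env, tag):
    """Return '<repo>/<image>:<tag>' after checking that all values are set.

    `env` is any mapping, for example os.environ or a plain dict.
    """
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError(f"Missing configuration: {', '.join(missing)}")
    # Assumption: repo and image name are joined with a slash, as in the
    # example image timpietruskyblibla/runpod-worker-helloworld.
    return f"{env['DOCKERHUB_REPO']}/{env['DOCKERHUB_IMG']}:{tag}"
```

For example, calling image_reference with the example values from the table and the tag dev yields timpietruskyblibla/runpod-worker-helloworld:dev.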
When you are developing your image and want to provide bug fixes or features to your community, you can put them into the dev branch. This will trigger the dev workflow, which runs these steps:
- Execute the unit tests
- Build the image
- Push the image to Docker Hub using the dev tag
When development is done and you are ready for a new release of your image, you can put all your changes into the main branch. This will trigger the release workflow, which runs these steps:
- Execute the unit tests
- Update "Table of Contents" in the README.md
- Create a release on GitHub using Semantic Versioning based on semantic-release (this only works if you follow one of the supported commit message formats; the default is the Angular Commit Message Conventions, which this repo also uses)
- Update the CHANGELOG.md
- Build the image
- Push the image to Docker Hub and tag it with both the release version and latest
- Update the description of the image on Docker Hub
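Because the release step depends on the commit message format, it can help to see how a message maps to a version bump. The sketch below approximates the default rules commonly associated with the Angular convention (feat gives a minor release, fix and perf give a patch release, a BREAKING CHANGE footer gives a major release). It is a simplified illustration, not semantic-release's actual implementation, and the function name release_type is hypothetical.

```python
import re

# Header format: type(optional scope): subject
HEADER = re.compile(r"^(?P<type>\w+)(\((?P<scope>[^)]+)\))?: (?P<subject>.+)$")

# Assumed mapping from commit type to release type.
BUMP_BY_TYPE = {"feat": "minor", "fix": "patch", "perf": "patch"}


def release_type(message):
    """Return 'major', 'minor', 'patch', or None for a single commit message."""
    lines = message.strip().splitlines()
    if not lines:
        return None
    match = HEADER.match(lines[0])
    if match is None:
        # Messages that do not follow the convention trigger no release.
        return None
    if any(line.startswith("BREAKING CHANGE:") for line in lines[1:]):
        return "major"
    return BUMP_BY_TYPE.get(match["type"])
```

A commit such as docs: update readme follows the convention but returns None in this sketch, which matches the idea that not every commit leads to a new release.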
- Build your Docker image like this:
docker build -t <dockerhub_username>/<repository_name>:<tag> --platform linux/amd64 .
In this case:
docker build -t timpietruskyblibla/runpod-worker-helloworld:latest --platform linux/amd64 .
- We need to specify the platform here, as this is what RunPod requires. If you don't do this, you might see an error like exec python failed: Exec format error when you run your worker on RunPod, depending on the OS you are using locally
- If you want to run your image locally and you are not using a Linux-based OS, then you have to use the appropriate platform:
- Windows:
docker build -t <dockerhub_username>/<repository_name>:<tag> --platform windows/amd64 .
- MacOS with Apple Silicon:
docker build -t <dockerhub_username>/<repository_name>:<tag> --platform linux/arm64 .
- After the image is created, you can see it when you run docker images, which provides a list of all images that exist on your computer
- Create an account on Docker Hub if you don't have one already
- Log in to your account:
docker login
- Push your Docker image to Docker Hub like this:
docker push <dockerhub_username>/<repository_name>:<tag>
In this case:
docker push timpietruskyblibla/runpod-worker-helloworld:latest
- Once this is done, you can check your Docker Hub account to find the image
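The platform choices from the build step above can be collected in a small helper that assembles the docker build command. This is a sketch for illustration: the function name docker_build_command and the target labels are hypothetical, while the platform values are the ones listed in the steps above.

```python
# Platform values as listed in the build instructions above.
PLATFORMS = {
    "runpod": "linux/amd64",             # what RunPod requires
    "windows": "windows/amd64",          # local runs on Windows
    "mac-apple-silicon": "linux/arm64",  # local runs on Apple Silicon
}


def docker_build_command(image, target="runpod"):
    """Return the docker build command as an argument list."""
    if target not in PLATFORMS:
        raise ValueError(f"Unknown target: {target!r}")
    return ["docker", "build", "-t", image, "--platform", PLATFORMS[target], "."]


if __name__ == "__main__":
    # Print the command for the image used throughout this README.
    print(" ".join(docker_build_command("timpietruskyblibla/runpod-worker-helloworld:latest")))
```

The returned list can be passed to subprocess.run(command, check=True) if you want to run the build from a script.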
- Create a new template by clicking on New Template
- In the dialog, configure:
- Template Name: runpod-worker-helloworld (it can be anything you want)
- Container Image: <dockerhub_username>/<repository_name>:<tag>, in this case: timpietruskyblibla/runpod-worker-helloworld:latest
- You can leave everything else as it is, as this repo is public
- Click on Save Template
- Navigate to Serverless > Endpoints and click on New Endpoint
- In the dialog, configure:
- Endpoint Name: helloworld
- Select Template: runpod-worker-helloworld (or whatever name you gave your template)
- Active Workers: 0 (keep this low, as we just want to test the Hello World)
- Max Workers: 3 (recommended default is 3)
- Idle Timeout: 5 (leave the default)
- Flash Boot: enabled (doesn't cost more, but provides faster boot for our worker, which is good)
- Advanced: Leave the defaults
- Select a GPU that has some availability
- GPUs/Worker: 1 (keep this low as we are just testing, we don't need multiple GPUs for a hello world)
- Click deploy
- Your endpoint will be created. You can click on it to see the dashboard and the available API methods:
- runsync: Sync request to start a job, where you can wait for the job result
- run: Async request to start a job, where you receive an id immediately
- status: Sync request to find out what the status of a job is, given id
- cancel: Sync request to cancel a job, given id
- health: Sync request to check the health of the endpoint to see if everything is fine
- In the User Settings click on API Keys and then on the API Key button
- Save the generated key somewhere, as you will not be able to see it again when you navigate away from the page
- Use cURL or any other tool to access the API using the API key and your Endpoint-ID:
- Replace <api_key> with your key
- Replace <endpoint_id> with the ID of the endpoint. You find it when you click on your endpoint, where it is part of the URLs shown at the bottom of the first box
curl -H "Authorization: Bearer <api_key>" https://api.runpod.ai/v2/<endpoint_id>/health
An async request to the run endpoint returns an id that you can then use in the status endpoint to find out whether your job was completed.
# Returns a JSON with the id of the job (<job_id>), use that in the status endpoint
curl -X POST -H "Authorization: Bearer <api_key>" -H "Content-Type: application/json" -d '{"input": {"greeting": "world"}}' https://api.runpod.ai/v2/<endpoint_id>/run
# {"id":"<job_id>","status":"IN_QUEUE"}
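If you prefer Python over cURL, the same requests can be built with the standard library. The sketch below mirrors the URL scheme and headers from the cURL commands in this section. The helper names build_request and call are hypothetical, and error handling is left out for brevity.

```python
import json
import urllib.request

API_BASE = "https://api.runpod.ai/v2"


def build_request(api_key, endpoint_id, operation, payload=None):
    """Build a request for an endpoint operation such as 'health' or 'run'.

    With a payload the request is a JSON POST, otherwise a plain GET.
    """
    url = f"{API_BASE}/{endpoint_id}/{operation}"
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if payload is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(payload).encode("utf-8")
    method = "POST" if data is not None else "GET"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def call(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A job could then be started with call(build_request("<api_key>", "<endpoint_id>", "run", {"input": {"greeting": "world"}})), using your own key and endpoint ID in place of the placeholders.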
The runsync endpoint waits until the job is done and returns the output of our API as the response.
If you send a sync request to an endpoint that has no free workers to pick up the job, you will wait for some time. Either your job is picked up when a worker becomes free, or the job is added to the queue (provided by the endpoint) and you receive an id. You then have to query the /status endpoint yourself to find out when the job was completed.
curl -X POST -H "Authorization: Bearer <api_key>" -H "Content-Type: application/json" -d '{"input": {"greeting": "world"}}' https://api.runpod.ai/v2/<endpoint_id>/runsync
# {"delayTime":2218,"executionTime":138,"id":"<job_id>","output":"Hello world","status":"COMPLETED"}
curl -H "Authorization: Bearer <api_key>" https://api.runpod.ai/v2/<endpoint_id>/status/<job_id>
# {"delayTime":3289,"executionTime":1054,"id":"<job_id>","output":"Hello world","status":"COMPLETED"}
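Checking the status endpoint by hand can be replaced by a small polling loop. The sketch below takes the status lookup as a function argument, so it works with any HTTP client. The name wait_for_job is hypothetical, and the set of terminal statuses other than COMPLETED is an assumption, as only IN_QUEUE and COMPLETED appear in this README.

```python
import time

# Assumption: statuses other than COMPLETED are not shown in this README.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"}


def wait_for_job(fetch_status, job_id, interval=1.0, max_attempts=30, sleep=time.sleep):
    """Poll fetch_status(job_id) until the job reaches a terminal status.

    fetch_status must return the decoded JSON of the status endpoint as a dict.
    """
    for _ in range(max_attempts):
        result = fetch_status(job_id)
        if result.get("status") in TERMINAL_STATUSES:
            return result
        sleep(interval)
    raise TimeoutError(f"Job {job_id} did not finish after {max_attempts} attempts")
```

Passing sleep as a parameter keeps the loop easy to test, because a test can replace it with a function that does nothing.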
- RunPod Workers: A list of workers provided by RunPod
- Thanks to Justin Merrell from RunPod for this nice getting started guide that was used to create this hello-world guide
- The title image was generated with Broom (which runs on RunPod using runpod-worker-comfy to generate images with text in ComfyUI using SDXL 1.0)