This repository contains the BentoML template for model API deployments.
Before you begin, ensure you have met the following requirements:

- Python installed
- pip installed
- AWS CLI installed and logged in to your account
- A model stored in an S3 bucket in `.pickle` format
To use this template, clone the repository and customize it according to your model's requirements. Below is a quick start guide:
- Clone the repository:

  ```shell
  git clone https://github.com/infraspecdev/bentoml-template.git
  cd bentoml-template
  ```
- Customize the template:
  - Update `config.ini` to download the correct models from your S3 bucket.
  - Update `validations.py` to change the input validation for your model.
  - Update `service.py` to use your model.
  - Update `bentofile.yaml` if you have changed the service name in `service.py`.
- Run these commands to create a Python environment:

  ```shell
  python3 -m venv .venv
  source .venv/bin/activate
  ```
- Run this command to install all the dependencies:

  ```shell
  pip install -r requirements.txt
  ```

  Note: If you want to use the iris model, run the following command to store the model in `/models`:

  ```shell
  python train_and_save_model.py
  ```
- Run this command to start the service:

  ```shell
  bentoml serve .
  ```

  This will start a local server at http://localhost:3000.
- Build the bento model:

  ```shell
  bentoml build .
  ```
- Build the image:

  ```shell
  bentoml containerize iris_classifier_service:<BUILD_VERSION>
  ```
- Run the container:

  ```shell
  docker run --rm -p 8080:8080 iris_classifier_service:<BUILD_VERSION>
  ```

  The build version is printed in the output of the `bentoml build` command and looks similar to `IrisClassifierService:nftm2tqyagzp4mtu`; here, `nftm2tqyagzp4mtu` is the build version. For this quickstart example, the service name is `IrisClassifierService`, but you need to replace it with the name of your own service class.
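The optional iris step (`python train_and_save_model.py`) stores a pickled model under `/models` for the service to load later. The contract can be sketched with the standard library alone; the stand-in `DummyModel` class and the `iris_model.pickle` file name below are assumptions for illustration, not the template's actual code:

```python
import pickle
from pathlib import Path

# Stand-in for a trained classifier; train_and_save_model.py would fit a
# real iris model here before pickling it.
class DummyModel:
    def predict(self, rows):
        return [0 for _ in rows]

# Save the model under /models, as the training script does.
Path("models").mkdir(exist_ok=True)
with open("models/iris_model.pickle", "wb") as f:  # hypothetical file name
    pickle.dump(DummyModel(), f)

# The service later loads the same file back and calls predict():
with open("models/iris_model.pickle", "rb") as f:
    model = pickle.load(f)
print(model.predict([[5.1, 3.5, 1.4, 0.2]]))  # prints [0]
```

Whatever object you pickle, the service only needs it to expose the inference method your `service.py` calls.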
Note: Update the envs in the `bentofile.yaml`:

```yaml
envs:
  - name: <ENV_VARIABLE_NAME>
    value: <VALUE>
```
- Create a `.env` file with values similar to the given `.env.example` file.
- Change the values in the `.env` file according to your requirements:
  - BENTOML_PORT: Port on which the BentoML service will run.
  - JWT_SECRET: Secret key used for signing JWT tokens. This should be a secure, randomly generated string.
  - JWT_EXPIRATION_MINUTES: Duration (in minutes) for which the JWT token remains valid.
  - ENVIRONMENT: Environment in which the service is running. Can be set to `development`, `staging`, or `production`.
  - LOG_LEVEL: Logging level for the application. Can be set to `DEBUG`, `INFO`, `WARNING`, `ERROR`, or `CRITICAL`. Default is `WARNING`.
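Taken together, a filled-in `.env` might look like this (all values are illustrative, not defaults from `.env.example`):

```ini
BENTOML_PORT=3000
JWT_SECRET=replace-with-a-randomly-generated-string
JWT_EXPIRATION_MINUTES=60
ENVIRONMENT=development
LOG_LEVEL=INFO
```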
To download the models from S3, update the bucket name and directory in the `configs/config.ini` file:

- Replace `YOUR_S3_BUCKET_NAME` with the actual S3 bucket name.
- Replace `YOUR_S3_BUCKET_DIRECTORY` with the directory path of the models in S3.

The models can be downloaded by running:

```shell
python3 download_models.py
```

On a local machine, this requires AWS access and secret keys to download from S3.
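As a rough sketch of how those settings are consumed, the config can be read with the standard library's `configparser` and combined into an S3 download call. The section and key names below are assumptions for illustration, not necessarily the template's actual `config.ini` schema:

```python
import configparser

config = configparser.ConfigParser()
# In the template this would be config.read("configs/config.ini");
# the [s3] section and its keys here are hypothetical.
config.read_string("""
[s3]
bucket_name = YOUR_S3_BUCKET_NAME
bucket_directory = models/
""")

bucket = config["s3"]["bucket_name"]
prefix = config["s3"]["bucket_directory"]
print(f"would download s3://{bucket}/{prefix}model.pickle")

# With boto3 installed and AWS credentials configured, the download itself
# would be something like:
# boto3.client("s3").download_file(bucket, prefix + "model.pickle",
#                                  "./models/model.pickle")
```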
To deploy your specific model API using the provided BentoML template, follow these steps:

- Download or import the required model and libraries: replace the current model file name if you want to run a custom model, or use an import to load the model.

  ```python
  with open("./models/<YOUR_MODEL_FILE_NAME>", "rb") as model_file:
      model = pickle.load(model_file)
  ```
- Update the class name: change the class name to your model-specific class name.

  ```python
  class WeatherPrediciton
  ```

  You will also need to update the service name in `bentofile.yaml`:

  ```yaml
  service: "service:WeatherPrediciton"
  ```
- Modify the API route: update the API route to match your model's endpoint.

  ```python
  @bentoml.api(route='/api/v1/analyze')
  ```

  You will need to update the routes in the following files as well:

  ```python
  # middlewares/validate_jwt.py
  protected_routes = ['/api/v1/analyze']

  # middlewares/request_response_handler.py
  routes_to_log = ["/api/v1/analyze"]

  # middlewares/validation_handler.py
  routes_to_validate = ['/api/v1/analyze']

  # utils/common/validations.py
  return {"/api/v1/analyze": WeatherPredicitonParams}
  ```
- Update the request validation schema: update the input validation (Params) in `utils/common/validations.py`.

  ```python
  class IrisRequestParams(BaseModel):
      sepal_length: float = Field(description="Sepal length in cm", gt=0)
      sepal_width: float = Field(description="Sepal width in cm", gt=0)
      petal_length: float = Field(description="Petal length in cm", gt=0)
      petal_width: float = Field(description="Petal width in cm", gt=0)
  ```
The import statements, class name, route, inference logic, and middleware will depend on your specific use case for model API development. The examples above are for reference only, to give you an idea of the components you need to change.

In this example, we used an iris classifier model. If you are using a different model or framework, these components might need to be completely different. You should write your class based on your specific requirements and the model you are using.
In the quickstart sample, the `/api/v1/predict` endpoint requires a JWT token in the request's Authorization header to authenticate the request. If any other route needs to be authenticated before serving requests, add its endpoint to the `protected_routes` list in `middlewares/validate_jwt.py`.
To add more protected routes, update the `protected_routes` list:

```python
# middlewares/validate_jwt.py
protected_routes = [
    '/api/v1/predict',
    '/api/v1/analyze'
]
```
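The gating logic this list drives can be sketched as follows. The `allow_request` and `signature_valid` names are hypothetical, and the sketch assumes HS256-signed tokens; the template's middleware may differ in details:

```python
import base64
import hashlib
import hmac

protected_routes = [
    '/api/v1/predict',
    '/api/v1/analyze',
]

def signature_valid(token: str, secret: str) -> bool:
    # A JWT is header.payload.signature; recompute the HS256 signature
    # over header.payload and compare it to the one presented.
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        return False
    signing_input = f"{header}.{payload}".encode()
    expected = base64.urlsafe_b64encode(
        hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    ).rstrip(b"=").decode()
    return hmac.compare_digest(expected, signature)

def allow_request(path: str, auth_header, secret: str) -> bool:
    # Unprotected routes pass through; protected ones need a valid token.
    if path not in protected_routes:
        return True
    return auth_header is not None and signature_valid(auth_header, secret)

print(allow_request("/api/v1/predict", None, "s3cret"))  # prints False
```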
Example request:

```shell
curl -X 'POST' \
  'http://localhost:<BENTOML_PORT>/api/v1/predict' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: <JWT_TOKEN>' \
  -d '{
    "sepal_length": 1,
    "sepal_width": 2,
    "petal_length": 3,
    "petal_width": 4
  }'
```
Replace `<BENTOML_PORT>` and `<JWT_TOKEN>` with their values. Change `/api/v1/predict` and the request body to your new service route and body if you have updated them.
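The same request can be made from Python using only the standard library; the port, token, and body below are placeholders, exactly as in the curl example:

```python
import json
import urllib.request

body = json.dumps({
    "sepal_length": 1,
    "sepal_width": 2,
    "petal_length": 3,
    "petal_width": 4,
}).encode()

req = urllib.request.Request(
    "http://localhost:3000/api/v1/predict",  # substitute your BENTOML_PORT
    data=body,
    headers={
        "Content-Type": "application/json",
        "Authorization": "<JWT_TOKEN>",  # substitute a real token
    },
    method="POST",
)
# Requires the service to be running locally:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```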
To generate the JWT token:

```shell
python3 utils/jwt/generate_token.py
```

You can change the token expiry and secret by changing the environment variables `JWT_EXPIRATION_MINUTES` and `JWT_SECRET` in the `.env` file.
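What `generate_token.py` produces can be sketched with the standard library alone. The real script may rely on a JWT library, and the claim set below (just `exp`) is an assumption for illustration:

```python
import base64
import hashlib
import hmac
import json
import os
import time

def b64url(data: bytes) -> str:
    # JWT uses unpadded URL-safe base64.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def generate_token(secret: str, expiration_minutes: int) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(
        {"exp": int(time.time()) + expiration_minutes * 60}
    ).encode())
    signing_input = f"{header}.{payload}".encode()
    signature = b64url(
        hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    )
    return f"{header}.{payload}.{signature}"

secret = os.environ.get("JWT_SECRET", "change-me")
minutes = int(os.environ.get("JWT_EXPIRATION_MINUTES", "60"))
print(generate_token(secret, minutes))
```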
You can run this in your terminal to generate a `JWT_SECRET`:

```shell
date | base64 | base64
```
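An alternative using Python's `secrets` module, which draws from a cryptographically secure random source:

```python
import secrets

# 48 random bytes, URL-safe base64 encoded; suitable as a JWT_SECRET.
print(secrets.token_urlsafe(48))
```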