Use FastAPI to create an API for your model.
Run that API on your machine, then put it in production.
Now that we have a performant model trained in the cloud, we will expose it to the world 🌍
We will create a Prediction API for our model and run it on our machine to make sure that everything works correctly. Then we will deploy it in the cloud so that everyone can play with our model!
In order to do so, we will:
- Challenge 1: create a Prediction API using FastAPI
- Challenge 2: create a Docker image containing the environment required to run the code of our API
- Challenge 3: push this image to Google Cloud Run so that it is instantiated as a Docker container that will run our code and allow developers all over the world to use it
Copy your `.env` file from the previous package version:
```bash
cp ~/<user.github_nickname>/{{local_path_to('07-MLOps/03-Automate-model-lifecycle/01-Automate-model-lifecycle')}}/.env .env
```
OR
Use the provided `env.sample`, replacing the environment variable values with yours.
A new `taxifare/api` directory has been added to the challenge project to contain the code of the API, along with 2 new configuration files:
```
.
├── Dockerfile          # 🆕 Building instructions
├── MANIFEST.in         # 🆕 Config file for production purposes
├── Makefile            # Good old task manager
├── README.md
├── requirements.txt    # All the dependencies you need to run the package
├── setup.py            # Package installer
├── taxifare
│   ├── api             # 🆕 API directory
│   │   ├── __init__.py
│   │   └── fast.py     # 🆕 Where the API lives
│   ├── data_sources    # Data stuff
│   ├── flow            # DAG stuff
│   ├── interface       # Package entry point
│   └── ml_logic        # ML stuff
└── tests
```
Now, have a look at the `requirements.txt`. You can see some newcomers:
```
# API
fastapi         # API framework
pytz            # Timezone management
uvicorn         # Web server

# tests
httpx           # HTTP client
pytest-asyncio  # Asynchronous I/O support for pytest
```
❓ How?

`make reinstall_package`, of course 🙂
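To give you an idea of why `httpx` and `pytest-asyncio` show up in the test dependencies, an API test typically looks like the rough sketch below (the import path and the expected response are assumptions based on the project layout and the specifications further down):

```python
import pytest
from httpx import AsyncClient

from taxifare.api.fast import app  # the FastAPI instance defined in fast.py


@pytest.mark.asyncio
async def test_root_returns_greeting():
    # Query the app in-process, without spinning up a uvicorn server
    async with AsyncClient(app=app, base_url="http://test") as client:
        response = await client.get("/")
    assert response.status_code == 200
    assert response.json() == {"greeting": "Hello"}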
We provide you with a FastAPI skeleton in the `fast.py` file.
💻 Launch the API
💡 Hint

You probably need a `uvicorn` web server... with a 🔥 reloading option...
In case you can't find the proper syntax, keep calm and look at your `Makefile`: we provided you with a new task, `run_api`.
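For reference, the command hidden behind `run_api` probably looks something like this (a minimal sketch, assuming the FastAPI instance is called `app` in `taxifare/api/fast.py`):

```bash
# Serve the API locally with hot reloading on code changes
uvicorn taxifare.api.fast:app --reload
```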
❓ How do you query your running API?
Answer
💡 Your API is available locally on port `8000`, probably 👉 http://localhost:8000.
Go visit it!
You have probably not seen much.
❓ Which endpoints are available?
Answer
There is only one endpoint partially implemented at the moment: the root endpoint `/`.

The unimplemented root page is a little raw; remember you can always find more info on the API using the Swagger endpoint 👉 http://localhost:8000/docs
An API is defined by its specifications (see the GitHub repositories API, for example). You will find below the API specifications you need to implement.
- `GET /`
- Response status: 200

```
{
    'greeting': 'Hello'
}
```
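If you need a reminder of the FastAPI syntax for such an endpoint, here is a minimal sketch (in the challenge, the `app` instance already exists in `fast.py`):

```python
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def root():
    # Must match the specification above
    return {"greeting": "Hello"}
```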
💻 Implement the root endpoint `/`
👀 Look at your browser 👉 http://localhost:8000

👀 Inspect the server logs and add some `breakpoint()` calls to debug
Once and only once your API responds as required:
🧪 Test your implementation with `make test_api_root`

🚀 Commit and push your code!
- `GET /predict`
- Query parameters:

| Name | Type | Sample |
| --- | --- | --- |
| pickup_datetime | DateTime | 2013-07-06 17:18:00 |
| pickup_longitude | float | -73.950655 |
| pickup_latitude | float | 40.783282 |
| dropoff_longitude | float | -73.950655 |
| dropoff_latitude | float | 40.783282 |
| passenger_count | int | 2 |

- Response status: 200
- Code sample:

```
GET http://localhost:8000/predict?pickup_datetime=2013-07-06 17:18:00&pickup_longitude=-73.950655&pickup_latitude=40.783282&dropoff_longitude=-73.984365&dropoff_latitude=40.769802&passenger_count=2
```

Example response:

```
{
    'fare_amount': 5.93
}
```
❓ How would you proceed to implement the `/predict` endpoint? 💬 Discuss with your buddy.
⚡️ Kickstart pack

Here is a piece of code you can use to kickstart the implementation:

```python
@app.get("/predict")
def predict(pickup_datetime: datetime, # 2013-07-06 17:18:00
pickup_longitude: float, # -73.950655
pickup_latitude: float, # 40.783282
dropoff_longitude: float, # -73.984365
dropoff_latitude: float, # 40.769802
passenger_count: int):
pass # YOUR CODE HERE
```
💡 Hints
Ask yourselves the following questions:
- How should we handle the query parameters?
- How can we re-use the `taxifare` model package?
- How should we build `X_pred`? What does it look like?
- How do we render the correct response?
⚙️ Configuration

Have you put a trained model in production in MLflow? If not, you can use the following configuration:

```bash
MODEL_TARGET=mlflow
MLFLOW_TRACKING_URI=https://mlflow.lewagon.ai
MLFLOW_EXPERIMENT=taxifare_experiment_krokrob
MLFLOW_MODEL_NAME=taxifare_krokrob
```
🤔 Food for thought
- Investigate the data types of the query parameters; you may need to convert them into the types the model requires
- Of course you must re-use the `taxifare.interface.main.pred()` or the `taxifare.ml_logic.registry.load_model()` functions!
- In order to make a prediction with the trained model, you must provide a valid `X_pred`, but the `key` is missing!
- FastAPI can only render data types from the Python standard library; you may need to convert `y_pred` to match this requirement (see the sketch below)
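To make these hints concrete, here is a rough sketch of how the pieces could fit together. The `app` instance comes from the existing skeleton in `fast.py`, and the exact function signatures and preprocessing steps depend on your own `taxifare` package, so treat every call below as an assumption to verify:

```python
from datetime import datetime

import pandas as pd

from taxifare.ml_logic.registry import load_model  # assumed signature: load_model() -> model


@app.get("/predict")
def predict(pickup_datetime: datetime,
            pickup_longitude: float,
            pickup_latitude: float,
            dropoff_longitude: float,
            dropoff_latitude: float,
            passenger_count: int):
    # Build X_pred as a single-row DataFrame; the model also expects a `key` column,
    # which is missing from the query parameters (any placeholder value will do)
    X_pred = pd.DataFrame(dict(
        key=[str(pickup_datetime)],
        pickup_datetime=[pickup_datetime],
        pickup_longitude=[pickup_longitude],
        pickup_latitude=[pickup_latitude],
        dropoff_longitude=[dropoff_longitude],
        dropoff_latitude=[dropoff_latitude],
        passenger_count=[passenger_count],
    ))

    # Depending on your package, X_pred may need to go through your preprocessing
    # pipeline before being fed to the model
    model = load_model()
    y_pred = model.predict(X_pred)

    # Convert the numpy value into a plain Python float so FastAPI can render it
    return {"fare_amount": float(y_pred[0])}
```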
👀 Inspect your browser response 👉 http://localhost:8000/predict?pickup_datetime=2013-07-06%2017:18:00&pickup_longitude=-73.950655&pickup_latitude=40.783282&dropoff_longitude=-73.984365&dropoff_latitude=40.769802&passenger_count=2

👀 Inspect the server logs and add some `breakpoint()` calls to debug
Once and only once your API responds as required:
🧪 Test your implementation with `make test_api_predict`

🚀 Commit and push your code!

🏁 Congrats, you built your first ML predictive API!
We now have a working predictive API which can be queried from our local machine.
We want to make it available to the world. In order to do that, the first step is to create a Docker image that contains the environment required to run the API and make it run locally on Docker.
❓ What are the 3 steps to run the API on Docker?
Answer
- Create a `Dockerfile` containing the instructions to build the API
- Build the image locally on Docker
- Run the API on Docker locally to check that it responds as required
You need the Docker daemon to be running on your machine in order to build and run the image locally.

💻 Launch the Docker daemon
✅ Check that the Docker daemon is up and running with `docker info` in your terminal

A nice stack of logs should print.
As a reminder, here is the project directory structure:
```
.
├── Dockerfile          # 🆕 Building instructions
├── MANIFEST.in         # 🆕 Config file for production purposes
├── Makefile            # Good old task manager
├── README.md           # Package documentation
├── requirements.txt    # All the dependencies you need to run the package
├── setup.py            # Package installer
├── taxifare
│   ├── api             # ✅ API directory
│   │   ├── __init__.py
│   │   └── fast.py     # ✅ Where the API lives
│   ├── data_sources    # Data stuff
│   ├── flow            # DAG stuff
│   ├── interface       # Package entry point
│   └── ml_logic        # ML logic
└── tests               # Your favorite 🧪
```
❓ What are the key ingredients a `Dockerfile` needs to cook a delicious Docker image?
Answer
Here are the most common instructions of a good `Dockerfile`:
- `FROM`: select a base image for our image (the environment in which we will run our code); this is usually the first instruction
- `COPY`: copy files and directories into our image (our package and the associated files, for example)
- `RUN`: execute a command inside of the image being built (for example, install the package dependencies)
- `CMD`: the main command that will be executed when we run our Docker image. There can be only one `CMD` instruction in a `Dockerfile`; it is usually the last instruction
❓ What should the base image contain so that we can build our own image on top of it?

💡 Hints

Choosing an image with Python already installed could be a nice start...
💻 Write the instructions needed to build the API image in the `Dockerfile`, with the following specifications:
- ✅ it should contain the same Python version as your virtual env
- ✅ it should contain the necessary directories from the `/taxifare` directory to allow the API to run
- ✅ it should contain the dependencies list
- ✅ the API dependencies should be installed
- ✅ the web server should be launched when the container is started from the image
- ✅ the web server should listen to the HTTP requests coming from outside the container (cf. the `host` parameter)
- ✅ the web server should be able to listen to a specific port defined by the `$PORT` environment variable (cf. the `port` parameter)
⚡️ Kickstart pack

Here is the skeleton of the `Dockerfile`:

```
FROM image
COPY taxifare
COPY dependencies
RUN install dependencies
CMD launch API web server
```
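To give a sense of what the finished file might look like, here is a rough, unoptimized sketch; the base image tag and file layout are assumptions, so adapt them to your Python version and project:

```dockerfile
FROM python:3.8.12-buster

COPY taxifare /taxifare
COPY requirements.txt /requirements.txt

RUN pip install --upgrade pip
RUN pip install -r requirements.txt

# $PORT is provided at runtime (e.g. by Cloud Run); host 0.0.0.0 exposes the
# server to requests coming from outside the container
CMD uvicorn taxifare.api.fast:app --host 0.0.0.0 --port $PORT
```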
🚨 Apple Silicon users, expand me and read carefully
You will not be able to test your container locally with the tensorflow package since the current version does not install properly on Apple Silicon machines.
The solution is to use one image to test your code locally and another one to push your code to production.
👉 Refer to the commands in the `Dockerfile_silicon` file in order to build and test your local image, and to build and deploy your production image
❓ How would you check that the `Dockerfile` instructions will do what you intended?
Answer
You can't at this point! 😁 You need to build the image and check if it contains everything required to run the API. Go to the next section: Build the API image.
Now is the time to build the API image with Docker, so you can check that it satisfies the requirements and run it on Docker.
❓ How do you build an image with Docker?

⚙️ Configuration

You may add a variable to your project configuration for the Docker image name. You will be able to reuse it in the `docker` commands:

```bash
IMAGE=image-name
```
Answer
Make sure you are in the directory of the `Dockerfile`, then:

```bash
docker build --tag=$IMAGE .
```
💻 Choose a meaningful name for the API image, then build it

Once built, the image should be visible in the list of images, obtained with the following command:

```bash
docker images
```
🕵️‍♀️ The image you are looking for does not appear in the list? Ask for help 🙋‍♀️
Now that the image is built, let's check that it satisfies the specifications to run the predictive API. Docker comes with a handy command to interactively communicate with the shell of the image:

```bash
docker run -it -e PORT=8000 -p 8000:8000 $IMAGE sh
```
🤔 Decrypt

- `docker run $IMAGE`: run the image
- `-it`: enable the interactive mode
- `-e PORT=8000`: specify the environment variable `$PORT` the image should listen to
- `sh`: launch a shell console
A shell console should open, you are now inside the image 🎉.
💻 Check that the image is correctly set up:
- ✅ The Python version is the same as in your virtual env
- ✅ Presence of the `/taxifare` directory
- ✅ Presence of the `requirements.txt`
- ✅ The dependencies are all installed
🎁 Solution

- `python --version` to check the Python version
- `ls` to check the presence of the files and directories
- `pip list` to check that the requirements are installed
Exit the terminal and stop the container at any moment with:

```bash
exit
```
❗ All good? If something is missing, you will probably need to fix your `Dockerfile` and build the image again
In the previous section you learned how to interact with the image shell. Now is the time to run the predictive API image and test if the API responds as it should.
💻 Run the image

💡 Hints

You should probably remove the interactive mode and drop the `sh` command...
😱 Unless you found the correct command to run the image, it is probably crashing with errors involving environment variables.
❓ What is the difference between your local environment and the image environment? 💬 Discuss with your buddy.
Answer
There is no `.env` in the image!!! The image has no access to the environment variables 🤯
💻 Using the `docker run --help` documentation, adapt the run command so that the `.env` is passed to the image
🎁 Solution

The `--env-file` parameter to the rescue!

```bash
docker run -e PORT=8000 -p 8000:8000 --env-file path/to/.env $IMAGE
```
❓ How would you check that the image runs correctly?

💡 Hints

The API should respond in your browser, go visit it!

You can also check that the image is running with:

```bash
docker ps
```
It's alive! 😱 🎉

👀 Inspect your browser response 👉 http://localhost:8000/predict?pickup_datetime=2013-07-06%2017:18:00&pickup_longitude=-73.950655&pickup_latitude=40.783282&dropoff_longitude=-73.984365&dropoff_latitude=40.769802&passenger_count=2

🏁 Congrats, you built your first ML predictive API inside a Docker container!
Now that we have built a predictive API Docker image that we are able to run on our local machine, we are 2 steps away from deploying it:
- Push the Docker image to Google Container Registry
- Deploy the image on Google Cloud Run so that it gets instantiated into a Docker container
As a responsible ML Engineer, you know that the size of an image is important when it comes to production. Depending on the base image you used in your `Dockerfile`, the API image could be huge:
- `python:3.8.12-buster` 👉 3.9GB
- `python:3.8.12-slim` 👉 3.1GB
- `python:3.8.12-alpine` 👉 3.1GB
❓ What is the heaviest requirement used by your API?

Answer

No doubt it is `tensorflow`, with 1.1GB! You need to find a base image that is optimized for it.
💻 Change your base image

💻 Build and run a lightweight local image of your API

✅ Make sure the API is still up and running

👀 Inspect the space saved with `docker images` and feel happy
Hints
You may want to use a TensorFlow Docker image.
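As an illustration, switching the base image could look like the sketch below. The exact tag is an assumption (pick one matching the TensorFlow version pinned in your requirements), and you may then drop `tensorflow` from `requirements.txt` since it already ships with the base image:

```dockerfile
# The TensorFlow base image already contains tensorflow, so it is not reinstalled
FROM tensorflow/tensorflow:2.10.0

COPY taxifare /taxifare
COPY requirements.txt /requirements.txt

RUN pip install --upgrade pip
RUN pip install -r requirements.txt

CMD uvicorn taxifare.api.fast:app --host 0.0.0.0 --port $PORT
```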
❓ What is the purpose of Google Container Registry?
Answer
Google Container Registry is a service storing Docker images on the cloud with the purpose of allowing Cloud Run or Kubernetes Engine to serve them.
It is in a way similar to GitHub, which allows you to store your git repositories in the cloud (except for the lack of a dedicated user interface and of additional services such as `forks` and `pull requests`).
First, let's make sure to enable the Google Container Registry API for your project in GCP.
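You can enable it from the GCP console, or from the command line; assuming `gcloud` is already configured for your project, this should do it:

```bash
gcloud services enable containerregistry.googleapis.com
```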
Once this is done, let's allow the `docker` command to push an image to GCP.

```bash
gcloud auth configure-docker
```
Now we are going to build our image again. This should be pretty fast, since Docker is smart enough to reuse the building blocks it created previously in order to build the prediction API image.
Add a `GCR_MULTI_REGION` variable to your project configuration and set it to `eu.gcr.io`.

```bash
docker build -t $GCR_MULTI_REGION/$GCP_PROJECT_ID/$IMAGE .
```
Again, let's make sure that our image runs correctly, so that we avoid spending time pushing an image that does not work to the cloud.

```bash
docker run -e PORT=8000 -p 8000:8000 --env-file path/to/.env $GCR_MULTI_REGION/$GCP_PROJECT_ID/$IMAGE
```
Visit http://localhost:8000/ and check the API is running as expected.
We can now push our image to Google Container Registry.

```bash
docker push $GCR_MULTI_REGION/$GCP_PROJECT_ID/$IMAGE
```
The image should be visible in the GCP console.
Add a `MEMORY` variable to your project configuration and set it to `2Gi`.

👉 This will allow your container to run with 2GB of memory
❓ How does Cloud Run know the value of the environment variables to pass to your container? 💬 Discuss with your buddy.

Answer

It does not. You need to provide a list of environment variables to your container when you deploy it 😉
💻 Using the `gcloud run deploy --help` documentation, identify a parameter that allows you to pass environment variables to your container on deployment

🎁 Solution

The `--env-vars-file` is the correct one!

```bash
gcloud run deploy --env-vars-file .env.yaml
```
Tough luck, the `--env-vars-file` parameter takes as input the name of a `yaml` file containing the list of environment variables to pass to the container.
💻 Create a `.env.yaml` file containing the list of environment variables to pass to your container

You can use the provided `.env.sample.yaml` file as a source for the syntax (do not forget to update the values of the parameters).
🎁 Solution

Create a new `.env.yaml` file containing the values of your `.env` file in the `yaml` format:

```yaml
DATASET_SIZE: 10k
VALIDATION_DATASET_SIZE: 10k
CHUNK_SIZE: "2000"
```
👉 All values should be strings

❓ What is the purpose of Cloud Run?
Answer
Cloud Run will instantiate the image into a container and run the `CMD` instruction inside of the `Dockerfile` of the image. This last step will start the `uvicorn` server serving our predictive API to the world 🌍
Let's run one last command 🤞

```bash
gcloud run deploy --image $GCR_MULTI_REGION/$GCP_PROJECT_ID/$IMAGE --memory $MEMORY --region $GCR_REGION --env-vars-file .env.yaml
```
After confirmation, you should see a similar output, indicating that the service is live 🎉

```
Service name (wagon-data-tpl-image):
Allow unauthenticated invocations to [wagon-data-tpl-image] (y/N)?  y

Deploying container to Cloud Run service [wagon-data-tpl-image] in project [le-wagon-data] region [europe-west1]

✓ Deploying new service... Done.
  ✓ Creating Revision... Revision deployment finished. Waiting for health check to begin.
  ✓ Routing traffic...
  ✓ Setting IAM Policy...

Done.
Service [wagon-data-tpl-image] revision [wagon-data-tpl-image-00001-kup] has been deployed and is serving 100 percent of traffic.
Service URL: https://wagon-data-tpl-image-xi54eseqrq-ew.a.run.app
```
Any developer in the world 🌍 is now able to browse to the deployed URL and make a prediction using the API 🤖!
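For example, querying the deployed service with the same parameters as before could look like this (replace the URL with your own Service URL from the output above):

```bash
curl "https://wagon-data-tpl-image-xi54eseqrq-ew.a.run.app/predict?pickup_datetime=2013-07-06%2017:18:00&pickup_longitude=-73.950655&pickup_latitude=40.783282&dropoff_longitude=-73.984365&dropoff_latitude=40.769802&passenger_count=2"
```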
🏁 Congrats, you deployed your first ML predictive API!
You may stop (or kill) the running container...

```bash
docker stop 152e5b79177b   # ⚠️ use the correct CONTAINER ID
docker kill 152e5b79177b   # ☢️ only if the container refuses to stop (did someone create an ∞ loop?)
```
Remember to stop the Docker daemon in order to free resources on your machine once you are done using it...
macOS

Stop the `Docker.app` with Quit Docker Desktop in the menu bar.
Windows WSL2/Ubuntu
Stop the Docker app.