juan-garassino/deepCab

Objective

Use FastAPI in order to create an API for your model.

Run that API on your machine. Then put it in production.

Context

Now that we have a performant model trained in the cloud, we will expose it to the world 🌍

We will create a Prediction API for our model, run it on our machine in order to make sure that everything works correctly. Then we will deploy it in the cloud so that everyone can play with our model!

In order to do so, we will:

  • Challenge 1: create a Prediction API using FastAPI
  • Challenge 2: create a Docker image containing the environment required in order to run the code of our API
  • Challenge 3: push this image to Google Cloud Run so that it is instantiated as a Docker container that will run our code and allow developers all over the world to use it

Project setup

Environment

Copy your .env file from the previous package version:

cp ~/<user.github_nickname>/{{local_path_to('07-MLOps/03-Automate-model-lifecycle/01-Automate-model-lifecycle')}}/.env .env

OR

Use the provided env.sample, replacing the environment variable values with your own.

API directory

A new taxifare/api directory has been added to the project to contain the code of the API, along with 2 new configuration files in the challenge project directory:

.
├── Dockerfile          # 🎁 NEW Building instructions
├── MANIFEST.in         # 🎁 NEW Config file for production purposes
├── Makefile            # Good old task manager
├── README.md
├── requirements.txt    # All the dependencies you need to run the package
├── setup.py            # Package installer
├── taxifare
│   ├── api             # 🎁 NEW API directory
│   │   ├── __init__.py
│   │   └── fast.py     # 🎁 NEW Where the API lives
│   ├── data_sources    # Data stuff
│   ├── flow            # DAG stuff
│   ├── interface       # Package entry point
│   └── ml_logic        # ML stuff
└── tests

Now, have a look at the requirements.txt. You can see some newcomers:

# API
fastapi         # API framework
pytz            # Timezones management
uvicorn         # Web server
# tests
httpx           # HTTP client
pytest-asyncio  # Asynchronous I/O support for pytest

⚠️ Make sure to perform a clean install of the package.

❓ How?

make reinstall_package of course 😉

Running the API with FastAPI and a Uvicorn server

We provide you with a FastAPI skeleton in the fast.py file.

💻 Launch the API

💡 Hint

You probably need a uvicorn web server..., with 🔥 hot-reloading enabled...

In case you can't find the proper syntax, keep calm and look at your Makefile: we provided you with a new task, run_api.

❓ How do you access your running API?

Answer

💡 Your API is available on a local port, probably 8000 👉 http://localhost:8000. Go visit it!

You have probably not seen much.

❓ Which endpoints are available?

Answer

There is only one endpoint partially implemented at the moment: the root endpoint /. The unimplemented root page is a little raw, but remember you can always find more info about the API using the Swagger endpoint 👉 http://localhost:8000/docs

Build the API

An API is defined by its specifications (see the GitHub Repositories API for an example). You will find below the specifications of the API you need to implement.

Specifications

Root

  • GET /
  • Response Status: 200
{
    'greeting': 'Hello'
}

💻 Implement the Root endpoint /

👀 Look at your browser 👉 http://localhost:8000

🐛 Inspect the server logs and add some breakpoint() calls to debug

Once and only once your API responds as required: 🧪 Test your implementation with make test_api_root
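
Curious what such a test can look like? Here is a rough sketch using FastAPI's TestClient; the actual tests shipped under tests/ may be organized differently:

```python
# Hypothetical example: the real tests live under tests/ and may differ
from fastapi.testclient import TestClient

from taxifare.api.fast import app

client = TestClient(app)

def test_root_returns_greeting():
    response = client.get("/")
    assert response.status_code == 200
    assert response.json() == {"greeting": "Hello"}
```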

🚀 Commit and push your code!

Prediction

  • GET /predict
  • Query parameters

    | Name              | Type     | Sample              |
    |-------------------|----------|---------------------|
    | pickup_datetime   | DateTime | 2013-07-06 17:18:00 |
    | pickup_longitude  | float    | -73.950655          |
    | pickup_latitude   | float    | 40.783282           |
    | dropoff_longitude | float    | -73.984365          |
    | dropoff_latitude  | float    | 40.769802           |
    | passenger_count   | int      | 2                   |
  • Response Status 200
  • Code sample
    GET http://localhost:8000/predict?pickup_datetime=2013-07-06 17:18:00&pickup_longitude=-73.950655&pickup_latitude=40.783282&dropoff_longitude=-73.984365&dropoff_latitude=40.769802&passenger_count=2
    Example response:
    {
        'fare_amount': 5.93
    }

❓ How would you proceed to implement the /predict endpoint? 💬 Discuss with your buddy.

⚡️ Kickstart pack

Here is a piece of code you can use to kickstart the implementation:
```python
@app.get("/predict")
def predict(pickup_datetime: datetime,  # 2013-07-06 17:18:00
            pickup_longitude: float,    # -73.950655
            pickup_latitude: float,     # 40.783282
            dropoff_longitude: float,   # -73.984365
            dropoff_latitude: float,    # 40.769802
            passenger_count: int):
    pass # YOUR CODE HERE
```
💡 Hints

Ask yourselves the following questions:

  • How should we handle the query parameters?
  • How can we re-use the taxifare model package?
  • How should we build X_pred? What does it look like?
  • How should we render the correct response?

⚙️ Configuration

Have you put a trained model in Production in mlflow? If not, you can use the following configuration:

``` bash
MODEL_TARGET=mlflow
MLFLOW_TRACKING_URI=https://mlflow.lewagon.ai
MLFLOW_EXPERIMENT=taxifare_experiment_krokrob
MLFLOW_MODEL_NAME=taxifare_krokrob
```
πŸ” Food for thought
  1. Investigate the data types of the query parameters; you may need to convert them into the types the model requires
  2. Of course, you should re-use the taxifare.interface.main.pred() or the taxifare.ml_logic.registry.load_model() functions!
  3. In order to make a prediction with the trained model, you must provide a valid X_pred, but the key is missing!
  4. FastAPI can only render data types from the Python Standard Library, so you may need to convert y_pred to match this requirement
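
Putting these hints together, here is one possible shape for the endpoint. It is only a sketch: the import path, the pred() signature, the column names expected by the preprocessing pipeline and the placeholder value for key are assumptions to verify against your own package:

```python
from datetime import datetime

import pandas as pd
from fastapi import FastAPI

# Assumption: adjust this import to your package layout (interface.main exposing pred())
from taxifare.interface.main import pred

app = FastAPI()

@app.get("/predict")
def predict(pickup_datetime: datetime,  # 2013-07-06 17:18:00
            pickup_longitude: float,    # -73.950655
            pickup_latitude: float,     # 40.783282
            dropoff_longitude: float,   # -73.984365
            dropoff_latitude: float,    # 40.769802
            passenger_count: int):      # 2
    # If your pipeline expects timezone-aware timestamps, localize pickup_datetime
    # (e.g. with pytz) before building X_pred.

    # Rebuild X_pred with the same columns (and dtypes) as the training data.
    # The 'key' column is expected by the preprocessing even though its value
    # does not influence the prediction (the placeholder below is an assumption).
    X_pred = pd.DataFrame(dict(
        key=[str(pickup_datetime)],
        pickup_datetime=[pickup_datetime],
        pickup_longitude=[pickup_longitude],
        pickup_latitude=[pickup_latitude],
        dropoff_longitude=[dropoff_longitude],
        dropoff_latitude=[dropoff_latitude],
        passenger_count=[passenger_count],
    ))

    # Reuse the package logic (assumption: pred() accepts an X_pred DataFrame
    # and returns a single, scalar-like fare prediction).
    y_pred = pred(X_pred)

    # FastAPI can only serialize standard Python types, so cast the numpy value.
    return {"fare_amount": float(y_pred)}
```

If you prefer loading the model yourself with taxifare.ml_logic.registry.load_model(), load it once at startup (not inside the request handler) and run your preprocessing on X_pred before calling the model.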

👀 Inspect your browser response 👉 http://localhost:8000/predict?pickup_datetime=2013-07-06%2017:18:00&pickup_longitude=-73.950655&pickup_latitude=40.783282&dropoff_longitude=-73.984365&dropoff_latitude=40.769802&passenger_count=2

🐛 Inspect the server logs and add some breakpoint() calls to debug

Once and only once your API responds as required: 🧪 Test your implementation with make test_api_predict

🚀 Commit and push your code!

👏 Congrats, you built your first ML predictive API!

Build a Docker image for our API

We now have a working predictive API which can be queried from our local machine.

We want to make it available to the world. In order to do that, the first step is to create a Docker image that contains the environment required to run the API and make it run locally on Docker.

❓ What are the 3 steps to run the API on Docker?

Answer
  1. Create a Dockerfile containing the instructions to build the API
  2. Build the image locally on Docker
  3. Run the API on Docker locally to check it is responding as required

Setup

You need the Docker daemon to be running on your machine so that you are able to build and run the image locally.

💻 Launch Docker daemon

MacOSX

Launch the Docker Desktop app, you should see a whale in your menu bar.

verify that Docker Desktop is running

Windows WSL2 & Ubuntu

Launch the Docker app.

verify that Docker Desktop is running

✅ Check the Docker daemon is up and running with docker info in your terminal

A nice stack of logs should print out.

Dockerfile

As a reminder, here is the project directory structure:

.
├── Dockerfile          # 👉 Building instructions
├── MANIFEST.in         # 🆕 Config file for production purposes
├── Makefile            # Good old task manager
├── README.md           # Package documentation
├── requirements.txt    # All the dependencies you need to run the package
├── setup.py            # Package installer
├── taxifare
│   ├── api             # ✅ API directory
│   │   ├── __init__.py
│   │   └── fast.py     # ✅ Where the API lives
│   ├── data_sources    # Data stuff
│   ├── flow            # DAG stuff
│   ├── interface       # Package entry point
│   └── ml_logic        # ML logic
└── tests               # Your favorite 🔍

❓ What are the key ingredients a Dockerfile needs to cook a delicious Docker image?

Answer

Here are the most common instructions found in a good Dockerfile:

  • FROM: select a base image for our image (the environment in which we will run our code); this is usually the first instruction
  • COPY: copy files and directories into our image (our package and the associated files, for example)
  • RUN: execute a command inside of the image being built (for example, install the package dependencies)
  • CMD: the main command that will be executed when a container is started from our image. There can be only one CMD instruction in a Dockerfile; it is usually the last one

❓ What should the base image contain so that we can build our image on top of it?

💡 Hints

Choosing an image with Python already installed could be a nice start...

💻 Write the instructions needed to build the API image in the Dockerfile with the following specifications:

  • ✅ it should contain the same Python version as your virtual env
  • ✅ it should contain the necessary directories from the /taxifare directory to allow the API to run
  • ✅ it should contain the dependencies list
  • ✅ the API dependencies should be installed
  • ✅ the web server should be launched when the container is started from the image
  • ✅ the web server should listen to the HTTP requests coming from outside the container (cf. host parameter)
  • ✅ the web server should be able to listen to a specific port defined by an environment variable $PORT (cf. port parameter)
⚡️ Kickstart pack

Here is the skeleton of the Dockerfile:

FROM image
COPY taxifare
COPY dependencies
RUN install dependencies
CMD launch API web server
🚨 Apple Silicon users, expand me and read carefully

You will not be able to test your container locally with the tensorflow package since the current version does not install properly on Apple Silicon machines.

The solution is to use one image to test your code locally and another one to push your code to production.

👉 Refer to the commands in the Dockerfile_silicon file in order to build and test your image locally, and to build and deploy your image to production

❓ How would you check if the Dockerfile instructions will execute what you wanted?

Answer

You can't at this point! 😁 You need to build the image and check if it contains everything required to run the API. Go to the next section: Build the API image.

Build the API image

Now is the time to build the API image with Docker so you can check that it satisfies the requirements and run it on Docker.

❓ How do you build an image with Docker?

βš™οΈ Configuration

You may add a variable to your project configuration for the docker image name. You will be able to reuse it in the docker commands:

IMAGE=image-name
Answer

Make sure you are in the directory of the Dockerfile, then:

docker build --tag=$IMAGE .

💻 Choose a meaningful name for the API image, then build it

Once built, the image should be visible in the list of images built with the following command:

docker images

πŸ•΅οΈβ€β™€οΈ The image you are looking for does not appear in the list? Ask for help πŸ™‹β€β™‚οΈ

Check the API image

Now that the image is built, let's check that it satisfies the specifications needed to run the predictive API. Docker comes with a handy command to interactively communicate with the shell of the image:

docker run -it -e PORT=8000 -p 8000:8000 $IMAGE sh
🤖 Decrypt
  • docker run $IMAGE runs the image
  • -it enables the interactive mode
  • -e PORT=8000 sets the environment variable $PORT the image should listen to
  • -p 8000:8000 maps the container port 8000 to the local port 8000
  • sh launches a shell console

A shell console should open; you are now inside the image 👍.

💻 Check the image is correctly set up:

  • ✅ The Python version is the same as in your virtual env
  • ✅ Presence of the /taxifare directory
  • ✅ Presence of the requirements.txt
  • ✅ The dependencies are all installed
🙈 Solution
  • python --version to check the Python version
  • ls to check the presence of the files and directories
  • pip list to check the requirements are installed

Exit the terminal and stop the container at any moment with:

exit

✅ ❌ All good? If something is missing, you will probably need to fix your Dockerfile and rebuild the image.

Run the API image

In the previous section you learned how to interact with the image shell. Now is the time to run the predictive API image and test if the API responds as it should.

💻 Run the image

💡 Hints

You should probably remove the interactive mode and drop the sh command...

πŸ› Unless you find the correct command to run the image, it is probably crashing with errors involving environment variable.

❓ What is the difference between your local environment and image environment? πŸ’¬ Discuss with your buddy.

Answer

There is no .env in the image!!! The image has no access to the environment variables 😈

💻 Using the docker run --help documentation, adapt the run command so that the .env is passed to the image

🙈 Solution

The --env-file parameter to the rescue!

docker run -e PORT=8000 -p 8000:8000 --env-file path/to/.env $IMAGE

❓ How would you check that the image runs correctly?

💡 Hints

The API should respond in your browser, go visit it!

You can also check that the container is running with:

docker ps

It's Alive! 😱 🎉

👀 Inspect your browser response 👉 http://localhost:8000/predict?pickup_datetime=2013-07-06%2017:18:00&pickup_longitude=-73.950655&pickup_latitude=40.783282&dropoff_longitude=-73.984365&dropoff_latitude=40.769802&passenger_count=2

👏 Congrats, you built your first ML predictive API inside a Docker container!

Deploy the API

Now that we have built a predictive API Docker image that we are able to run on our local machine, we are 2 steps away from deploying it:

  • Push the Docker image to Google Container Registry
  • Deploy the image on Google Cloud Run so that it gets instantiated into a Docker container

Lightweight image

As a responsible ML Engineer, you know that the size of an image matters when it comes to production. Depending on the base image you chose in your Dockerfile, the API image could be huge:

  • python:3.8.12-buster 👉 3.9GB
  • python:3.8.12-slim 👉 3.1GB
  • python:3.8.12-alpine 👉 3.1GB

❓ What is the heaviest requirement used by your API?

Answer

No doubt it is tensorflow with 1.1GB! You need to find a base image that is optimized for it.

πŸ“ Change your base image

πŸ’» Build and run a lightweight local image of your API

βœ… Make sure the API is still up and running

πŸ‘€ Inspect the space saved with docker images and feel happy

Hints

You may want to use a tensorflow docker image.

Push our prediction API image to Google Container Registry

❓ What is the purpose of Google Container Registry?

Answer

Google Container Registry is a service storing Docker images on the cloud with the purpose of allowing Cloud Run or Kubernetes Engine to serve them.

It is in a way similar to GitHub allowing you to store your git repositories in the cloud (except for the lack of a dedicated user interface and additional services such as forks and pull requests).

Setup

First, let's make sure to enable the Google Container Registry API for your project in GCP.

Once this is done, let's allow the docker command to push an image to GCP.

gcloud auth configure-docker

Build and push the image on GCR

Now we are going to build our image again. This should be pretty fast since Docker is pretty smart and is going to reuse all the building blocks used previously in order to build the prediction API image.

Add a GCR_MULTI_REGION variable to your project configuration and set it to eu.gcr.io.

docker build -t $GCR_MULTI_REGION/$GCP_PROJECT_ID/$IMAGE .

Again, let's make sure that our image runs correctly, so that we avoid wasting time pushing a broken image to the cloud.

docker run -e PORT=8000 -p 8000:8000 --env-file path/to/.env $GCR_MULTI_REGION/$GCP_PROJECT_ID/$IMAGE

Visit http://localhost:8000/ and check the API is running as expected.

We can now push our image to Google Container Registry.

docker push $GCR_MULTI_REGION/$GCP_PROJECT_ID/$IMAGE

The image should be visible in the GCP console.

Deploy the Container Registry image to Google Cloud Run

Add a MEMORY variable to your project configuration and set it to 2Gi.

👉 This will allow your container to run with 2GB of memory

❓ How does Cloud Run know the value of the environment variables to pass to your container? 💬 Discuss with your buddy.

Answer

It does not. You need to provide a list of environment variables to your container when you deploy it 😈

💻 Using the gcloud run deploy --help documentation, identify the parameter that allows you to pass environment variables to your container on deployment

🙈 Solution

The --env-vars-file is the correct one!

gcloud run deploy --env-vars-file .env.yaml

Tough luck: the --env-vars-file parameter takes as input the name of a YAML file containing the list of environment variables to pass to the container.

💻 Create a .env.yaml file containing the list of environment variables to pass to your container

You can use the provided .env.sample.yaml file as a source for the syntax (do not forget to update the values of the parameters).

🙈 Solution

Create a new .env.yaml file containing the values of your .env file in YAML format:

DATASET_SIZE: 10k
VALIDATION_DATASET_SIZE: 10k
CHUNK_SIZE: "2000"

👉 All values should be strings

❓ What is the purpose of Cloud Run?

Answer

Cloud Run will instantiate the image into a container and run the CMD instruction inside of the Dockerfile of the image. This last step will start the uvicorn server serving our predictive API to the world 🌍

Let's run one last command 🤞

gcloud run deploy --image $GCR_MULTI_REGION/$GCP_PROJECT_ID/$IMAGE --memory $MEMORY --region $GCR_REGION --env-vars-file .env.yaml

After confirmation, you should see a similar output indicating that the service is live 🎉

Service name (wagon-data-tpl-image):
Allow unauthenticated invocations to [wagon-data-tpl-image] (y/N)?  y

Deploying container to Cloud Run service [wagon-data-tpl-image] in project [le-wagon-data] region [europe-west1]
✓ Deploying new service... Done.
  ✓ Creating Revision... Revision deployment finished. Waiting for health check to begin.
  ✓ Routing traffic...
  ✓ Setting IAM Policy...
Done.
Service [wagon-data-tpl-image] revision [wagon-data-tpl-image-00001-kup] has been deployed and is serving 100 percent of traffic.
Service URL: https://wagon-data-tpl-image-xi54eseqrq-ew.a.run.app

Any developer in the world 🌍 is now able to browse to the deployed URL and make a prediction using the API 🤖!
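
For example, a quick way to query the deployed service from Python with httpx (replace the URL below with your own Service URL):

```python
import httpx

# Replace with the Service URL printed by gcloud run deploy
url = "https://wagon-data-tpl-image-xi54eseqrq-ew.a.run.app/predict"

params = dict(
    pickup_datetime="2013-07-06 17:18:00",
    pickup_longitude=-73.950655,
    pickup_latitude=40.783282,
    dropoff_longitude=-73.984365,
    dropoff_latitude=40.769802,
    passenger_count=2,
)

response = httpx.get(url, params=params)
print(response.json())  # e.g. {'fare_amount': 5.93}
```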

⚠️ Keep in mind that you pay for the service as long as it is up 💸

πŸ‘ Congrats, you deployed your first ML predictive API!

Once you are done with Docker...

You may stop (or kill) the container...

docker stop 152e5b79177b  # ⚠️ use the correct CONTAINER ID
docker kill 152e5b79177b  # ☢️ only if the container refuses to stop (did someone create an ∞ loop?)

Remember to stop the Docker daemon in order to free resources on your machine once you are done using it...

MacOSX

Stop the Docker.app with Quit Docker Desktop in the menu bar.

Windows WSL2/Ubuntu

Stop the Docker app.
