Dual-Deployments-on-Vertex-AI

By Chansung Park and Sayak Paul

This project demonstrates a workflow that covers dual model deployment scenarios using Kubeflow, TensorFlow Extended (TFX), and Vertex AI. We suggest reading the accompanying blog post first for an overview, and then following along with the code. This project also received the #TFCommunitySpotlight Award.

Motivation 💻

Let's say you want your users to be able to run an application in both online and offline modes. Depending on conditions such as network bandwidth and battery level, your mobile application would use an on-device TFLite model; when sufficient network coverage and bandwidth are available, it would instead use the model hosted in the cloud. This way your application stays resilient and can ensure high availability.
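The routing decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `DeviceState`, `choose_backend`, and both thresholds are hypothetical names and values, not part of this project's code, and a real application would decide which signals matter for its own use case.

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    """Hypothetical snapshot of the conditions the app checks before predicting."""
    bandwidth_mbps: float  # measured network bandwidth; 0.0 means offline
    battery_pct: float     # remaining battery, 0-100


def choose_backend(state: DeviceState,
                   min_bandwidth_mbps: float = 1.0,
                   min_battery_pct: float = 20.0) -> str:
    """Return "cloud" when conditions allow it, otherwise "on_device".

    Both thresholds are illustrative assumptions, not values from the project.
    """
    has_network = state.bandwidth_mbps >= min_bandwidth_mbps
    has_battery = state.battery_pct >= min_battery_pct
    if has_network and has_battery:
        return "cloud"      # call the model served from the cloud endpoint
    return "on_device"      # fall back to the bundled TFLite model
```

Keeping the decision in one small function means the fallback rule can be tuned or tested without touching either model.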

Sometimes we also do layered predictions, where we first divide a problem into smaller tasks:

  1. predict whether the answer is a yes or a no,
  2. depending on the output of 1), run the final model.
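As a rough sketch, the two steps above amount to a small gate in front of a larger model. The function and parameter names below (`layered_predict`, `gate_model`, `final_model`) are hypothetical, and the two models are stood in for by plain callables rather than real TFLite or cloud-hosted models.

```python
from typing import Callable, Optional


def layered_predict(
    example: object,
    gate_model: Callable[[object], bool],
    final_model: Callable[[object], str],
) -> Optional[str]:
    """Run the cheap yes/no model first; run the final model only on a "yes".

    Returns None when the gate says "no", so the final model is never invoked.
    """
    if not gate_model(example):
        return None
    return final_model(example)


# Stand-in models for illustration only.
def is_flower(image: object) -> bool:
    return image == "flower-photo"


def classify_flower(image: object) -> str:
    return "sunflower"


result = layered_predict("flower-photo", is_flower, classify_flower)
skipped = layered_predict("cat-photo", is_flower, classify_flower)
```

Here `result` holds the final model's label, while `skipped` is `None` because the gate rejected the input before the final model ran.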

In these cases, 1) takes place on-device and 2) takes place in the cloud to ensure a smooth UX. Furthermore, it's good practice to use a mobile-friendly network architecture (such as MobileNet) for mobile deployments. This leads us to the following question:

Can we train two different models within the same deployment pipeline and manage them seamlessly?

This project is motivated by this question.

AutoML, TFX, etc. 🛠

Different organizations have people with varied technical backgrounds. We wanted to provide the easiest solution first and then move on to something more customizable. To this end, we leverage Kubeflow's AutoML SDKs to build, train, and deploy models for different production use cases. With AutoML, developers can delegate a large part of their workflows to the SDKs, and the codebase stays comparatively small. The figure below depicts a sample system architecture for this scenario:

Figure developed by Chansung Park.

But the story does not end here. What if we want better control over how the models are built, trained, and deployed? Enter TFX! TFX provides the flexibility to write custom components and include them in a pipeline. This way, Machine Learning Engineers can focus on building and training their favorite models and delegate part of the heavy lifting to TFX and Vertex AI. On Vertex AI (acting as an orchestrator), this pipeline will look like so:

🔥 In this project, we cover both of these situations.

Code 🆘

Our code is distributed as Colab Notebooks. To run them successfully, you need a billing-enabled GCP account with a few APIs enabled. Alternatively, you can run the notebooks on Vertex AI Notebooks. Find all the notebooks and their descriptions here: notebooks.

Additionally, you can find the custom TFX components separately here: custom_components.

Acknowledgements

We thank the ML-GDE program for providing GCP credits, and Karl Weinmeister and Robert Crowe for providing review feedback on this project.