Skip to content

Feast GCP tutorial using BigQuery / Datastore to train / serve a driver ranking model

Notifications You must be signed in to change notification settings

feast-dev/feast-gcp-driver-ranking-tutorial

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

20 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Feast Driver Ranking Example

Overview

Making a prediction using a linear regression model is a common use case in ML. In this guide tutorial, we build the model that predicts if a driver will complete a trip based on a number of features ingested into Feast.

The basic local mode gives you ability to quickly try Feast, while the advanced mode shows how you can use Feast in a production setting, in particular for the Google Cloud Platform (GCP) cloud.

This tutorial uses Feast with Scikit Learn to

  1. Train a model locally using data from BigQuery
  2. Test the model for online inference using SQLite (for fast iteration)
  3. Test the model for online inference using Firestore (to represent production)

Prerequisites

To successfully run this tutorial, it requires that you have an account on GCP and have access to read and write permissions to BigQuery. Also, you need to install Google Cloud CLI for your localhost platform.

Tutorial

  1. Install Feast and scikit-learn
pip install feast scikit-learn 'feast[gcp]'

(This tutorial has been tested with Feast==0.11.0)

  1. Set up a local feature store (on your laptop).
cd driver_ranking/
feast apply
cd ..
  1. Train a model
python train.py
  1. Load data into your local sqlite online store
cd driver_ranking/
feast materialize-incremental 2022-01-01T00:00:00
cd ..
  1. Test your model with your local sqlite online store
python predict.py
  1. Set up your production feature store with GCP (uses Google Firestore)

Ensure that Google cloud has been configured

gcloud config set project SET_YOUR_GCP_PROJECT_HERE
gcloud auth application-default login

Change the provider field in driver_ranking/feature_store.yaml from local to gcp

Then apply and materialize data to Firestore

cd driver_ranking/
feast apply
feast materialize-incremental 2022-01-01T00:00:00
cd ..
  1. Test your model with your remote Firestore online store
python predict.py

Advanced

For production use its preferred to use a Google Cloud Storage based registry instead of a local repository. This allows multiple production systems to share the same source of truth for feature definitions.

Change feature_store.yaml to

project: driver_ranking
registry: gs://my-feature-store-bucket/registry.db
provider: gcp

Change predict.py and train.py to

self.fs = feast.FeatureStore(
    config=RepoConfig(
        project="driver_ranking",
        provider="gcp",
        registry="gs://my-feature-store-bucket/registry.db",
    )
)

About

Feast GCP tutorial using BigQuery / Datastore to train / serve a driver ranking model

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •