diff --git a/blog_post.html b/blog_post.html index 5ed686b..5c34496 100644 --- a/blog_post.html +++ b/blog_post.html @@ -3093,74 +3093,74 @@

How Twilio used Amazon SageMaker MLOps Pipelines with PrestoDB -

Amit Arora, Madhur Prashant, Antara Raisa

+

Amit Arora, Madhur Prashant, Antara Raisa, Johnny Chivers

This post is co-written with customer_names from Twilio.

-

Machine learning (ML) models do not operate in isolation. To deliver value, they must integrate into existing production systems and infrastructure, which necessitates considering the entire ML lifecycle during design and development. ML operations, known as MLOps, focus on streamlining, automating, and monitoring ML models throughout their lifecycle. Building a robust MLOps pipeline demands cross-functional collaboration. Data scientists, ML engineers, IT staff, and DevOps teams must work together to operationalize models from research to deployment and maintenance. With the right processes and tools, MLOps enables organizations to reliably and efficiently adopt ML across their teams.

-

Amazon SageMaker MLOps is a suite of features that includes Amazon SageMaker Projects (CI/CD), Amazon SageMaker Pipelines and Amazon SageMaker Model Registry.

+

Machine learning (ML) models do not operate in isolation. To deliver value, they must integrate into existing production systems and infrastructure, which necessitates considering the entire ML lifecycle during design and development. ML operations, known as MLOps, focus on streamlining, automating, and monitoring ML models throughout their lifecycle. Building a robust MLOps pipeline demands cross-functional collaboration. Data scientists, ML engineers, IT staff, and DevOps teams must work together to operationalize models from research to deployment and maintenance. With the right processes and tools, MLOps enables organizations to reliably and efficiently adopt ML across their teams for their specific use cases.

+

Amazon SageMaker MLOps is a suite of features that includes Amazon SageMaker Projects (CI/CD), Amazon SageMaker Pipelines and Amazon SageMaker Model Registry. In this blog post, we will discuss SageMaker Pipelines and Model Registry.

SageMaker Pipelines allows for straightforward creation and management of ML workflows, while also offering storage and reuse capabilities for workflow steps. The SageMaker Model Registry centralizes model tracking, simplifying model deployment.

This blog post focuses on giving AWS customers the flexibility to use their data source of choice and integrate it seamlessly with Amazon SageMaker Processing jobs, which provide a simplified, managed experience for running data pre- or post-processing and model evaluation workloads on the Amazon SageMaker platform.

-

Twilio needed to implement an MLOPs pipeline with their customer data stored within PrestoDB. PrestoDB is an open-source SQL query engine that is designed for fast analytic queries against data of any size.

+

Twilio needed to implement an MLOps pipeline that queries data from PrestoDB as part of the process. PrestoDB is an open-source SQL query engine designed for fast analytic queries against data of any size from multiple sources.

In this post, we show you a step-by-step implementation to achieve the following:

Use case overview

-

Twilio is an american cloud communications company, based in San Francisco, California and provides programmable communication tools for making and receiving phone calls, sending and receiving text messages, and performing other communication functions using its web service APIs. Being one of the largest strategic AWS customers, Twilio engages with Data and AI/ML servives to run their daily workloads. This blog resolves around the steps AWS and Twilio took to migrate Twilio’s MLOps, implementation of training models and running batch inferences (that are able to detect burner accounts based on unusual user activity) to Amazon SageMaker.

-

Burners are phone numbers which are available online to everyone and are used to hide identities by creating fake accounts on our customers’ apps/websites. Twilio built an effective data and machine learning operations pipeline to detect these anomaly phone numbers with the help of a binary classification model using the scikit-learn RandomForestClassifier. The training data they used for this pipeline is available via PrestoDB tables and is read into Pandas through the PrestoDB Python client. This data is then read into an Apache Spark dataframe (although the model training happens only using the data in the Pandas dataframe).

-

The end goal was to convert all the existing steps into three sub solutions utilizing SageMaker Pipelines: a training pipeline, batch inference pipeline and finally, to deploy the trained model on an Amazon SageMaker Endpoint for real-time inference.

+

Twilio is an American cloud communications company based in San Francisco, California. It provides programmable communication tools for making and receiving phone calls, sending and receiving text messages, and performing other communication functions through its web service APIs. As one of the largest AWS customers, Twilio engages with AWS Data and AI/ML services to run its daily workloads. This blog describes the steps AWS and Twilio took to migrate Twilio’s existing MLOps implementation, including training models and running batch inferences that detect burner accounts based on unusual user activity, to Amazon SageMaker.

+

Burners are phone numbers that are available online to everyone and are used to hide identities by creating fake accounts on customers’ apps and websites. Twilio built a data and machine learning operations pipeline to detect these anomalous phone numbers with a binary classification model using the scikit-learn RandomForestClassifier. The training data for this pipeline is made available via PrestoDB tables and is read into Pandas through the PrestoDB Python client. The data is then read into an Apache Spark dataframe for further analysis and machine learning operations.

+

The end goal was to convert all the existing steps into a three-part solution built on SageMaker Pipelines that enables more frequent model re-training and optimized batch processing, while letting customers take advantage of flexible data access via an open-source SQL query engine: 1/ implement a training pipeline, 2/ implement a batch inference pipeline (by connecting a SageMaker Processing job to data queried from Presto), and 3/ deploy the trained model on a SageMaker endpoint for real-time inference.

For the proof of concept, we used the TPCH-Connector as our choice of open-source data (this lets users test Presto’s capabilities and query syntax without needing to configure access to an external data source). Using this solution, Twilio successfully migrated to SageMaker Pipelines; the open-source implementation is published on the aws-samples GitHub organization: mlops-pipeline-prestodb.

Solution overview

-

The solution presented provides an implementation for training a machine learning model and running batch inference on Amazon SageMaker. The goal is to enable more frequent model re-training, optimized batch processing, and even add additional capabilities such as making the model available as a real-time endpoint. This solution also provides a design pattern built on AWS best practices that can be replicated for other ML workloads with minimal overhead. This is divided into three main steps: training pipeline, batch inference pipeline, and an implementation of real time inference support for the choice of maching learning model.

-

We have divided the solution into the following main components that are open sourced and can be run through simple config file updates.

-

This solution includes the following components:

+

The solution presented provides an implementation for training a machine learning model and running batch inference on Amazon SageMaker using data fetched from a PrestoDB table. It also provides a design pattern built on AWS best practices that can be replicated for other ML workloads with minimal overhead. The solution is divided into three main parts: a training pipeline, a batch inference pipeline, and an implementation of real-time inference support for the machine learning model of choice.

+

This solution is now open source and can be run through simple config file updates. For a walkthrough of the config.yml file, view this link.

+

This solution includes the following steps:

Solution design

-

The solution design consists of three parts - Setting up the data preparation and training pipeline, preparing for the batch transform step, and deploying the approved model of choice as a real time SageMaker endpoint for inference. All of these parts utilize information from a single config.yml file, which includes the AWS and Presto credential information, Individial step pipeline parameters for the data preprocessing, training, tuning, model evaluation, model registeration and real time endpoint steps of this solution. This configuration file is highly customizable for the user to use and run the solution end to end with minimal-no code changes.

-

The main components of this solution are as describe below:

-
-

Data Preparation and Training Pipeline Step:

+

The solution design consists of the following parts: setting up the data preparation and training pipeline, preparing for the batch transform step, and deploying the approved model of choice as a real-time SageMaker endpoint for inference. All of these parts use information from a single config.yml file, which includes the AWS and Presto credential information needed to connect to a Presto server on an EC2 instance, as well as individual step pipeline parameters for the data preprocessing, training, tuning, model evaluation, model registration, and real-time endpoint steps of this solution. This configuration file is highly customizable, so the user can run the solution end to end with minimal to no code changes.

+

The main components of this solution are described in detail below:

+
+

Part 1 - Data Preparation and Training Pipeline Step:

    -
  1. The training data is read from PrestoDB and any feature engineering needed is done as part of the SQL queries run in PrestoDB at retrieval time.
  2. -
  3. We use a FrameworkProcessorwith SageMaker Processing Jobs to read data from PrestoDB using the Python PrestoDB client. The existing code for reading data from PrestoDB include the queries used remains unchanged from the current implementation.
  4. -
  5. For the training and tuning step, we use the SKLearn estimator from SageMaker SDK and the RandomForestClassifier from scikit-learn to train the ML model. The HyperparameterTunerclass is used for running automatic model tuning to determine the set of hyperparameters that provide the best performance (maximize the AUC metric).
  6. +
  7. The training data is read from a PrestoDB server started on an EC2 instance, and any feature engineering needed is done as part of the SQL queries run in PrestoDB at retrieval time. The queries used to fetch data in the training and batch inference steps can be configured in the config file here.
  8. +
  9. We use a FrameworkProcessor with SageMaker Processing Jobs to read data from PrestoDB using the Python PrestoDB client (see the sketch after this list).
  10. +
  11. For the training and tuning step, we use the SKLearn estimator from the SageMaker SDK and the RandomForestClassifier from scikit-learn to train the ML model. The HyperparameterTuner class is used for running automatic model tuning to determine the set of hyperparameters that provide the best performance for a given use case (for example, maximizing the AUC metric).
  12. The model evaluation step is to check that the trained and tuned model has an accuracy level above a configurable threshold and only then register the model with the model registry (from where it can be subsequently approved and deployed). If the model accuracy does not meet a configured threshold then the pipeline fails and the model is not registered with the model registry.
  13. The model training pipeline is then run with Pipeline.start, which triggers and instantiates all of the steps above.
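To make the data access concrete, here is a minimal sketch of how a processing script might query PrestoDB with the presto-python-client package and load the results into a Pandas dataframe. The host, port, user, catalog, schema, and query below are illustrative assumptions; in the solution these values come from the config file.

import pandas as pd
import prestodb

# connect to the Presto server (host, port, user, catalog and schema are illustrative)
conn = prestodb.dbapi.connect(
    host="presto.example.com",
    port=8080,
    user="sagemaker",
    catalog="tpch",
    schema="tiny",
)
cursor = conn.cursor()
cursor.execute("SELECT o.orderkey, o.totalprice, o.orderpriority FROM orders o LIMIT 100")
rows = cursor.fetchall()

# build a dataframe using the column names reported by the cursor
df = pd.DataFrame(rows, columns=[col[0] for col in cursor.description])
print(df.head())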
-
-

Batch Transform Step:

+
+

Part 2 - Batch Transform Step:

    -
  1. The batch inference pipeline consists of two steps: a data preparation step that retrieves the data from the PrestoDB and stores it in S3 (same implementation as in the training pipeline mentioned above) and a batch transform step that runs inference on this data stored in S3 and stores the output data in S3.
  2. +
  3. The batch inference pipeline consists of two steps: a data preparation step that retrieves the data from PrestoDB (using a batch data preprocess script that connects to and fetches data from the Presto server deployed on EC2) and stores it in S3 (the same implementation as in the training pipeline mentioned above). After this, a batch transform step runs inference on the data stored in S3 and stores the output data in S3.
  4. In this step, we utilize the Transformer instance and the TransformInput with the batch_data pipeline parameter defined (see the sketch below).
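As a reference, here is a minimal sketch of what the transformer and TransformStep for this batch inference step could look like. The config keys, the model_name variable, and the content-type settings are illustrative assumptions; the actual values in the solution come from the config file and from the model created from the latest approved model package.

from sagemaker.transformer import Transformer
from sagemaker.inputs import TransformInput
from sagemaker.workflow.steps import TransformStep

# transformer that runs batch inference with the model created earlier in the pipeline
transformer = Transformer(
    model_name=model_name,
    instance_count=config['transform_step']['instance_count'],
    instance_type=config['transform_step']['instance_type'],
    accept="text/csv",
    assemble_with="Line",
    output_path=config['transform_step']['output_path'],
)

# the batch_data pipeline parameter points to the S3 prefix written by the data preparation step
step_transform = TransformStep(
    name=config['transform_step']['step_name'],
    transformer=transformer,
    inputs=TransformInput(
        data=batch_data,
        content_type="text/csv",
        split_type="Line",
    ),
)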
-
-

Real Time SageMaker endpoint support:

+
+

Part 3 - Real Time SageMaker endpoint support:

  1. The latest approved model from the model registry is deployed as a real-time endpoint.
  2. -
  3. The latest approved model is retrieved from the registry using the describe_model_package function from the SageMaker SDK.
  4. -
  5. The model is deployed on a ml.c5.xlarge instances with a minimum instance count of 1 and maximum instance count of 3 (configurable by the user) and automatic scaling policy configured.
  6. +
  7. The latest approved model is retrieved from the registry using the describe_model_package function from the SageMaker SDK (see the sketch after this list).
  8. +
  9. The model is deployed on an ml.c5.xlarge instance with a minimum instance count of 1 and a maximum instance count of 3 (both configurable by the user) and an automatic scaling policy configured.
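For reference, a minimal sketch of how the latest approved model can be located in the model registry with the boto3 SageMaker client is shown below; the model package group name is an illustrative assumption.

import boto3

sm_client = boto3.client("sagemaker")

# list approved model packages in the group, newest first (group name is illustrative)
packages = sm_client.list_model_packages(
    ModelPackageGroupName="mlops-pipeline-prestodb-model-group",
    ModelApprovalStatus="Approved",
    SortBy="CreationTime",
    SortOrder="Descending",
    MaxResults=1,
)["ModelPackageSummaryList"]

latest_model_package_arn = packages[0]["ModelPackageArn"]
model_package = sm_client.describe_model_package(ModelPackageName=latest_model_package_arn)
print(model_package["InferenceSpecification"]["Containers"][0]["ModelDataUrl"])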

Prerequisites

-

To implement the solution provided in this post, you should have an AWS account and familarity with SageMaker, Amazon S3, and PrestoDB.

+

To implement the solution provided in this post, you should have an AWS account and familiarity with SageMaker, S3, and PrestoDB.

The following prerequisites need to be in place before running this code.

PrestoDB

-

We will use the built-in datasets available in PrestoDB for this repo. Following the instructions below to setup PrestoDB on an Amazon EC2 instance in your account. If you already have access to a PrestoDB instance then you can skip this section but keep its connection details handy (see the presto section in the config file).

+

We will use the built-in datasets available in PrestoDB for this repo. Follow the instructions below to set up PrestoDB on an Amazon EC2 instance in your account. If you already have access to a PrestoDB instance, you can skip this section but keep its connection details handy (see the presto section in the config file).

  1. Create a security group to limit access to Presto. Create a security group called MyPrestoSG with two inbound rules that only allow access to Presto (a sketch of this step is shown below).
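A minimal boto3 sketch of this step is shown below; the VPC ID, source CIDR range, and the choice of ports (SSH on 22 for administration and 8080 for the Presto HTTP endpoint) are illustrative assumptions that you should adapt to your own environment.

import boto3

ec2 = boto3.client("ec2")

# create the security group in your VPC (VPC ID is illustrative)
sg = ec2.create_security_group(
    GroupName="MyPrestoSG",
    Description="Limit access to the Presto server",
    VpcId="vpc-0123456789abcdef0",
)

# two inbound rules: SSH for administration and the Presto HTTP port
ec2.authorize_security_group_ingress(
    GroupId=sg["GroupId"],
    IpPermissions=[
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "203.0.113.0/24"}]},
        {"IpProtocol": "tcp", "FromPort": 8080, "ToPort": 8080,
         "IpRanges": [{"CidrIp": "203.0.113.0/24"}]},
    ],
)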

      @@ -3259,36 +3259,38 @@

      Steps to run

Testing the solution

-

Once the prerequisite steps are complete and the config.yml file is set up correctly, we are now ready to run the “mlops-pipeline-prestodb” implementation:

+

Once the prerequisites and setup are complete and the config.yml file is set up correctly, we are ready to run the mlops-pipeline-prestodb implementation. Follow the steps below or access the GitHub repository to walk through the solution:

  1. On the SageMaker console, or in your IDE of choice, choose 0_model_training_pipeline.ipynb in the navigation pane. When the notebook is open, on the Run menu, choose Run All Cells to run the code in this notebook. This notebook demonstrates how SageMaker Pipelines can be used to string together a sequence of data processing, model training, tuning, and evaluation steps to train a binary classification machine learning model using scikit-learn. The trained model can then be used for batch inference, or hosted on a SageMaker endpoint for real-time inference.

      -
    • Preprocess data step: In this step of the notebook, we set our pipeline input parameters when triggering our pipeline execution. We use a preprocess script which is read to connect to presto and query data, that is then sent to an Amazon S3 bucket split into train, test and validation datasets. Using these files, this step can then use the data for training the model.

      +
    • Preprocess data step: In this step of the notebook, we set our pipeline input parameters when triggering our pipeline execution. We use a preprocess script that connects to the Presto server on our EC2 instance and queries data (using the query specified in the config file); the data is then sent to an S3 bucket, split into train, test, and validation datasets. Using the data in these files, we can train our machine learning model.

        -
      • We use the sklearn_processor in a SageMaker Pipelines ProcessingStep and define it as given below:
      • +
      • We use the sklearn_processor in a SageMaker Pipelines ProcessingStep and define it as given below:
      -
      step_args = sklearn_processor.run(
      -        code=config['scripts']['preprocess_data'],
      -        source_dir=config['scripts']['source_dir'], 
      -        outputs=outputs_preprocessor,
      -        arguments=[
      -            "--host", host_parameter,
      -            "--port", port_parameter,
      -            "--presto_credentials_key", presto_parameter,
      -            "--region", region_parameter,
      -            "--presto_catalog", presto_catalog_parameter,
      -            "--presto_schema", presto_schema_parameter,
      -            "--train_split", train_split.to_string(), 
      -            "--test_split", test_split.to_string(),
      -        ],
      -    )
      -
      -    step_preprocess_data = ProcessingStep(
      -        name=config['data_processing_step']['step_name'],
      -        step_args=step_args,
      -    )
      +
      # declare the sklearn processor
      +step_args = sklearn_processor.run(
      +        ## code refers to the data preprocessing script that is responsible for querying data from the presto server
      +        code=config['scripts']['preprocess_data'],
      +        source_dir=config['scripts']['source_dir'], 
      +        outputs=outputs_preprocessor,
      +        arguments=[
      +            "--host", host_parameter,
      +            "--port", port_parameter,
      +            "--presto_credentials_key", presto_parameter,
      +            "--region", region_parameter,
      +            "--presto_catalog", presto_catalog_parameter,
      +            "--presto_schema", presto_schema_parameter,
      +            "--train_split", train_split.to_string(), 
      +            "--test_split", test_split.to_string(),
      +        ],
      +    )
      +
      +    step_preprocess_data = ProcessingStep(
      +        name=config['data_processing_step']['step_name'],
      +        step_args=step_args,
      +    )
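The sklearn_processor used above is a FrameworkProcessor, which lets the processing job run a custom script together with its source_dir dependencies. A minimal sketch of how it might be constructed is shown below; the framework version, instance settings, and role variable are illustrative assumptions (in the solution they come from the config file).

from sagemaker.sklearn.estimator import SKLearn
from sagemaker.processing import FrameworkProcessor

# processor that runs the preprocessing script on a managed SageMaker Processing cluster
sklearn_processor = FrameworkProcessor(
    estimator_cls=SKLearn,
    framework_version="1.2-1",
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    base_job_name="presto-data-preprocess",
)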
        -
      • We are using the config['scripts']['source_dir'] which refers to our data preprocessing script that connects to the EC2 instance where the presto server runs. This script is responsible for extracting data from the query that you define. You can query the data you want by modifying the query parameter in the config file query parameter. We are using the sample query as an example to extract open source TPCH data on orders, discounts and order priorities.
      • +
      • We are using config['scripts']['source_dir'], which refers to our data preprocessing script that connects to the EC2 instance where the Presto server runs. This script is responsible for extracting data from the query that you define. You can query the data you want by modifying the query parameter in the config.yml file. We are using the sample query below as an example to extract open-source TPCH data on orders, discounts, and order priorities.
       SELECT
           o.orderkey,
      @@ -3313,64 +3315,67 @@ 

      ORDER BY RANDOM() LIMIT 5000

    • -
    • Train Model Step: In this step of the notebook, we use the SKLearn estimator from SageMaker SDK and the RandomForestClassifier from scikit-learn to train the ML model. The HyperparameterTunerclass is used for running automatic model tuning to determine the set of hyperparameters that provide the best performance (maximize the AUC metric). We use the train, test and validation files that are sent to an S3 bucket after querying it from PrestoDB.

    • -
    • In the code below, the sklearn_estimator object is created with parameters that are configured in the config file and uses this training script to train the ML model. The hyperparameters are also configurable by the user via the config file.

      +
    • Train Model Step: In this step of the notebook, we use the SKLearn estimator from the SageMaker SDK and the RandomForestClassifier from scikit-learn to train the ML model. The HyperparameterTuner class is used for running automatic model tuning to determine the set of hyperparameters that provide the best performance based on a given objective metric (for example, maximizing the AUC metric).

    • +
    • In the code below, the sklearn_estimator object is created with parameters configured in the config file and uses this training script to train the ML model. The hyperparameters are also configurable by the user via the config file. This step accesses the train, test, and validation files that are created as part of the previous data preprocessing step.

      sklearn_estimator = SKLearn(
      -    entry_point=config['scripts']['training_script'],
      -    role=role,
      -    instance_count=config['training_step']['instance_count'],
      -    instance_type=config['training_step']['instance_type'],
      -    framework_version=config['training_step']['sklearn_framework_version'],
      -    base_job_name=config['training_step']['base_job_name'],
      -    hyperparameters={
      -        "n_estimators": config['training_step']['n_estimators'],
      -        "max_depth": config['training_step']['max_depth'],  
      -        "features": config['training_step']['training_features'],
      -        "target": config['training_step']['training_target'],
      -    },
      -    tags=config['training_step']['tags']
      -)
      -# Create Hyperparameter tuner object. Ranges from https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost-tuning.html
      -rf_tuner = HyperparameterTuner(
      -                estimator=sklearn_estimator,
      -                objective_metric_name=config['tuning_step']['objective_metric_name'],
      -                hyperparameter_ranges={
      -                    "n_estimators": IntegerParameter(config['tuning_step']['hyperparam_ranges']['n_estimators'][0], config['tuning_step']['hyperparam_ranges']['n_estimators'][1]),
      -                    "max_depth": IntegerParameter(config['tuning_step']['hyperparam_ranges']['max_depth'][0], config['tuning_step']['hyperparam_ranges']['max_depth'][1]),
      -                    "min_samples_split": IntegerParameter(config['tuning_step']['hyperparam_ranges']['min_samples_split'][0], config['tuning_step']['hyperparam_ranges']['min_samples_split'][1]),
      -                    "max_features": CategoricalParameter(config['tuning_step']['hyperparam_ranges']['max_features'])
      -                },
      -                max_jobs=config['tuning_step']['maximum_training_jobs'], ## reducing this for testing purposes
      -                metric_definitions=config['tuning_step']['metric_definitions'],
      -                max_parallel_jobs=config['tuning_step']['maximum_parallel_training_jobs'], ## reducing this for testing purposes
      -)
      -
      -step_tuning = TuningStep(
      -    name=config['tuning_step']['step_name'],
      -    tuner=rf_tuner,
      -    inputs={
      -        "train": TrainingInput(
      -            s3_data=step_preprocess_data.properties.ProcessingOutputConfig.Outputs[
      -                "train" ## refer to this
      -            ].S3Output.S3Uri,
      -            content_type="text/csv",
      -        ),
      -        "test": TrainingInput(
      -        s3_data=step_preprocess_data.properties.ProcessingOutputConfig.Outputs["test"].S3Output.S3Uri,
      -        content_type="text/csv",
      -        ),
      -    },
      -)
    • -
    • Evaluate model step: The purpose of this step is to check that the trained and tuned model has an accuracy level above a configurable threshold and only then register the model with the model registry (from where it can be subsequently approved and deployed). If the model accuracy does not meet a configured threshold then the pipeline fails and the model is not registered with the model registry. We use the ScriptProcessor with an evaluation script that a user creates to evaluate the trained model based on metrics of choice.

    # we configure the training script that accesses the train, test and validation files from the data preprocessing step
    entry_point=config['scripts']['training_script'],
    role=role,
    instance_count=config['training_step']['instance_count'],
    instance_type=config['training_step']['instance_type'],
    framework_version=config['training_step']['sklearn_framework_version'],
    base_job_name=config['training_step']['base_job_name'],
    hyperparameters={
        # hyperparameters are fetched from and configured in the config.yml file
        "n_estimators": config['training_step']['n_estimators'],
        "max_depth": config['training_step']['max_depth'],
        "features": config['training_step']['training_features'],
        "target": config['training_step']['training_target'],
    },
    tags=config['training_step']['tags']
)

# Create Hyperparameter tuner object. Ranges from https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost-tuning.html
rf_tuner = HyperparameterTuner(
    estimator=sklearn_estimator,
    objective_metric_name=config['tuning_step']['objective_metric_name'],
    hyperparameter_ranges={
        "n_estimators": IntegerParameter(config['tuning_step']['hyperparam_ranges']['n_estimators'][0], config['tuning_step']['hyperparam_ranges']['n_estimators'][1]),
        "max_depth": IntegerParameter(config['tuning_step']['hyperparam_ranges']['max_depth'][0], config['tuning_step']['hyperparam_ranges']['max_depth'][1]),
        "min_samples_split": IntegerParameter(config['tuning_step']['hyperparam_ranges']['min_samples_split'][0], config['tuning_step']['hyperparam_ranges']['min_samples_split'][1]),
        "max_features": CategoricalParameter(config['tuning_step']['hyperparam_ranges']['max_features'])
    },
    max_jobs=config['tuning_step']['maximum_training_jobs'], ## reducing this for testing purposes
    metric_definitions=config['tuning_step']['metric_definitions'],
    max_parallel_jobs=config['tuning_step']['maximum_parallel_training_jobs'], ## reducing this for testing purposes
)

# declare a tuning step to use the train and test data to tune the ML model using the `HyperparameterTuner` declared above
step_tuning = TuningStep(
    name=config['tuning_step']['step_name'],
    tuner=rf_tuner,
    inputs={
        "train": TrainingInput(
            s3_data=step_preprocess_data.properties.ProcessingOutputConfig.Outputs["train"].S3Output.S3Uri,
            content_type="text/csv",
        ),
        "test": TrainingInput(
            s3_data=step_preprocess_data.properties.ProcessingOutputConfig.Outputs["test"].S3Output.S3Uri,
            content_type="text/csv",
        ),
    },
)
    • +
    • Evaluate model step: The purpose of this step is to check if the trained and tuned model has an accuracy level above a configurable threshold and only then register the model with the model registry (from where it can be subsequently approved and deployed). If the model accuracy does not meet a configured threshold then the pipeline fails and the model is not registered with the model registry. We use the ScriptProcessor with an evaluation script that a user creates to evaluate the trained model based on a metric of choice.

        -
      • Once this step is run, an Evaluation Report is generated that is sent to the S3 bucket:
      • +
      • Once this step is run, an Evaluation Report is generated that is sent to the S3 bucket for analysis:
      
       evaluation_report = PropertyFile(
           name="EvaluationReport", output_name="evaluation", path=config['evaluation_step']['evaluation_filename']
       )
        -
      • The evaluation step uses the evaluation script as a code entry in the step below:
      • +
      • The evaluation step uses the evaluation script as a code entry in the step below. This script prepares the features and target values and calculates the prediction probabilities using model.predict. The evaluation report sent to S3 contains information on metrics such as precision, recall, and accuracy (a sketch of such a script follows the step definition below).
      step_evaluate_model = ProcessingStep(
           name=config['evaluation_step']['step_name'],
      @@ -3409,7 +3414,7 @@ 

        "--features", feature_parameter,
    ]
)
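As a reference, a minimal sketch of what such an evaluation script could look like is shown below. The container paths follow SageMaker Processing conventions, but the file names, target column, and the layout of the evaluation report are illustrative assumptions.

import json
import pathlib
import tarfile

import joblib
import pandas as pd
from sklearn.metrics import accuracy_score, precision_score, recall_score

# unpack the trained model artifact mounted into the processing container
with tarfile.open("/opt/ml/processing/model/model.tar.gz") as tar:
    tar.extractall(path="/opt/ml/processing/model")
model = joblib.load("/opt/ml/processing/model/model.joblib")

# load the test split produced by the data preprocessing step
df = pd.read_csv("/opt/ml/processing/test/test.csv")
y_true = df["target"]
X = df.drop(columns=["target"])
y_pred = model.predict(X)

# write the evaluation report that the PropertyFile above points to
report = {
    "binary_classification_metrics": {
        "accuracy": {"value": accuracy_score(y_true, y_pred)},
        "precision": {"value": precision_score(y_true, y_pred)},
        "recall": {"value": recall_score(y_true, y_pred)},
    }
}
out_dir = pathlib.Path("/opt/ml/processing/evaluation")
out_dir.mkdir(parents=True, exist_ok=True)
(out_dir / "evaluation.json").write_text(json.dumps(report))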

    • -
    • Register model step: Once the trained model meets the model performance requirements, a new version of the model is registered with the model registry for further analysis.

    • +
    • Register model step: Once the trained model meets the model performance requirements, a new version of the model is registered with the model registry for further analysis and model creation.

    # Create a RegisterModel step, which registers the model with SageMaker Model Registry.
     step_register_model = RegisterModel(
    @@ -3426,7 +3431,7 @@ 

        tags=config['register_model_step']['tags']
    )

    The model is registered with the Model Registry with approval status set to PendingManualApproval, which means the model cannot be deployed on a SageMaker endpoint unless its status in the registry is changed to Approved, either manually via the SageMaker console, programmatically, or through a Lambda function.

    -

    Adding conditions to the pipeline is done with a ConditionStep. In this case, we only want to register the new model version with the model registry if the new model meets a specific accuracy condition:

    +

    Adding conditions to the pipeline is done with a ConditionStep. In this case, we only want to register the new model version with the model registry if the new model meets a specific accuracy condition:

    
     step_fail = FailStep(
         name=config['fail_step']['step_name'],
    @@ -3450,10 +3455,11 @@ 

        name=config['condition_step']['step_name'],
        conditions=[cond_gte],
        if_steps=[step_register_model],
        else_steps=[step_fail], ## if this fails
    )

    +

    If the accuracy condition is not met, a step_fail step is executed that sends an error message to the user and the pipeline fails.
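For reference, here is a minimal sketch of how the accuracy condition (cond_gte) used by the ConditionStep could be built from the evaluation report; the JSON path and the threshold value are illustrative assumptions (in the solution the threshold comes from the config file).

from sagemaker.workflow.conditions import ConditionGreaterThanOrEqualTo
from sagemaker.workflow.functions import JsonGet

# compare the accuracy recorded in the evaluation report against a minimum threshold
cond_gte = ConditionGreaterThanOrEqualTo(
    left=JsonGet(
        step_name=step_evaluate_model.name,
        property_file=evaluation_report,
        json_path="binary_classification_metrics.accuracy.value",
    ),
    right=0.7,  # illustrative threshold; configurable via the config file
)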

      -
    • Orchestrate all steps and start the pipeline: Once you have created the pipeline steps as above, you can instantiate and start it with custom parameters making the pipeline agnostic to who is triggering it, but also to the scripts and data used. The pipeline can be started using the CLI, the SageMaker Studio UI or the SDK and below there is a screenshot of what it looks like in the SageMaker Studio UI.

      +
    • Orchestrate all steps and start the pipeline: Once you have created the pipeline steps as above, you can instantiate the pipeline and start it with custom parameters, making it agnostic not only to who is triggering it but also to the scripts and data used. The pipeline can be started using the CLI, the SageMaker Studio UI, or the SDK.

      # Start the pipeline with the data and preprocessing script
       execution = pipeline.start(
                       execution_display_name=pipeline.name,
      @@ -3474,11 +3480,11 @@ 

      # print the summary of the pipeline run once it is completed
      print_pipeline_execution_summary(execution.list_steps(), pipeline.name)
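For completeness, here is a minimal sketch of how the Pipeline object itself might be assembled from the steps and parameters defined earlier before it is started; the pipeline name key, the exact parameter list, and the step_condition variable name are illustrative assumptions.

from sagemaker.workflow.pipeline import Pipeline

# assemble the training pipeline from the previously defined parameters and steps
pipeline = Pipeline(
    name=config['pipeline']['name'],
    parameters=[
        host_parameter,
        port_parameter,
        presto_parameter,
        region_parameter,
        train_split,
        test_split,
    ],
    steps=[step_preprocess_data, step_tuning, step_evaluate_model, step_condition],
)

pipeline.upsert(role_arn=role)   # create or update the pipeline definition
execution = pipeline.start()     # equivalent to the pipeline.start call shown above
execution.wait()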

    -

    At the end of the training pipeline, your pipeline structure on Amazon SageMaker should look like this:

    -

    Now that the model is registered, get access to the registered model manually on the sagemaker studio model registry console, or programmatically in the next notebook, approve it and run the second portion of this solution: Batch Transform Step

  2. +

    At the end of executing the entire training pipeline, your pipeline structure on Amazon SageMaker Pipelines should look like this: Training Pipeline Structure

    +

    Now that the model is registered, you can access the registered model manually on the SageMaker Studio Model Registry console, or programmatically in the next notebook, approve it, and run the second portion of this solution: the Batch Transform step.

  3. Next, choose 1_batch_transform_pipeline.ipynb. When the notebook is open, on the Run menu, choose Run All Cells to run the code in this notebook. This notebook runs a batch transform step using the model trained in the previous notebook. It does so by running the following steps:

      -
    • Extract the latest approved model from the SageMaker model registry: In this step, we first define pipeline input parameters that are used for the EC2 instance types to use for processing and training
    • +
    • Extract the latest approved model from the SageMaker model registry: In this step, we first define pipeline input parameters that specify the EC2 instance types to use for the processing and training steps. These parameters can be configured in the config.yml file.
    # What instance type to use for processing.
     processing_instance_type = ParameterString(
    @@ -3523,8 +3529,9 @@ 

        ModelPackageArn=latest_model_package_arn,
        ModelApprovalStatus="Approved",
    )

    +

    Now we have extracted the latest model from the SageMaker Model Registry and programmatically approved it. You can also approve the model manually on the SageMaker Model Registry page in SageMaker Studio, as shown in the screenshot below. SageMaker Model Registry: Manual Model Approval via SageMaker Studio

      -
    • Read raw data for inference from PrestoDB and stores in an Amazon S3 bucket: Once the latest model is approved, we get the latest batch data from presto and use that for our batch transform step. In this step, we use another batch preprocess script that is dedicated to reading and fetching data from presto and saving in a batch directory.
    • +
    • Read raw data for inference from PrestoDB and store it in an Amazon S3 bucket: Once the latest model is approved, we get the latest batch data from Presto and use it for our batch transform step. In this step, we use another batch preprocess script that is dedicated to reading and fetching data from Presto and saving it in a batch directory within our S3 bucket.
    ## represents the output processing for the batch preprocessing step
     batch_output=[
    @@ -3546,25 +3553,26 @@ 

    Testing the solution< # Use the sklearn_processor's run method and configure the batch preprocessing step step_args = sklearn_processor.run( - code=config['scripts']['batch_transform_get_data'], - source_dir=config['scripts']['source_dir'], - outputs=batch_output, - arguments=[ - "--host", host_parameter, - "--port", port_parameter, - "--presto_credentials_key", presto_parameter, - "--region", region_parameter, - "--presto_catalog", presto_catalog_parameter, - "--presto_schema", presto_schema_parameter, - ], -) - -# declare the batch step that is called later in pipeline execution -batch_data_prep = ProcessingStep( - name=config['data_processing_step']['step_name'], - step_args=step_args, -)

    -

    Now, with the image uri, we refer to the ‘inference.py’ script that grabs information on features to use while making predictions. Using this, we will create the model which automatically trigger the training and the preprocess data step Run the transformer step on the created model.

        # here, we add a `code` entry point that uses the data preprocess script for collecting data in a batch and storing it in S3
        code=config['scripts']['batch_transform_get_data'],
        source_dir=config['scripts']['source_dir'],
        outputs=batch_output,
        arguments=[
            "--host", host_parameter,
            "--port", port_parameter,
            "--presto_credentials_key", presto_parameter,
            "--region", region_parameter,
            "--presto_catalog", presto_catalog_parameter,
            "--presto_schema", presto_schema_parameter,
        ],
    )

    # declare the batch step that is called later in pipeline execution
    batch_data_prep = ProcessingStep(
        name=config['data_processing_step']['step_name'],
        step_args=step_args,
    )

    Once the batch data preparation step is complete, we declare a model with the image URI and refer to the ‘inference.py’ script, which grabs information on the features to use while making predictions. Using this, we create the model (which automatically triggers the training and data preprocessing steps) and then run the transformer step on the created model.

    # create the model image based on the model data and refer to the inference script as an entry point for batch inference
     model = Model(
         image_uri=image_uri,
    @@ -3615,85 +3623,100 @@ 

    # start the pipeline execution:
    response = sagemaker_client.start_pipeline_execution(
        PipelineName=batch_transform_pipeline.name
    )

    -

    Once the pipeline run has completed, run batch inference and view the output below:

    -
    while True:
    -    resp = client.describe_pipeline_execution(
    -    PipelineExecutionArn=response['PipelineExecutionArn']
    -        )
    -    status = resp['PipelineExecutionStatus']
    -

    At the end of the batch transform pipeline, your pipeline structure on Amazon SageMaker should look like this:

  4. -
  5. Lastly, Choose 2_realtime_inference.ipynb. When the notebook is open, on the Run menu, choose Run All Cells to run the code in this notebook. This notebook extracts the latest approved model from the model registry and deploys it as a realtime endpoint. It does so by running the following steps:

    while True:
        resp = client.describe_pipeline_execution(
            PipelineExecutionArn=response['PipelineExecutionArn']
        )
        status = resp['PipelineExecutionStatus']

    At the end of the batch transform pipeline, your pipeline structure on Amazon SageMaker Pipelines should look like this: Batch Transform Pipeline Structure

  6. +
  7. Lastly, choose 2_realtime_inference.ipynb. When the notebook is open, on the Run menu, choose Run All Cells to run the code in this notebook. This notebook extracts the latest approved model from the model registry and deploys it as a SageMaker endpoint for real-time inference. It does so by running the following steps:

      -
    • Extract the latest approved model from the SageMaker model registry: To deploy a real time SageMaker endpoint, we first will fetch the image uri to use and extract the latest approved model the same way we did in the prior batch transfrom notebook. Once you have extracted the latest approved model, use a container list with the specific inference.py file to create the model and run inferences against.
    • +
    • Extract the latest approved model from the SageMaker model registry: To deploy a real-time SageMaker endpoint, we first fetch the image URI to use and extract the latest approved model in the same way we did in the prior batch transform notebook. Once you have extracted the latest approved model, use a container list with the specific inference.py file to create the model and run inferences against it. This model creation and endpoint deployment are specific to the scikit-learn model configuration and will change based on your use case.
    -
    container_list = [{
    -    'Image': image_uri,
    -    'ModelDataUrl': model_data_url,
    -    'Environment': {
    -        'SAGEMAKER_PROGRAM': 'inference.py',  
    -        'SAGEMAKER_SUBMIT_DIRECTORY': compressed_inference_script_uri, 
    -    }
    -}]
    -
    -## create the model object and call deploy on it
    -create_model_response = sm.create_model(
    -    ModelName = model_name,
    -    ExecutionRoleArn = role,
    -    Containers=container_list
    -)
    -

    In this code, we use the inference.py file specific to the Scikit Learn model. We then create our endpoint configuration, setting our ManagedInstanceScaling to ENABLED with our desired MaxInstanceCount and MinInstanceCount.

    -
    create_endpoint_config_response = sm.create_endpoint_config(
    -EndpointConfigName = endpoint_config_name,
    -ProductionVariants=[{
    -    'InstanceType': instance_type,
    -    ## have max instance count configured here
    -    'InitialInstanceCount': min_instances,
    -    'InitialVariantWeight': 1,
    -    'ModelName': model_name,
    -    'VariantName': 'AllTraffic', 
    -    ## change your managed instance configuration here
    -    "ManagedInstanceScaling":{
    -        "MaxInstanceCount": max_instances,
    -        "MinInstanceCount": min_instances,
    -        "Status": "ENABLED",}
    -}])
    +
    container_list = [{
    +    'Image': image_uri,
    +    'ModelDataUrl': model_data_url,
    +    'Environment': {
    +        'SAGEMAKER_PROGRAM': 'inference.py',  
    +        'SAGEMAKER_SUBMIT_DIRECTORY': compressed_inference_script_uri, 
    +    }
    +}]
    +
    +## create the model object and call deploy on it
    +create_model_response = sm.create_model(
    +    ModelName = model_name,
    +    ExecutionRoleArn = role,
    +    Containers=container_list
    +)
    +

    In this code, we use the inference.py file specific to the scikit-learn model. We then create our endpoint configuration, setting ManagedInstanceScaling to ENABLED with our desired MaxInstanceCount and MinInstanceCount for automatic scaling.

    +
    create_endpoint_config_response = sm.create_endpoint_config(
    +EndpointConfigName = endpoint_config_name,
    +ProductionVariants=[{
    +    'InstanceType': instance_type,
    +    ## have max instance count configured here
    +    'InitialInstanceCount': min_instances,
    +    'InitialVariantWeight': 1,
    +    'ModelName': model_name,
    +    'VariantName': 'AllTraffic', 
    +    ## change your managed instance configuration here
    +    "ManagedInstanceScaling":{
    +        "MaxInstanceCount": max_instances,
    +        "MinInstanceCount": min_instances,
    +        "Status": "ENABLED",}
    +}])
      -
    • Runs inferences for testing the real time deployed endpoint: Once you have extracted the latest approved model, created the model from the desired image uri and configured the endpoint configuration, you can deploy it as a real time SageMaker endpoint below:
    • +
    • Run inferences to test the deployed real-time endpoint: Once you have extracted the latest approved model, created the model from the desired image URI, and configured the endpoint configuration, you can then deploy it as a real-time SageMaker endpoint, as shown below:
    -
    create_endpoint_response = sm.create_endpoint(
    -EndpointName=endpoint_name,
    -EndpointConfigName=endpoint_config_name)
    -logger.info(f"Going to deploy the real time endpoint -> {create_endpoint_response['EndpointArn']}")
    -
    -# wait for endpoint to reach a terminal state (InService) using describe endpoint
    -describe_endpoint_response = sm.describe_endpoint(EndpointName=endpoint_name)
    -
    -while describe_endpoint_response["EndpointStatus"] == "Creating":
    -    describe_endpoint_response = sm.describe_endpoint(EndpointName=endpoint_name)
    -    print(describe_endpoint_response["EndpointStatus"])
    -    time.sleep(15)
    +
    create_endpoint_response = sm.create_endpoint(
    +EndpointName=endpoint_name,
    +EndpointConfigName=endpoint_config_name)
    +logger.info(f"Going to deploy the real time endpoint -> {create_endpoint_response['EndpointArn']}")
    +
    +# wait for endpoint to reach a terminal state (InService) using describe endpoint
    +describe_endpoint_response = sm.describe_endpoint(EndpointName=endpoint_name)
    +
    +while describe_endpoint_response["EndpointStatus"] == "Creating":
    +    describe_endpoint_response = sm.describe_endpoint(EndpointName=endpoint_name)
    +    print(describe_endpoint_response["EndpointStatus"])
    +    time.sleep(15)

    Now run inference against the data extracted from PrestoDB:

    -
    body_str = "total_extended_price,avg_discount,total_quantity\n1,2,3\n66.77,12,2"
    -
    -response = smr.invoke_endpoint(
    -    EndpointName=endpoint_name,
    -    Body=body_str.encode('utf-8') ,
    -    ContentType='text/csv',
    -)
    -
    -response_str = response["Body"].read().decode()
    -response_str
  8. +
    body_str = "total_extended_price,avg_discount,total_quantity\n1,2,3\n66.77,12,2"
    +
    +response = smr.invoke_endpoint(
    +    EndpointName=endpoint_name,
    +    Body=body_str.encode('utf-8') ,
    +    ContentType='text/csv',
    +)
    +
    +response_str = response["Body"].read().decode()
    +response_str
    +

    We have now seen the end-to-end process of our solution: fetching data by connecting to a Presto server on an EC2 instance, followed by training, evaluating, and registering the model. We then approved the latest registered model from our training pipeline solution and ran batch inference against batch data stored in S3. Finally, we deployed the latest approved model as a real-time SageMaker endpoint to run inferences against. Take a look at the results below.

Results

Here is a compilation of some queries and responses generated by our implementation from the real-time endpoint deployment stage:

-

–> need to add results

-
-
-

Tip

-

–> need to add tips

mlops-pipeline-prestodb results

Query: total_extended_price,avg_discount,total_quantity,2,3,12,2
Answer: – response –

Cleanup

diff --git a/blog_post.qmd b/blog_post.qmd index bcc5c8a..a1852aa 100644 --- a/blog_post.qmd +++ b/blog_post.qmd @@ -19,19 +19,19 @@ format: output-file: README.md --- -_Amit Arora_, _Madhur Prashant_, _Antara Raisa_ +_Amit Arora_, _Madhur Prashant_, _Antara Raisa_, _Johnny Chivers_ ***This post is co-written with customer_names from Twilio.*** -Machine learning (ML) models do not operate in isolation. To deliver value, they must integrate into existing production systems and infrastructure, which necessitates considering the entire ML lifecycle during design and development. ML operations, known as MLOps, focus on streamlining, automating, and monitoring ML models throughout their lifecycle. Building a robust MLOps pipeline demands cross-functional collaboration. Data scientists, ML engineers, IT staff, and DevOps teams must work together to operationalize models from research to deployment and maintenance. With the right processes and tools, MLOps enables organizations to reliably and efficiently adopt ML across their teams. +Machine learning (ML) models do not operate in isolation. To deliver value, they must integrate into existing production systems and infrastructure, which necessitates considering the entire ML lifecycle during design and development. ML operations, known as MLOps, focus on streamlining, automating, and monitoring ML models throughout their lifecycle. Building a robust MLOps pipeline demands cross-functional collaboration. Data scientists, ML engineers, IT staff, and DevOps teams must work together to operationalize models from research to deployment and maintenance. With the right processes and tools, MLOps enables organizations to reliably and efficiently adopt ML across their teams for their specific use cases. -[Amazon SageMaker MLOps](https://aws.amazon.com/sagemaker/mlops/?sagemaker-data-wrangler-whats-new.sort-by=item.additionalFields.postDateTime&sagemaker-data-wrangler-whats-new.sort-order=desc) is a suite of features that includes [Amazon SageMaker Projects](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects.html) (CI/CD), [Amazon SageMaker Pipelines](https://aws.amazon.com/sagemaker/pipelines/) and [Amazon SageMaker Model Registry](https://docs.aws.amazon.com/sagemaker/latest/dg/model-registry.html). +[Amazon SageMaker MLOps](https://aws.amazon.com/sagemaker/mlops/?sagemaker-data-wrangler-whats-new.sort-by=item.additionalFields.postDateTime&sagemaker-data-wrangler-whats-new.sort-order=desc) is a suite of features that includes [Amazon SageMaker Projects](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects.html) (CI/CD), [Amazon SageMaker Pipelines](https://aws.amazon.com/sagemaker/pipelines/) and [Amazon SageMaker Model Registry](https://docs.aws.amazon.com/sagemaker/latest/dg/model-registry.html). In this blog post, we will discuss SageMaker Pipelines and Model Registry. **SageMaker Pipelines** allows for straightforward creation and management of ML workflows, while also offering storage and reuse capabilities for workflow steps. The **SageMaker Model Registry** centralizes model tracking, simplifying model deployment. 
This blog post focuses on enabling AWS customers to have flexibility for using their data source of choice, and integrate it seamlessly with [Amazon SageMaker Processing jobs](https://sagemaker-examples.readthedocs.io/en/latest/sagemaker_processing/scikit_learn_data_processing_and_model_evaluation/scikit_learn_data_processing_and_model_evaluation.html), where you can leverage a simplified, managed experience to run data pre- or post-processing and model evaluation workloads on the Amazon SageMaker platform. -[Twilio](https://pages.twilio.com/twilio-brand-sales-namer-1?utm_source=google&utm_medium=cpc&utm_term=twilio&utm_campaign=G_S_NAMER_Brand_Twilio_Tier1&cq_plac=&cq_net=g&cq_pos=&cq_med=&cq_plt=gp&gad_source=1&gclid=CjwKCAjwtqmwBhBVEiwAL-WAYd5PgxP-XSLDYBvu6y_j8KUydoj33QX3XWpUo4zEm2DLzgn_bfdogBoC9dIQAvD_BwE) needed to implement an MLOPs pipeline with their customer data stored within [PrestoDB](https://prestodb.io/). PrestoDB is an open-source SQL query engine that is designed for fast analytic queries against data of any size. +[Twilio](https://pages.twilio.com/twilio-brand-sales-namer-1?utm_source=google&utm_medium=cpc&utm_term=twilio&utm_campaign=G_S_NAMER_Brand_Twilio_Tier1&cq_plac=&cq_net=g&cq_pos=&cq_med=&cq_plt=gp&gad_source=1&gclid=CjwKCAjwtqmwBhBVEiwAL-WAYd5PgxP-XSLDYBvu6y_j8KUydoj33QX3XWpUo4zEm2DLzgn_bfdogBoC9dIQAvD_BwE) needed to implement an MLOPs pipeline and query data as a part of this process from [PrestoDB](https://prestodb.io/). PrestoDB is an open-source SQL query engine that is designed for fast analytic queries against data of any size from multiple sources. In this post, we show you a step-by-step implementation to achieve the following: @@ -39,63 +39,63 @@ In this post, we show you a step-by-step implementation to achieve the following - Train a binary classification model using SageMaker Training Jobs and tune the model using SageMaker Automatic Model Tuning -- Lastly, run a batch transform for inference on your raw data fetched from prestoDB and deploy the model as a real time SageMaker endpoint +- Run a batch transform for inference on your raw data fetched from prestoDB and deploy the model as a real time SageMaker endpoint for inference ## Use case overview -Twilio is an american cloud communications company, based in San Francisco, California and provides programmable communication tools for making and receiving phone calls, sending and receiving text messages, and performing other communication functions using its web service APIs. Being one of the largest strategic AWS customers, Twilio engages with Data and AI/ML servives to run their daily workloads. This blog resolves around the steps AWS and Twilio took to migrate Twilio's MLOps, implementation of training models and running batch inferences (that are able to detect burner accounts based on unusual user activity) to Amazon SageMaker. +Twilio is an american cloud communications company, based in San Francisco, California and provides programmable communication tools for making and receiving phone calls, sending and receiving text messages, and performing other communication functions using its web service APIs. Being one of the largest AWS customers, Twilio engages with Data and AI/ML servives to run their daily workloads. This blog resolves around the steps AWS and Twilio took to migrate Twilio's existing MLOps, implementation of training models and running batch inferences (that are able to detect burner accounts based on unusual user activity) to Amazon SageMaker. 
-Burners are phone numbers which are available online to everyone and are used to hide identities by creating fake accounts on our customers' apps/websites. Twilio built an effective data and machine learning operations pipeline to detect these anomaly phone numbers with the help of a binary classification model using the scikit-learn RandomForestClassifier. The training data they used for this pipeline is available via PrestoDB tables and is read into Pandas through the PrestoDB Python client. This data is then read into an Apache Spark dataframe (although the model training happens only using the data in the Pandas dataframe). +Burners are phone numbers which are available online to everyone and are used to hide identities by creating fake accounts on customers' apps/websites. Twilio built a data and machine learning operations pipeline to detect these anomaly phone numbers with the help of a binary classification model using the scikit-learn RandomForestClassifier. The training data they used for this pipeline is made available via PrestoDB tables and is read into Pandas through the [PrestoDB Python client](https://pypi.org/project/presto-python-client/). This data is then read into an Apache Spark dataframe for further analysis and machine learning operations. -The end goal was to convert all the existing steps into three sub solutions utilizing SageMaker Pipelines: a training pipeline, batch inference pipeline and finally, to deploy the trained model on an Amazon SageMaker Endpoint for real-time inference. +The end goal was to convert all the existing steps into a three fold solution utilizing SageMaker Pipelines to enable more frequent model re-training, optimized batch processing, while customers take advatage of flexibility of data access via an open-source SQL query engine: 1/Implement a training pipeline and 2/batch inference pipeline (by connecting a sagemaker processing job with data queried from Presto). 3/Finally, we also demonstrate deploying the trained model on a SageMaker Endpoint for real-time inference. -For the proof of concept, we used the [TPCH-Connector](https://prestodb.io/docs/current/connector/tpch.html) as our choice of open source data (this allows users to test Presto's capabilities and query syntax without needing to configure access to an external data source). Using this solution, Twilio successfully migrated to SageMaker pipelines with the open source solution that can be viewed here published on aws-samples github: [mlops-pipeline-prestodb](https://github.com/aws-samples/mlops-pipeline-prestodb?tab=readme-ov-file). +For the proof of concept, we used the [TPCH-Connector](https://prestodb.io/docs/current/connector/tpch.html) as our choice of open source data (this allows users to test Presto's capabilities and query syntax without needing to configure access to an external data source). Using this solution, Twilio successfully migrated to SageMaker pipelines with the open source solution that can be viewed here published on aws-samples github: [mlops-pipeline-prestodb](https://github.com/aws-samples/mlops-pipeline-prestodb?tab=readme-ov-file). ## Solution overview -The solution presented provides an implementation for training a machine learning model and running batch inference on Amazon SageMaker. The goal is to enable more frequent model re-training, optimized batch processing, and even add additional capabilities such as making the model available as a real-time endpoint. 
This solution also provides a design pattern built on AWS best practices that can be replicated for other ML workloads with minimal overhead. This is divided into three main steps: training pipeline, batch inference pipeline, and an implementation of real time inference support for the choice of maching learning model. +The solution presented provides an implementation for training a machine learning model and running batch inference on Amazon SageMaker using data fetched from a PrestoDB table. This solution provides a design pattern built on AWS best practices that can be replicated for other ML workloads with minimal overhead. This is divided into three main steps: training pipeline, batch inference pipeline, and an implementation of real time inference support for the choice of maching learning model. -We have divided the solution into the following main components that are open sourced and can be run through simple [config file](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/config.yml) updates. +This solution is now open source and can be run through simple config file updates. For more information on the ***config.yml*** file walkthrough, view [this link](./config.yml) (add a link here poiting to the config file). -This solution includes the following components: +This solution includes the following steps: -- [Model Training Pipeline](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/0_model_training_pipeline.ipynb): In this step, we train and tune the ML model and register it with the [SageMaker model registry](https://docs.aws.amazon.com/sagemaker/latest/dg/model-registry.html). All the steps in this notebook are executed as part of a training pipeline. This notebook also contains an automatic model approval step that changes the state of the model registered with the model registry from PendingForApproval to Approved state. This step can be removed for production ready accounts where a human in the loop or some criteria based approval would be required. -- [Batch Transform Pipeline](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/1_batch_transform_pipeline.ipynb): This notebook is used to launch the batch inference pipeline that reads data from PrestoDB and runs batch inference on it using the most recent Approved ML model. This model is approved in the prior step either programmatically or manually via the SageMaker model registry. -- [Realtime Inference](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/2_realtime_inference.ipynb): This notebook is used to deploy the latest approved model as a SageMaker endpoint for real-time inference. +- [Model Training Pipeline](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/0_model_training_pipeline.ipynb): In this step, we connect a sagemaker processing job to data fetched from a Presto server that runs on an Amazon EC2 instance, train and tune the ML model and register it with the [SageMaker model registry](https://docs.aws.amazon.com/sagemaker/latest/dg/model-registry.html). All the steps in this notebook are executed as part of a training pipeline. +- [Batch Transform Pipeline](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/1_batch_transform_pipeline.ipynb): This notebook is used to perform an automatic model approval step that changes the state of the model registered with the model registry from PendingForApproval to Approved state. 
This step can be removed for production-ready accounts where a human in the loop or some criteria-based approval would be required. After this, we launch the batch inference pipeline that reads data from PrestoDB and runs batch inference on it using the most recent Approved ML model. This model is approved either programmatically or manually via the SageMaker model registry.
+- [Realtime Inference](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/2_realtime_inference.ipynb): This notebook is used to deploy the latest approved model as a SageMaker endpoint for [real-time inference](https://docs.aws.amazon.com/sagemaker/latest/dg/realtime-endpoints.html).
## Solution design
-The solution design consists of three parts - Setting up the data preparation and training pipeline, preparing for the batch transform step, and deploying the approved model of choice as a real time SageMaker endpoint for inference. All of these parts utilize information from a single [config.yml file](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/config.yml), which includes the AWS and Presto credential information, Individial step [pipeline parameters](https://docs.aws.amazon.com/sagemaker/latest/dg/build-and-manage-parameters.html) for the data preprocessing, training, tuning, model evaluation, model registeration and real time endpoint steps of this solution. This configuration file is highly customizable for the user to use and run the solution end to end with minimal-no code changes.
+The solution design consists of the following parts: setting up the data preparation and training pipeline, preparing for the batch transform step, and deploying the approved model of choice as a real-time SageMaker endpoint for inference. All of these parts utilize information from a single [config.yml file](./config.yml), which includes the necessary AWS and Presto credential information to connect to a Presto server on an EC2 instance, as well as individual step [pipeline parameters](https://docs.aws.amazon.com/sagemaker/latest/dg/build-and-manage-parameters.html) for the data preprocessing, training, tuning, model evaluation, model registration and real-time endpoint steps of this solution. This configuration file is highly customizable, so the user can run the solution end to end with minimal to no code changes.
-The main components of this solution are as describe below:
+The main components of this solution are described in detail below:
-### [Data Preparation and Training Pipeline Step](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/0_model_training_pipeline.ipynb):
+### Part 1 - [Data Preparation and Training Pipeline Step](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/0_model_training_pipeline.ipynb):
-1. The training data is read from PrestoDB and any feature engineering needed is done as part of the SQL queries run in PrestoDB at retrieval time.
-1. We use a FrameworkProcessorwith SageMaker Processing Jobs to read data from PrestoDB using the Python PrestoDB client. The existing code for reading data from PrestoDB include the queries used remains unchanged from the current implementation.
-1. For the training and tuning step, we use the [SKLearn estimator](https://sagemaker.readthedocs.io/en/stable/frameworks/sklearn/sagemaker.sklearn.html) from SageMaker SDK and the RandomForestClassifier from scikit-learn to train the ML model.
The [HyperparameterTunerclass](https://sagemaker.readthedocs.io/en/stable/api/training/tuner.html) is used for running automatic model tuning to determine the set of hyperparameters that provide the best performance (maximize the AUC metric).
+1. The training data is read from a PrestoDB server started on an EC2 instance, and any feature engineering needed is done as part of the SQL queries run in PrestoDB at retrieval time. The queries used to fetch data at the training and batch inference steps can be configured in the [config file](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/config.yml).
+1. We use a [FrameworkProcessor](https://docs.aws.amazon.com/sagemaker/latest/dg/processing-job-frameworks.html) with SageMaker Processing Jobs to read data from PrestoDB using the Python PrestoDB client.
+1. For the training and tuning step, we use the [SKLearn estimator](https://sagemaker.readthedocs.io/en/stable/frameworks/sklearn/sagemaker.sklearn.html) from the SageMaker SDK and the RandomForestClassifier from scikit-learn to train the ML model. The [HyperparameterTuner class](https://sagemaker.readthedocs.io/en/stable/api/training/tuner.html) is used for running automatic model tuning to determine the set of hyperparameters that provide the best performance for a given use case (for example, maximize the [AUC metric](https://en.wikipedia.org/wiki/Receiver_operating_characteristic)).
1. The [model evaluation](https://sagemaker-examples.readthedocs.io/en/latest/sagemaker-pipelines/tabular/abalone_build_train_deploy/sagemaker-pipelines-preprocess-train-evaluate-batch-transform.html) step is to check that the trained and tuned model has an accuracy level above a configurable threshold and only then register the model with the model registry (from where it can be subsequently approved and deployed). If the model accuracy does not meet a configured threshold then the pipeline fails and the model is not registered with the model registry.
1. The model training pipeline is then run with the [`Pipeline.start`](https://docs.aws.amazon.com/sagemaker/latest/dg/run-pipeline.html) which triggers and instantiates all steps above.
-### [Batch Transform Step](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/1_batch_transform_pipeline.ipynb):
+### Part 2 - [Batch Transform Step](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/1_batch_transform_pipeline.ipynb):
-1. The batch inference pipeline consists of two steps: a data preparation step that retrieves the data from the PrestoDB and stores it in S3 (same implementation as in the training pipeline mentioned above) and a batch transform step that runs inference on this data stored in S3 and stores the output data in S3.
+1. The batch inference pipeline consists of two steps: a data preparation step that retrieves the data from PrestoDB (using a [batch data preprocess script](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/code/presto_preprocess_for_batch_inference.py) that connects and fetches data from the Presto server deployed on EC2) and stores it in S3 (same implementation as in the training pipeline mentioned above). After this, a batch transform step runs inference on this data stored in S3 and stores the output data in S3.
1. In this step, we utilize the transformer instance and the TransformInput with the batch_data pipeline parameter defined, as shown in the sketch after this list.
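+
+A minimal sketch of how the transformer instance and `TransformInput` mentioned in the last item can be wired into a SageMaker Pipelines `TransformStep` is shown below. This is illustrative only and is not the exact code from the repository; the model name, S3 paths, and instance settings are placeholder values (in the actual solution these values are read from the ***config.yml*** file, and `batch_data` is a pipeline parameter).
+
+``` {.python}
+from sagemaker.inputs import TransformInput
+from sagemaker.transformer import Transformer
+from sagemaker.workflow.steps import TransformStep
+
+# placeholder S3 location of the data written by the batch data preparation step;
+# in the actual pipeline this value is supplied via the batch_data pipeline parameter
+batch_data = "s3://<my-bucket>/batch-input/"
+
+# batch transformer created from a model based on the latest approved model package
+transformer = Transformer(
+    model_name="burner-detection-model",            # placeholder model name
+    instance_count=1,
+    instance_type="ml.m5.xlarge",
+    accept="text/csv",
+    assemble_with="Line",
+    output_path="s3://<my-bucket>/batch-output/",   # where the batch predictions land
+)
+
+# transform step that runs batch inference on the prepared data
+step_transform = TransformStep(
+    name="RunBatchTransform",
+    transformer=transformer,
+    inputs=TransformInput(
+        data=batch_data,
+        content_type="text/csv",
+        split_type="Line",
+    ),
+)
+```
+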
-### [Real Time SageMaker endpoint support](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/2_realtime_inference.ipynb):
+### Part 3 - [Real Time SageMaker endpoint support](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/2_realtime_inference.ipynb):
1. The latest approved model from the model registry is deployed as a realtime endpoint.
-1. The latest approved model is retrieved from the registry using the describe_model_package function from the SageMaker SDK.
-2. The model is deployed on a ml.c5.xlarge instances with a minimum instance count of 1 and maximum instance count of 3 (configurable by the user) and automatic scaling policy configured.
+1. The latest approved model is retrieved from the registry using the [describe_model_package](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker/client/describe_model_package.html) function from the SageMaker boto3 client.
+2. The model is deployed on a `ml.c5.xlarge` instance with a minimum instance count of 1 and a maximum instance count of 3 (configurable by the user), with an [automatic scaling policy](https://docs.aws.amazon.com/sagemaker/latest/dg/endpoint-auto-scaling.html) configured.
## Prerequisites
-To implement the solution provided in this post, you should have an [AWS account](https://signin.aws.amazon.com/signin?redirect_uri=https%3A%2F%2Fportal.aws.amazon.com%2Fbilling%2Fsignup%2Fresume&client_id=signup) and familarity with SageMaker, Amazon S3, and PrestoDB.
+To implement the solution provided in this post, you should have an [AWS account](https://signin.aws.amazon.com/signin?redirect_uri=https%3A%2F%2Fportal.aws.amazon.com%2Fbilling%2Fsignup%2Fresume&client_id=signup) and familiarity with SageMaker, S3, and PrestoDB.
The following prerequisites need to be in place before running this code.
#### PrestoDB
-We will use the built-in datasets available in PrestoDB for this repo. Following the instructions below to setup PrestoDB on an Amazon EC2 instance in your account. ***If you already have access to a PrestoDB instance then you can skip this section but keep its connection details handy (see the `presto` section in the [`config`](./config.yml) file)***.
+We will use the built-in datasets available in PrestoDB for this repo. Follow the instructions below to set up PrestoDB on an Amazon EC2 instance in your account. ***If you already have access to a PrestoDB instance then you can skip this section but keep its connection details handy (see the `presto` section in the [`config`](./config.yml) file)***.
1. Create a security group to limit access to Presto. Create a security group called **MyPrestoSG** with two inbound rules to only allow access to Presto.
- Create the first rule to allow inbound traffic on port 8080 to Anywhere-IPv4
@@ -229,16 +229,18 @@ Setup a secret in Secrets Manager for the PrestoDB username and password. Call t
## Testing the solution
-Once the prerequisite steps are complete and the config.yml file is set up correctly, we are now ready to run the "mlops-pipeline-prestodb" implementation:
+Once the prerequisites are complete and the config.yml file is set up correctly, we are now ready to run the `mlops-pipeline-prestodb` implementation. Follow the steps below or access the [GitHub repository](https://github.com/aws-samples/mlops-pipeline-prestodb/tree/main) to walk through the solution:
1. On the SageMaker console, or your IDE of choice, choose **0_model_training_pipeline.ipynb** in the navigation pane.
When the notebook is open, on the Run menu, choose **Run All Cells** to run the code in this notebook. This notebook demonstrates how SageMaker Pipelines can be used to string together a sequence of data processing, model training, tuning and evaluation steps to train a binary classification machine learning model using scikit-learn. The trained model can then be used for batch inference, or hosted on a SageMaker endpoint for realtime inference.
- - **Preprocess data step**: In this step of the notebook, we set our pipeline input parameters when triggering our pipeline execution. We use a [preprocess script](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/code/presto_preprocess_for_training.py) which is read to connect to presto and query data, that is then sent to an Amazon S3 bucket split into train, test and validation datasets. Using these files, this step can then use the data for training the model.
+ - **Preprocess data step**: In this step of the notebook, we set our pipeline input parameters when triggering our pipeline execution. We use a [preprocess script](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/code/presto_preprocess_for_training.py) that connects to the Presto server on our EC2 instance and queries data (using the query specified and configurable in the [config file](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/config.yml)); the data is then sent to an S3 bucket and split into train, test and validation datasets. Using the data in these files, we can train our machine learning model.
- - We use the sklearn_processor in a SageMaker Pipelines ProcessingStep and define it as given below:
+ - We use the [sklearn_processor](https://docs.aws.amazon.com/sagemaker/latest/dg/use-scikit-learn-processing-container.html) in a SageMaker Pipelines ProcessingStep and define it as given below:
``` {.python}
+ # declare the sklearn processor
step_args = sklearn_processor.run(
+ ## code refers to the data preprocessing script that is responsible for querying data from the Presto server
code=config['scripts']['preprocess_data'],
source_dir=config['scripts']['source_dir'],
outputs=outputs_preprocessor,
@@ -260,7 +262,7 @@ Once the prerequisite steps are complete and the config.yml file is set up corre
)
```
- - We are using the `config['scripts']['source_dir']` which refers to our data preprocessing script that connects to the EC2 instance where the presto server runs. This script is responsible for extracting data from the query that you define. You can query the data you want by modifying the query parameter in the [config file](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/config.yml) query parameter. We are using the sample query as an example to extract open source `TPCH data` on orders, discounts and order priorities.
+ - We are using the `config['scripts']['source_dir']` which refers to our data preprocessing script that connects to the EC2 instance where the Presto server runs. This script is responsible for extracting data from the query that you define. You can query the data you want by modifying the query parameter in the ***config.yml*** file. We are using the sample query as an example to extract open source `TPCH data` on orders, discounts and order priorities.
``` {.sql}
SELECT
@@ -288,13 +290,14 @@ Once the prerequisite steps are complete and the config.yml file is set up corre
LIMIT 5000
```
- - **Train Model Step**: In this step of the notebook, we use the SKLearn estimator from SageMaker SDK and the RandomForestClassifier from scikit-learn to train the ML model. The HyperparameterTunerclass is used for running automatic model tuning to determine the set of hyperparameters that provide the best performance (maximize the AUC metric). We use the train, test and validation files that are sent to an S3 bucket after querying it from PrestoDB.
+ - **Train Model Step**: In this step of the notebook, we use the [SKLearn estimator](https://sagemaker.readthedocs.io/en/stable/frameworks/sklearn/sagemaker.sklearn.html) from the SageMaker SDK and the RandomForestClassifier from scikit-learn to train the ML model. The [HyperparameterTuner class](https://sagemaker.readthedocs.io/en/stable/api/training/tuner.html) is used for running automatic model tuning to determine the set of hyperparameters that provide the best performance based on a given metric threshold (maximize the AUC metric).
- - In the code below, the `sklearn_estimator` object is created with parameters that are configured in the [config file](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/config.yml) and uses this [training script](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/code/training.py) to train the ML model. The hyperparameters are also configurable by the user via the config file.
+ - In the code below, the `sklearn_estimator` object is created with parameters that are configured in the [config file](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/config.yml) and uses this [training script](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/code/training.py) to train the ML model. The hyperparameters are also configurable by the user via the config file. This step accesses the train, test and validation files that are created as a part of the previous data preprocessing step.
``` {.python} sklearn_estimator = SKLearn( + # we configure the training script that accesses the train, test and validation files from the data preprocessing step entry_point=config['scripts']['training_script'], role=role, instance_count=config['training_step']['instance_count'], @@ -302,6 +305,7 @@ Once the prerequisite steps are complete and the config.yml file is set up corre framework_version=config['training_step']['sklearn_framework_version'], base_job_name=config['training_step']['base_job_name'], hyperparameters={ + # Hyperparameters are fetched and are configured in the config.yml file "n_estimators": config['training_step']['n_estimators'], "max_depth": config['training_step']['max_depth'], "features": config['training_step']['training_features'], @@ -323,7 +327,8 @@ Once the prerequisite steps are complete and the config.yml file is set up corre metric_definitions=config['tuning_step']['metric_definitions'], max_parallel_jobs=config['tuning_step']['maximum_parallel_training_jobs'], ## reducing this for testing purposes ) - + + # declare a tuning step to use the train and test data to tune the ML model using the `HyperparameterTuner` declared above step_tuning = TuningStep( name=config['tuning_step']['step_name'], tuner=rf_tuner, @@ -342,11 +347,9 @@ Once the prerequisite steps are complete and the config.yml file is set up corre ) ``` - - - - **Evaluate model step**: The purpose of this step is to check that the trained and tuned model has an accuracy level above a configurable threshold and only then register the model with the model registry (from where it can be subsequently approved and deployed). If the model accuracy does not meet a configured threshold then the pipeline fails and the model is not registered with the model registry. We use the [`ScriptProcessor`](https://docs.aws.amazon.com/sagemaker/latest/dg/processing-container-run-scripts.html) with an [evaluation script](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/code/evaluate.py) that a user creates to evaluate the trained model based on metrics of choice. + - **Evaluate model step**: The purpose of this step is to check if the trained and tuned model has an accuracy level above a configurable threshold and only then register the model with the model registry (from where it can be subsequently [approved and deployed](https://docs.aws.amazon.com/sagemaker/latest/dg/model-registry-approve.html)). If the model accuracy does not meet a configured threshold then the pipeline fails and the model is not registered with the model registry. We use the [`ScriptProcessor`](https://docs.aws.amazon.com/sagemaker/latest/dg/processing-container-run-scripts.html) with an [evaluation script](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/code/evaluate.py) that a user creates to evaluate the trained model based on a metric of choice. - - Once this step is run, an `Evaluation Report` is generated that is sent to the S3 bucket: + - Once this step is run, an `Evaluation Report` is generated that is sent to the S3 bucket for analysis: ``` {.python} @@ -357,7 +360,7 @@ Once the prerequisite steps are complete and the config.yml file is set up corre ``` - - The evaluation step uses the [evaluation script](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/code/evaluate.py) as a code entry in the step below: + - The evaluation step uses the [evaluation script](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/code/evaluate.py) as a code entry in the step below. 
This script prepares the features and target values and calculates the prediction probabilities using `model.predict`. The evaluation report sent to S3 contains information on metrics like `precision, recall, accuracy`.
``` {.python}
step_evaluate_model = ProcessingStep(
@@ -399,7 +402,7 @@ Once the prerequisite steps are complete and the config.yml file is set up corre
)
```
- - **Register model step**: Once the trained model meets the model performance requirements, a new version of the model is registered with the [model registry](https://docs.aws.amazon.com/sagemaker/latest/dg/model-registry.html) for further analysis.
+ - **Register model step**: Once the trained model meets the model performance requirements, a new version of the model is registered with the [model registry](https://docs.aws.amazon.com/sagemaker/latest/dg/model-registry.html) for further analysis and model creation.
``` {.python}
# Create a RegisterModel step, which registers the model with SageMaker Model Registry.
@@ -419,7 +422,7 @@ Once the prerequisite steps are complete and the config.yml file is set up corre
```
***The model is registered with the Model Registry with approval status set to PendingManualApproval, which means the model cannot be deployed on a SageMaker Endpoint unless its status in the registry is changed to Approved manually via the SageMaker console, programmatically or through a Lambda function.***
- Adding conditions to the pipeline is done with a ConditionStep. In this case, we only want to register the new model version with the model registry if the new model meets a specific accuracy condition:
+ Adding conditions to the pipeline is done with a [ConditionStep](https://sagemaker.readthedocs.io/en/stable/workflows/pipelines/sagemaker.workflow.pipelines.html). In this case, we only want to register the new model version with the model registry if the new model meets a specific accuracy condition:
``` {.python}
@@ -445,12 +448,14 @@ Once the prerequisite steps are complete and the config.yml file is set up corre
name=config['condition_step']['step_name'],
conditions=[cond_gte],
if_steps=[step_register_model],
- else_steps=[step_fail], ## if this fails - add a step here (from the quip)
+ else_steps=[step_fail], ## if this fails
)
```
+
+ If the accuracy condition is not met, a `step_fail` step is executed that sends an error message to the user and the pipeline fails.
- - **Orchestrate all steps and start the pipeline**: Once you have created the pipeline steps as above, you can instantiate and start it with custom parameters making the pipeline agnostic to who is triggering it, but also to the scripts and data used. The pipeline can be started using the CLI, the SageMaker Studio UI or the SDK and below there is a screenshot of what it looks like in the SageMaker Studio UI.
+ - **Orchestrate all steps and start the pipeline**: Once you have created the pipeline steps as above, you can instantiate and start the pipeline with custom parameters, making it agnostic not only to who is triggering it but also to the scripts and data used. The pipeline can be started using the CLI, the SageMaker Studio UI or the SDK.
``` {.python}
# Start pipeline with credit data and preprocessing script
@@ -474,14 +479,14 @@ Once the prerequisite steps are complete and the config.yml file is set up corre
print_pipeline_execution_summary(execution.list_steps(), pipeline.name)
```
- **At the end of the training pipeline, your pipeline structure on Amazon SageMaker should look like this:**
-
+ **At the end of executing the entire training pipeline, your pipeline structure on [Amazon SageMaker Pipelines](https://docs.aws.amazon.com/sagemaker/latest/dg/pipelines-sdk.html) should look like this:**
+ ![Training Pipeline Structure](images/training_pipeline.png){#fig-open-jl}
- ***Now that the model is registered, get access to the registered model manually on the sagemaker studio model registry console, or programmatically in the next notebook, approve it and run the second portion of this solution: Batch Transform Step***
+ ***Now that the model is registered, you can get access to the registered model manually on the SageMaker Studio model registry console, or programmatically in the next notebook, approve it, and run the second portion of this solution: the Batch Transform Step***
1. Next, choose [`1_batch_transform_pipeline.ipynb`](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/1_batch_transform_pipeline.ipynb). When the notebook is open, on the Run menu, choose **Run All Cells** to run the code in this notebook. This notebook will run a batch transform step using the model trained in the previous notebook. It does so by running the following steps:
- - **Extract the latest approved model from the SageMaker model registry**: In this step, we first define pipeline input parameters that are used for the ***EC2 instance types to use for processing and training***
+ - **Extract the latest approved model from the SageMaker model registry**: In this step, we first define pipeline input parameters that set the ***EC2 instance types to use for the processing and training steps***. These parameters can be configured in the [config.yml](./config.yml) file.
``` {.python}
# What instance type to use for processing.
@@ -533,7 +538,10 @@ Once the prerequisite steps are complete and the config.yml file is set up corre
)
```
- - **Read raw data for inference from PrestoDB and stores in an Amazon S3 bucket**: Once the latest model is approved, we get the latest batch data from presto and use that for our batch transform step. In this step, we use another [batch preprocess script](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/code/presto_preprocess_for_batch_inference.py) that is dedicated to reading and fetching data from presto and saving in a batch directory.
+ Now we have extracted the latest model from the SageMaker Model Registry and programmatically approved it. You can also approve the model manually on the [SageMaker Model Registry](https://docs.aws.amazon.com/sagemaker/latest/dg/model-registry.html) page in SageMaker Studio, as shown in the screenshot below.
+ ![SageMaker Model Registry: Manual Model Approval via SageMaker Studio](images/sagemaker_model_registry.png){#fig-open-jl}
+
+ - **Read raw data for inference from PrestoDB and store in an Amazon S3 bucket**: Once the latest model is approved, we get the latest batch data from Presto and use that for our batch transform step.
In this step, we use another [batch preprocess script](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/code/presto_preprocess_for_batch_inference.py) that is dedicated to reading and fetching data from Presto and saving it in a batch directory within our S3 bucket.
``` {.python}
## represents the output processing for the batch pre processing step
@@ -556,6 +564,7 @@ Once the prerequisite steps are complete and the config.yml file is set up corre
# Use the sklearn_processor's run method and configure the batch preprocessing step
step_args = sklearn_processor.run(
+ # here, we add a `code` entry point that uses the data preprocess script for collecting data in a batch and storing it in S3
code=config['scripts']['batch_transform_get_data'],
source_dir=config['scripts']['source_dir'],
outputs=batch_output,
@@ -576,7 +585,7 @@ Once the prerequisite steps are complete and the config.yml file is set up corre
)
```
- Now, with the image uri, we refer to the ['inference.py'](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/code/inference.py) script that grabs information on features to use while making predictions. Using this, we will create the model which automatically trigger the training and the preprocess data step Run the transformer step on the created model.
+ Once the batch data preparation step is complete, we declare a model with the image uri and refer to the ['inference.py'](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/code/inference.py) script, which grabs information on the features to use while making predictions. Using this, we create the model (which automatically triggers the training and data preprocessing steps) and then run the transformer step on the created model.
``` {.python}
# create the model image based on the model data and refer to the inference script as an entry point for batch inference
@@ -636,11 +645,7 @@ Once the prerequisite steps are complete and the config.yml file is set up corre
response = sagemaker_client.start_pipeline_execution(
PipelineName=batch_transform_pipeline.name
)
- ```
-
- Once the pipeline run has completed, run batch inference and view the output below:
-
- ``` {.python}
+
while True:
resp = client.describe_pipeline_execution(
PipelineExecutionArn=response['PipelineExecutionArn']
@@ -648,13 +653,12 @@ Once the prerequisite steps are complete and the config.yml file is set up corre
status = resp['PipelineExecutionStatus']
```
- **At the end of the batch transform pipeline, your pipeline structure on Amazon SageMaker should look like this:**
-
-
+ **At the end of the batch transform pipeline, your pipeline structure on Amazon SageMaker Pipelines should look like this:**
+ ![Batch Transform Pipeline Structure](images/batch_transform_pipeline.png){#fig-open-jl}
-1. Lastly, Choose [`2_realtime_inference.ipynb`](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/2_realtime_inference.ipynb). When the notebook is open, on the Run menu, choose **Run All Cells** to run the code in this notebook. This notebook extracts the latest approved model from the model registry and deploys it as a realtime endpoint. It does so by running the following steps:
+1. Lastly, choose [`2_realtime_inference.ipynb`](https://github.com/aws-samples/mlops-pipeline-prestodb/blob/main/2_realtime_inference.ipynb). When the notebook is open, on the Run menu, choose **Run All Cells** to run the code in this notebook.
This notebook extracts the latest approved model from the model registry and deploys it as a SageMaker endpoint for real time inference. It does so by running the following steps:
- - **Extract the latest approved model from the SageMaker model registry**: To deploy a real time SageMaker endpoint, we first will fetch the `image uri` to use and extract the latest approved model the same way we did in the prior batch transfrom notebook. Once you have extracted the latest approved model, use a container list with the specific `inference.py` file to create the model and run inferences against.
+ - **Extract the latest approved model from the SageMaker model registry**: To deploy a real time SageMaker endpoint, we will first fetch the `image uri` to use and extract the latest approved model the same way we did in the prior batch transform notebook. Once you have extracted the latest approved model, use a container list with the specific `inference.py` file to create the model and run inferences against. This model creation and endpoint deployment is specific to the [Scikit-learn model](https://sagemaker.readthedocs.io/en/stable/frameworks/sklearn/sagemaker.sklearn.html) configuration and will change based on your use case.
``` {.python}
container_list = [{
@@ -674,7 +678,7 @@ Once the prerequisite steps are complete and the config.yml file is set up corre
)
```
- In this code, we use the inference.py file specific to the Scikit Learn model. We then create our endpoint configuration, setting our `ManagedInstanceScaling` to `ENABLED` with our desired `MaxInstanceCount` and `MinInstanceCount`.
+ In this code, we use the inference.py file specific to the Scikit Learn model. We then create our endpoint configuration, setting our `ManagedInstanceScaling` to `ENABLED` with our desired `MaxInstanceCount` and `MinInstanceCount` for automatic scaling.
``` {.python}
create_endpoint_config_response = sm.create_endpoint_config(
@@ -694,7 +698,7 @@ Once the prerequisite steps are complete and the config.yml file is set up corre
}])
```
- - **Runs inferences for testing the real time deployed endpoint**: Once you have extracted the latest approved model, created the model from the desired image uri and configured the endpoint configuration, you can deploy it as a real time SageMaker endpoint below:
+ - **Run inferences for testing the real time deployed endpoint**: Once you have extracted the latest approved model, created the model from the desired image uri and configured the endpoint configuration, you can then deploy it as a real time SageMaker endpoint below:
``` {.python}
create_endpoint_response = sm.create_endpoint(
@@ -726,6 +730,8 @@ Once the prerequisite steps are complete and the config.yml file is set up corre
response_str
```
+ We have now seen the end-to-end process of our solution: fetching data by connecting to a Presto server on an EC2 instance, followed by training, evaluating, and registering the model. We then approved the latest registered model from our training pipeline solution and ran batch inference against batch data stored in S3. We finally deployed the latest approved model as a real time SageMaker endpoint to run inferences against. Take a look at the results below.
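+
+ As a reference, here is a minimal, hypothetical sketch of how an application outside of the notebook could invoke the deployed endpoint with the boto3 SageMaker runtime client. The endpoint name and the CSV payload below are placeholders; the real values must match the endpoint created above and the training features configured in the config.yml file.
+
+ ``` {.python}
+ import boto3
+
+ # placeholder endpoint name; the actual name comes from the create_endpoint call above
+ endpoint_name = "mlops-prestodb-realtime-endpoint"
+
+ # a single CSV record with illustrative values for the features the model was trained on
+ payload = "1,5.5,0.02,3"
+
+ runtime = boto3.client("sagemaker-runtime")
+ response = runtime.invoke_endpoint(
+     EndpointName=endpoint_name,
+     ContentType="text/csv",
+     Body=payload,
+ )
+
+ # the scikit-learn serving container returns the prediction in the response body
+ print(response["Body"].read().decode("utf-8"))
+ ```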
+ ## Results Here is a compilation of some queries and responses generated by our implementation from the real time endpoint deployment stage: diff --git a/images/sagemaker_model_registry.png b/images/sagemaker_model_registry.png new file mode 100644 index 0000000..c7d1526 Binary files /dev/null and b/images/sagemaker_model_registry.png differ