This repository has been archived by the owner on Apr 11, 2019. It is now read-only.

WARNING: This repository is no longer maintained ⚠️

This repository will not be updated. The repository will be kept available in read-only mode. Refer to https://github.com/IBM/monitor-wml-model-with-watson-openscale for a similar example.

Prediction Using Watson Machine Learning

DISCLAIMER: This notebook is used for demonstrative and illustrative purposes only and does not constitute an offering that has gone through regulatory review. It is not intended to serve as a medical application. There is no representation as to the accuracy of the output of this application and it is presented without warranty.

In this Code Pattern, we will use anonymous patient data to predict the best drug to treat heart disease. This notebook introduces commands for getting data, persisting the model to the Watson Machine Learning repository, deploying the model, and scoring.

When the reader has completed this Code Pattern, they will understand how to:

  • Prepare data, create an Apache Spark machine learning pipeline, and train a model.
  • Publish a sample model in the Watson Machine Learning (WML) repository.
  • Deploy a model for online scoring.
  • Score the model using sample scoring records and the scoring endpoint.
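To make the scoring step more concrete, here is a minimal sketch of how sample scoring records might be shaped into a request body. It assumes the `fields`/`values` JSON layout used by Watson Machine Learning online scoring. The helper `build_scoring_payload` is illustrative only and not part of any IBM library, and the column names and values are hypothetical placeholders rather than the notebook's actual schema.

```python
import json


def build_scoring_payload(records):
    """Turn a list of dicts into a {"fields": [...], "values": [[...]]} body.

    All records must share the same keys; the key order of the first
    record defines the column order.
    """
    if not records:
        raise ValueError("at least one scoring record is required")
    fields = list(records[0].keys())
    values = []
    for record in records:
        if set(record) != set(fields):
            raise ValueError("all records must have the same fields")
        values.append([record[name] for name in fields])
    return {"fields": fields, "values": values}


# Hypothetical record; the real column names come from the training table.
sample = [{"AGE": 43, "SEX": "M", "CHOLESTEROL": "NORMAL"}]
payload = build_scoring_payload(sample)
print(json.dumps(payload))
```

The resulting dictionary is what would be sent, serialized as JSON, to the scoring endpoint created when the model is deployed.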

architecture

Flow

  1. User creates a project in Watson Studio using a Jupyter notebook, Python 3.5, and Spark.
  2. User uses Db2 Warehouse in the Cloud to load and read data.
  3. User uses PySpark to create a pipeline, train a model, and store the model using Watson Machine Learning.

Watch the Video

video

Prerequisites

Steps

  1. Clone the repository
  2. Create Watson services in IBM Cloud
  3. Save the credentials for your Watson Machine Learning Service
  4. Create the Db2 Warehouse on Cloud service and load data
  5. Create a notebook in IBM Watson Studio
  6. Run the notebook in IBM Watson Studio

1. Clone the repository

$ git clone https://github.com/IBM/prediction-using-watson-machine-learning
$ cd prediction-using-watson-machine-learning

2. Create Watson services in IBM Cloud

  • Create a new project by clicking + New project and choosing Data Science:

project choices

Note: Services created must be in the same region and space as your Watson Studio service.

Note: If this is your first project in Watson Studio, an object storage instance will be created.

  • Under the Settings tab, scroll down to Associated services, click + Add service and choose Watson:

add service

  • Search for Machine Learning, verify that the service is being created in the same space as the app in Step 1, and click Create.

    create machine learning

  • Alternatively, you can choose an existing Machine Learning instance and click Select.

    add existing ML

  • The Watson Machine Learning service is now listed as one of your Associated Services.

  • Click on the Settings tab for the project, scroll down to Associated services and click + Add service -> Spark.

  • Either choose an Existing Spark service, or create a New one.

    add existing Spark

    add new Spark

3. Save the credentials for your Watson Machine Learning Service

  • In a different browser tab go to https://cloud.ibm.com/ and log in to the Dashboard.

  • Click on your Watson Machine Learning instance under Services, click on Service credentials and then on View credentials to see the credentials.

    ML credentials

  • Save the username, password and instance_id to a text file on your machine. You’ll need this information later in your Jupyter notebook.
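If you save the credentials as JSON rather than free text, a small check like the following can catch a missing value before you reach the notebook. This is only a sketch: `load_wml_credentials` is a hypothetical helper, the values shown are placeholders, and the only field names assumed are the three listed above (`username`, `password`, `instance_id`).

```python
import json

REQUIRED_KEYS = ("username", "password", "instance_id")


def load_wml_credentials(text):
    """Parse credentials JSON text and check the three values named above."""
    creds = json.loads(text)
    # Treat absent and empty values alike: both are unusable in the notebook.
    missing = [key for key in REQUIRED_KEYS if not creds.get(key)]
    if missing:
        raise KeyError("missing credential fields: " + ", ".join(missing))
    # Keep only the required fields; any extra keys are ignored.
    return {key: creds[key] for key in REQUIRED_KEYS}


# Placeholder values only; paste your own from View credentials.
example_text = (
    '{"username": "my-user", "password": "my-pass", '
    '"instance_id": "my-id", "url": "https://example.invalid"}'
)
wml_credentials = load_wml_credentials(example_text)
print(sorted(wml_credentials))
```

To use it with a saved file, read the file's contents into a string and pass that string to the helper.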

4. Create the Db2 Warehouse on Cloud service and load data

  • Create a Db2 Warehouse on Cloud Service instance (an entry plan is offered).

  • Get the authentication information for Db2 from the Service Credentials tab of the Db2 Warehouse on Cloud service instance created in IBM Cloud. If you do not have any credentials yet, click New credential to create them.

  • Create the DRUG_TRAIN_DATA_UPDATED table in Db2 Warehouse on Cloud. You will use the drug_train_data_updated.csv file from this git repository.

  • Click the Open icon to open the console.

DB2 Cloud console

  • Use the username and password from the service credentials to log in.

  • Click the Load Data icon.

DB2 Cloud load data

  • Drag and drop or browse to the data/drug_train_data_updated.csv file and press Next.

  • Under Select a load target -> Schema, click the schema that matches the username from your credentials.

  • Under Table click New Table.

  • Write the name DRUGTRAINDATA in Create a new Table -> New Table Name and click Create. Click Next to finish data import.

  • Use ; as field separator.

DB2 chose separator

  • Click Next to create a table with the uploaded data.
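Before uploading, it can help to confirm that the file really parses with `;` as the field separator. The sketch below uses Python's standard `csv` module. `summarize_delimited` is a hypothetical helper, and the sample text is made up for illustration and is not taken from `drug_train_data_updated.csv`.

```python
import csv
import io


def summarize_delimited(text, delimiter=";"):
    """Return (header, row_count); fail if any row differs in width."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    row_count = 0
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} fields, got {len(row)}"
            )
        row_count += 1
    return header, row_count


# Made-up sample, not taken from drug_train_data_updated.csv.
sample_text = "AGE;SEX;DRUG\n43;M;drugX\n51;F;drugY\n"
header, rows = summarize_delimited(sample_text)
print(header, rows)
```

If the header comes back as a single long column, the separator is probably wrong, which is the same symptom you would see in the load preview.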

5. Create a notebook in IBM Watson Studio

6. Run the notebook in IBM Watson Studio

  • Enter your Db2 Warehouse credentials in the cell after 2.1 Load the training data from Db2 Warehouse on Cloud.

  • Enter your Watson Machine Learning credentials in the cell after Action: Enter your Watson Machine Learning service instance credentials here.

  • Move your cursor to each code cell and run the code in it. Read the comments for each cell to understand what the code is doing. Important: while the code in a cell is still running, the label to the left changes to In [*]:. Do not continue to the next cell until the code has finished running.
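As a rough illustration of what the Db2 credentials cell feeds into, the sketch below assembles options for a Spark JDBC read. It is not the notebook's own code. The credential key names (`hostname`, `port`, `db`, `username`, `password`) are an assumption about the service credentials JSON, the values are placeholders, and `build_jdbc_options` is a hypothetical helper.

```python
def build_jdbc_options(creds, table):
    """Build keyword options for a Spark JDBC read from Db2 credentials."""
    # Standard Db2 JDBC URL form: jdbc:db2://<host>:<port>/<database>
    url = "jdbc:db2://{hostname}:{port}/{db}".format(**creds)
    return {
        "url": url,
        "dbtable": table,
        "user": creds["username"],
        "password": creds["password"],
        "driver": "com.ibm.db2.jcc.DB2Driver",
    }


# Placeholder credentials; the key names are an assumption.
db2_creds = {
    "hostname": "db2.example.invalid",
    "port": 50000,
    "db": "BLUDB",
    "username": "my-user",
    "password": "my-pass",
}
options = build_jdbc_options(db2_creds, "DRUGTRAINDATA")
print(options["url"])
# In a notebook with a Spark session available, the options could be used as:
#   df = spark.read.format("jdbc").options(**options).load()
```

Keeping the credentials in one dictionary means only that cell has to change when you switch service instances.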

Sample Output

sample output

sample output

License

This code pattern is licensed under the Apache Software License, Version 2. Separate third party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 (DCO) and the Apache Software License, Version 2.

Apache Software License (ASL) FAQ