diff --git a/1- Intro to Machine Learning Decision Trees.ipynb b/1- Intro to Machine Learning Decision Trees.ipynb
deleted file mode 100644
index 38a1c28..0000000
--- a/1- Intro to Machine Learning Decision Trees.ipynb
+++ /dev/null
@@ -1,1032 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "id": "85be35e8-c134-4ba4-ac3d-fe95bc106ff4",
- "metadata": {
- "tags": []
- },
- "source": [
- "# **WELCOME to Introduction to Machine Learning: Decision Trees!**\n",
- "\n",
- "This notebook was created by Lucy Moctezuma Tan, Florentine van Nouhuijs, Lorena Benitez-Rivera (SFSU master's students and CoDE lab members), and Pleuni Pennings (SFSU bio professor).\n",
- "Special acknowledgment to Faye Orcales for pulling the COVID data tables from government websites.\n",
- "\n",
- "**Data sources:**\n",
- "- [COVID cases data (California Health and Human Services Agency)](https://data.chhs.ca.gov/dataset/covid-19-time-series-metrics-by-county-and-state/resource/046cdd2b-31e5-4d34-9ed3-b48cdbc4be7a)\n",
- "- [COVID vaccination data (Los Angeles Times)](https://github.com/datadesk/california-coronavirus-data)\n",
- "- [Unemployment data (California Employment Development Dept.)](https://data.edd.ca.gov/Labor-Force-and-Unemployment-Rates/Local-Area-Unemployment-StatisticsdecisionLAUS-/e6gw-gvii)\n",
- "- [Election data (Harvard University)](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ)\n",
- "\n",
- "# **Introduction Video:**\n",
- "\n",
- "Many other machine learning techniques exist, but this notebook aims to give you a basic understanding of one fundamental technique: the Decision Tree. This is a good starting point because Decision Trees are the basis for more complex models such as Boosted Trees and Random Forests. Below is a general introduction video to machine learning by Lorena Benitez."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "3ed9d402-5f72-448f-9af1-dd2b30a7d970",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "#Run the command below to watch the video\n",
- "from IPython.display import YouTubeVideo\n",
- "\n",
- "YouTubeVideo('e3tGQykFC5M', width=800, height=400)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "c8edf990-07a7-4b8f-9344-3cee67bc389f",
- "metadata": {},
- "source": [
- "# OBJECTIVE OF THIS EXERCISE:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "9bb55afa-f5e8-43b8-9c9f-2fa8ffa3b233",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "#Run the command below to watch the video\n",
- "from IPython.display import YouTubeVideo\n",
- "\n",
- "YouTubeVideo('_kAjJ8rJwfU', width=800, height=400)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "70bbc537-63d0-44f9-8705-0f0fb3dc494d",
- "metadata": {},
- "source": [
- "\n",
- "We are going to be working with **COVID data** from the 58 counties of California during Summer 2020 (July, August, and September). \n",
- "\n",
- "**Remember the complete dataset with 58 counties from the previous video of this workshop?** \n",
- "\n",
- "Let's now imagine that we did not know the **cases per 100,000 people** for the last 18 counties of the dataset. \n",
- "\n",
- "![Features-for-Prediction.jpg](images/Features-for-Prediction.jpg)\n",
- "\n",
- "**The objective** of this exercise will be to make predictions for these missing values in the column **cases per 100,000 people** based solely on the data that we do have available.\n",
- "\n",
- "The information that we still have available for these 18 counties are:\n",
- "\n",
- "* Population\n",
- "* Vaccination Percentage (Partial and Fully vaccinated)\n",
- "* Unemployment Rates\n",
- "* Partisan Voting Percentage (Democrat, Green, Republican, Libertarian, and Other)\n",
- "\n",
- "In order to do this, we will be creating a **DECISION TREE**"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "be416a6d-441d-4f13-a4f3-4d3600ce3df4",
- "metadata": {},
- "source": [
- "# WHAT IS A DECISION TREE?\n",
- "\n",
- " A **Decision Tree** is a supervised machine learning model that allows us to make predictions by learning simple decision rules that are inferred using available information in the dataset. \n",
- " \n",
- "- A Decision Tree is called a **supervised** model because we know exactly what we want to figure out. For example, for our Decision Tree, we will specify that we want to figure out the missing values of the column **cases per 100,000 people**, and our model will try to find these values by making predictions for them using the information we do have available.\n",
- "\n",
- "- In contrast, in an **unsupervised** model, we do not know exactly what we want to predict. Instead, an unsupervised model finds hidden relationships between different types of information and can group them based on similarities. For example, this is how Netflix can surprise you with a new show you like.\n",
- "\n",
- "A **Decision Tree** can be pictured as a tree-like flowchart, where we start with a particular criterion and, based on whether it is True (Y for Yes) or False (N for No), we choose only one of the branches. This process is then repeated at every decision until we reach the bottom of the tree, where we end up with a specific prediction. \n",
- "\n",
- "![General-Decision-Tree.png](images/General-Decision-Tree.png)\n",
- "\n",
- "We will see how a Decision Tree can help us predict the missing **cases per 100,000 people** in more detail later on in this tutorial.\n",
- "\n",
- "You can find more information about different ways to classify machine learning models here: [Machine Learning Models](https://www.geeksforgeeks.org/introduction-machine-learning/?ref=lbp)\n",
- "\n",
- "You can find more information about Decision Trees here: [Scikit-learn](https://scikit-learn.org/stable/modules/tree.html)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "fbc617cd-9aac-463d-990d-a2e7d6ae86e4",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "!pip install jupyterquiz==2.0.7 --quiet\n",
- "from jupyterquiz import display_quiz\n",
- "\n",
- "display_quiz('quiz_files/quiz1.json')"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "c7d3fb42-e3f7-4b3b-a14e-c100c4382491",
- "metadata": {
- "tags": []
- },
- "source": [
- "## **Step 1) Importing necessary packages into the notebook**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "2b05d4b9-e1f5-4892-867b-35b81b717aa1",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "#Run the command below to watch the video\n",
- "from IPython.display import YouTubeVideo\n",
- "\n",
- "YouTubeVideo('jPIQbpdTkbM', width=800, height=400)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "d9580610-6745-44a1-99ca-2e71d88b3910",
- "metadata": {},
- "source": [
- "\n",
- "Before working on our model we need to import all packages and specific functions that we will need to use in order to work with our data. \n",
- "\n",
- "- **Packages** are essentially prepackaged code that others have made, that are often organized in chunks of code called modules. A package can contain many modules and these modules may contain several functions. \n",
- "\n",
- "- **Functions** are essentially a set of instructions to a computer that specify how to handle different types of files, what mathematical equations are used to calculate our model, how our graphs are going to be displayed, etc. \n",
- "\n",
- "The code in this notebook is organized in **cells**\n",
- "\n",
- "In the example below we will learn how to execute or \"run\" each of the three cells, so that our code actually takes effect. To run the code in a cell, select the cell and press the \"play\" button on the upper part of the notebook menu. \n",
- "\n",
- "**Note**: The lines of green text that are preceded by a \"#\" are called comments. They exist only to explain what each line or chunk of code does and are not executed as part of the code."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "5ddada1e-aae9-4b22-87ae-022e0b96a053",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# Data Wrangling Imports\n",
- "import pandas as pd\n",
- "import numpy as np"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "a24d2226-35ca-451b-85d4-757c93eab78a",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# Machine Learning Models Imports\n",
- "from sklearn import tree\n",
- "from sklearn.tree import DecisionTreeRegressor "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "6b29cad9-0cb2-40c8-959a-177b86019a3f",
- "metadata": {
- "scrolled": true,
- "tags": []
- },
- "outputs": [],
- "source": [
- "# Model Evaluation Imports and Visualization\n",
- "from matplotlib import pyplot as plt\n",
- "!pip install graphviz\n",
- "!conda install -c anaconda graphviz -y\n",
- "import graphviz\n",
- "# Quantitative metrics of Model performance\n",
- "from sklearn.metrics import mean_squared_error"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "ba92c612-e851-4c76-8551-4bf12daea579",
- "metadata": {},
- "source": [
- "## **Step 2) Loading training data and making sure it looks correct**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "4f7561be-fa34-479e-95d1-62ef4bcdfc29",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "#Run the command below to watch the video\n",
- "from IPython.display import YouTubeVideo\n",
- "\n",
- "YouTubeVideo('z9dcLYg65uk', width=800, height=400)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "002bea86-95cb-4076-b05a-aa34bcbceb15",
- "metadata": {},
- "source": [
- "Now that we have our tools, we can examine our dataset again. \n",
- "\n",
- "Recall that we are missing the last 18 values in the column \"cases per 100,000\", but we still have a big chunk of complete data (40 rows). This chunk of complete information is often referred to as **training data**.\n",
- "\n",
- "![Training-Data.jpg](images/Training-Data.jpg)\n",
- "\n",
- "**Training data** is a machine learning term that refers to the dataset used to teach our Decision Tree to make the predictions for our missing values using available data."
- ]
- },
- {
- "cell_type": "markdown",
- "id": "874fc63a-d033-410b-a080-adce8dd7185c",
- "metadata": {},
- "source": [
- "**A)** Let's start by loading our training data into the notebook:\n",
- "\n",
- "The data that you need for these lessons are stored in a Google Cloud Storage bucket. Before you begin, these files need to be copied from the bucket to your notebook using the `gsutil` utility. For more information, see [NIH CloudLab's documentation](https://scan.cloud.nih.gov/resources/cloudlab/google-cloud-jumpstart/#cli)."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "3ea8f894-d160-40e1-b6a4-6f837a7ca963",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# Download the data from Google Cloud Storage with gsutil\n",
- "!gsutil cp gs://nigms-sandbox/nosi-sfsu/data/* ."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "caab38bd-bbb1-478b-b17c-8bd15bd07802",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# This opens the file that contains the training data, data used to train the algorithm \n",
- "S2020_training = pd.read_csv(\"S2020_training.csv\")"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "cbaf2e30-2eaf-4a21-9932-99d7bf8c22ba",
- "metadata": {},
- "source": [
- "**B)** Make sure that your dataset loaded correctly. It should contain the county names and all the data highlighted in green in our last picture:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "3eabc08f-2002-4adb-b380-08c26ef968a6",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# This will display the entire dataset \n",
- "S2020_training"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "d2eb357a-859b-446d-bfed-514d0c20f819",
- "metadata": {},
- "source": [
- "**C)** If your dataset is too big to be displayed in full, you can sneak a peek at just its first 5 rows."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "1d1b826c-2346-4452-80db-cc34a731856c",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# This will display only the first 5 rows of our dataset\n",
- "S2020_training.head()"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "c1fafa09-54f7-422b-9f77-07d816beb038",
- "metadata": {},
- "source": [
- "**D)** Here we can see how many rows and columns the complete dataset actually has. In our example we should have (40 rows, 11 columns)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "af0570f1-4334-4be4-90fc-08aa4c425cb0",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# This will display the number of rows (not including the title of the columns) and the number of columns of our dataset\n",
- "S2020_training.shape"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "5e1e56f1-3454-4d23-8654-44d4a0d29f81",
- "metadata": {},
- "source": [
- "## **Step 3) Separate the training dataset into features and labels**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "b261c100-55d5-4d4d-b052-16d033395e30",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "#Run the command below to watch the video\n",
- "from IPython.display import YouTubeVideo\n",
- "\n",
- "YouTubeVideo('qh8C0QRECWU', width=800, height=400)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "435f149e-a9fe-4b6b-be5d-3033fb6d84d0",
- "metadata": {},
- "source": [
- "Recall that a Decision Tree is a **supervised** machine learning model, therefore we need to specify clearly what we are trying to predict.\n",
- "\n",
- "To do this we need to divide the training data into **labels** and **features**\n",
- "\n",
- "![Label-and-Features.jpg](images/Label-and-Features.jpg)\n",
- "\n",
- "- The RED outlined column is called a **LABEL**. This is a machine learning term that refers to the data that our model will learn to predict.\n",
- "\n",
- "- The BLUE outlined columns are called **FEATURES**, which is the term that refers to the columns we would like to use to predict our chosen LABEL. \n",
- "\n",
- "Because the **training data** is complete, we can clearly separate LABEL from FEATURES. Remember that the training data is only the red and blue shaded regions of our dataset. \n",
- "\n",
- "We can ignore the rest of the dataset for now.\n",
- "\n",
- "**A)** Separate the training data into features and labels:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "7feef3b4-b0b8-41d0-b4b1-c1294753bd3f",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# The label only includes the Summer 2020 cases per 100,000 people\n",
- "S2020_training_labels = S2020_training[\"cases_per_100000\"]\n",
- "\n",
- "# The features exclude the \"county\" column, because it does not contribute to our predictions, and \"cases_per_100000\", because that is our label\n",
- "S2020_training_features = S2020_training.drop(columns=[\"county\",\"cases_per_100000\"])"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "775e163b-ac56-40c8-a12c-89f13bfbce30",
- "metadata": {},
- "source": [
- "**B)** Run the **LABEL** to check that the separation was correctly performed (you should see 40 rows and just 1 column):"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "49d72928-e153-4617-808d-9fb0140ef3dd",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# This code allows you to see what the labels look like as a dataframe, after being separated from the training data\n",
- "S2020_training_labels = pd.DataFrame(S2020_training_labels,columns = [\"cases_per_100000\"])\n",
- "\n",
- "# This code tells you how many rows and columns this dataset has\n",
- "S2020_training_labels.shape"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "05b69eb8-d593-453b-9786-432dd425c612",
- "metadata": {},
- "source": [
- "**C)** Run the **FEATURES** to check that the separation was correctly performed (you should see all 40 rows and 9 columns only since we dropped the columns of \"county\" and \"cases_per_100000\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "1a75f401-1966-4628-b4fc-ee285bc8a595",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# This code shows how many rows and columns the features dataset has\n",
- "S2020_training_features.shape"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "3c55a9ca-5d66-4bd2-aaa5-82ce6ee305a6",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "display_quiz('quiz_files/quiz2.json')"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "6a76b309-f312-425f-a4ac-e2ef7dacbfe0",
- "metadata": {},
- "source": [
- "## **Step 4) Create a Decision Tree object and train it**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "d8b5abf2-94fe-4e2d-b24a-3b185b08c88e",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "#Run the command below to watch the video\n",
- "from IPython.display import YouTubeVideo\n",
- "\n",
- "YouTubeVideo('M6gY_JywOys', width=800, height=400)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "e82423a6-36cc-4898-aacb-727f98e9ffc1",
- "metadata": {},
- "source": [
- "After separating our training data into features and labels, we can now create a Decision Tree. \n",
- "\n",
- "**A)** Create a Decision Tree object"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "1af78ddd-dc00-4298-9732-94684886a7b7",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# This line creates the Decision Tree with your chosen specifications (what is written within the parentheses)\n",
- "dtr_summer2020 = DecisionTreeRegressor(random_state = 1, max_depth= 3)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "9a337b10-ecaf-4b2e-921f-5b830ac4f7a2",
- "metadata": {},
- "source": [
- "**B)** Train our Decision Tree using the training data we separated in the previous step"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "ac3b1e35-0609-4f64-bac6-c42537cb3694",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# This line trains the decision tree using both the features and the label from our training data\n",
- "dtr_summer2020 = dtr_summer2020.fit(S2020_training_features,S2020_training_labels)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "dbaec858-5a15-4727-b9fb-ac1d6bd08fd8",
- "metadata": {},
- "source": [
- "## **Step 5) Visualize our trained Decision Tree**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "c8c03608-f9d7-43d0-abd3-757b6196023e",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "#Run the command below to watch the video\n",
- "from IPython.display import YouTubeVideo\n",
- "\n",
- "YouTubeVideo('cFk6vmfU48w', width=800, height=400)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "96d80ff1-924c-4aef-ad9d-68634911deb9",
- "metadata": {},
- "source": [
- "Visualize our Decision Tree by graphing it using the following code "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "68789480-9f14-43ab-b49a-145bf462b132",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# Initialize tree data object \n",
- "dtr_summer2020_dot = tree.export_graphviz(dtr_summer2020, out_file=None, \n",
- " feature_names=S2020_training_features.columns, \n",
- " filled=False, rounded=True, impurity=False)\n",
- "\n",
- "# Draw graph\n",
- "dtr_graph = graphviz.Source(dtr_summer2020_dot, format=\"png\") \n",
- "dtr_graph"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "4b6af3a1-3ba0-4759-9883-6d03123548ae",
- "metadata": {},
- "source": [
- "### Let's try to understand what our tree learned!\n",
- "\n",
- "- **NODES** contain the decision that must be made based on a particular criterion. You can see that nodes have 2 arrows pointing away from them. All arrows to the LEFT are taken when the criterion is satisfied, and all arrows to the RIGHT are taken when the criterion is not satisfied.\n",
- "\n",
- "- **ROOT NODE** is the node located at the top of the tree. It holds the feature that best splits the data, which our model determined to be the most important feature to consider when making our predictions.\n",
- "\n",
- "- **LEAVES** contain the final outcome of the decision path. You can see that leaves do not have arrows pointing away from them."
- ]
- },
- {
- "cell_type": "markdown",
- "id": "9dd48904-36b7-4747-b9e9-8b6c985ff049",
- "metadata": {},
- "source": [
- "## **Step 6) Make predictions using Testing data with our trained Decision Tree**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "5d0d1906-bc34-4a59-b05f-99167175ac7c",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "#Run the command below to watch the video\n",
- "from IPython.display import YouTubeVideo\n",
- "\n",
- "YouTubeVideo('LtD93dB5JzU', width=800, height=400)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "c56def25-4c94-4de1-a0a8-02b5075988eb",
- "metadata": {},
- "source": [
- "We are now ready to make predictions for the counties that had the missing labels.\n",
- "\n",
- "**Below is an image showing what constitutes the testing data in our example**\n",
- "\n",
- "![Testing-Data.jpg](images/Testing-Data.jpg)\n",
- " \n",
- "In machine learning, the part of the dataset that contains only the FEATURE columns is usually called **testing data**. \n",
- "\n",
- "The **testing data** is the dataset that is used to predict the missing values of the LABEL column, based on the rules learned during the training phase.\n",
- "\n",
- "Recall that our Decision Tree model has only been taught using the training data (40 counties) and has never seen any of the rows of the testing data (18 counties)."
- ]
- },
- {
- "cell_type": "markdown",
- "id": "20da60cd-48c3-4acb-be33-dc4565d3947a",
- "metadata": {},
- "source": [
- "**A)** Let's load the testing data that correspond to the counties with the missing label and see what it looks like."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "d5c75c05-745c-4c0c-bf31-c839e43c9ed5",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# This opens the file that contains the testing data features, features = data used to make a prediction\n",
- "S2020_testing_features = pd.read_csv(\"S2020_test_features.csv\")\n",
- "\n",
- "# This lets you see the loaded testing data \n",
- "S2020_testing_features"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "046ef181-0158-4fb3-a726-42046fa4e713",
- "metadata": {},
- "source": [
- "**B)** Let's drop the county names from the dataset and make our predictions!"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "c9062cad-cde4-4596-a7e5-08203fd77dd7",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# This drops the \"county\" column from our test dataset\n",
- "S2020_features_test_nocounty = S2020_testing_features.drop(columns=[\"county\"])\n",
- "\n",
- "# This uses the tree we created and makes the predictions\n",
- "S2020_labels_pred = dtr_summer2020.predict(S2020_features_test_nocounty)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "33e5fb42-21b8-4983-ade9-83182ef070a4",
- "metadata": {},
- "source": [
- "**C.1)** Let's look at what labels our model predicted and check how it relates to our Decision Tree:\n",
- "\n",
- "![COVID-Decision-Tree.PNG](images/COVID-Decision-Tree.PNG)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "2b43dd45-625b-4495-9cb8-2bd08e459d75",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# This turns our predictions (which is currently an array) into a dataframe \n",
- "S2020_labels_preds_df = pd.DataFrame(S2020_labels_pred, columns=[\"Predicted\"])\n",
- "\n",
- "# This line adds the county name back, so that you can see what was predicted for each county\n",
- "S2020_labels_preds_df = pd.concat([S2020_testing_features[\"county\"].reset_index(drop=True),S2020_labels_preds_df.reset_index(drop=True)],axis=1)\n",
- "\n",
- "# This lets us see what was predicted\n",
- "S2020_labels_preds_df.round(3)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "9c5c9ddc-f972-424a-baaf-e5876ff0d847",
- "metadata": {},
- "source": [
- "**C.2)** Why did the model predict 702.806 for San Francisco County?\n",
- "\n",
- "Run the cell below and look at the output. Follow the tree as described in the video to see that this county has: \n",
- "- Unemployment Rate <= 0.123\n",
- "- Population > 28453.0\n",
- "- Green_votes_percentage > 0.005\n",
- "\n",
- "Feel free to try another county and check for yourself that it follows these rules, by changing the county name in the code below:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "ff7dd8fa-a9b3-4570-923a-882f198d53b3",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# Loading the testing features for San Francisco County\n",
- "S2020_testing_features[S2020_testing_features['county']=='San Francisco'] # change 'San Francisco' to any other county in the list above that you are interested in"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "f295afb6-142f-4d8e-9ef3-5658916f3624",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "display_quiz('quiz_files/quiz3.json')"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "aff22ea7-270b-47dd-ba7f-a2a62ba1cfe5",
- "metadata": {},
- "source": [
- "## **Step 7) Let's see how our Decision Tree model performed**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "9a11c706-b8ba-4b5c-bfde-eed545a00248",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "#Run the command below to watch the video\n",
- "from IPython.display import YouTubeVideo\n",
- "\n",
- "YouTubeVideo('0VK4sLz2wrc', width=800, height=400)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "a9d8738a-f8b2-4679-9ed5-d7748aaf6bf9",
- "metadata": {},
- "source": [
- "Now that we have predicted the missing labels for Summer 2020 cases, let's see how our model did by comparing it with the actual labels!\n",
- "\n",
- "**A)** Let's now reveal our ACTUAL labels by loading them into the notebook"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "3cd40255-1bd5-4d11-8f6f-1b9bd29718c0",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# This opens the file that contains the testing data labels, label = what we want to predict\n",
- "S2020_testing_labels = pd.read_csv(\"S2020_test_labels.csv\")\n",
- "\n",
- "# This drops the county column from our label data so that the dataframe only has one column with county names when it is joined with the predicted dataframe\n",
- "S2020_testing_labels = S2020_testing_labels.drop(columns=[\"county\"])"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "4037a417-05e3-43e5-b63e-a3562da3c44b",
- "metadata": {},
- "source": [
- "**B)** We can use a bar graph to help us visually inspect how our model performed"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "e7cc9c31-b0cd-4bd6-a544-4c0b143d50c8",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# This puts into a single dataframe our predictions with our original test labels \n",
- "pred_vs_test_2020 = pd.concat([S2020_testing_labels.reset_index(drop=True),S2020_labels_preds_df.reset_index(drop=True)],axis=1)\n",
- "\n",
- "# Reorganize the order of columns\n",
- "pred_vs_test_2020 = pred_vs_test_2020.loc[:,[\"county\", \"cases_per_100000\",\"Predicted\"]]\n",
- "\n",
- "# This plots the data in a barchart per county\n",
- "pred_vs_test_plot = pred_vs_test_2020.plot.barh(color={\"Predicted\": \"orange\", \"cases_per_100000\": \"darkblue\"},x=\"county\",figsize=(15,15), yticks=np.arange(0,4000,500))\n"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "0ae29213-1635-4a51-9593-20a59bf2a60e",
- "metadata": {},
- "source": [
- "## **Step 8) Let's try using our Summer 2020 tree model to predict 2021 data**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "fee74988-b620-4941-89f1-1d6ebb5ef0cc",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "#Run the command below to watch the video\n",
- "from IPython.display import YouTubeVideo\n",
- "\n",
- "YouTubeVideo('2r3ZpwM6xDQ', width=800, height=400)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "e6d838dc-e9fe-4721-b866-07068f5126b3",
- "metadata": {},
- "source": [
- "**A)** Let's load the features information for the same 18 counties, but this time for Summer 2021."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "3712f5eb-75ad-4f96-9106-9b51a19c0831",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# Importing Summer 2021 data to predict using \"Summer2020 Model\"\n",
- "S2021_testing_features = pd.read_csv(\"S2021_test_features.csv\")\n",
- "\n",
- "# Make predictions for Summer 2021 Data\n",
- "S2021_labels_pred = dtr_summer2020.predict(S2021_testing_features.drop(columns=[\"county\"]))"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "1c8d062d-42b2-419c-a703-a3a4537fdfb0",
- "metadata": {},
- "source": [
- "**B)** Let's now load the actual Summer 2021 data and see how our 2020 Decision Tree model performed this time."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "7212fb89-7ff6-488c-907e-0177ccacb6bf",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# Importing labels of Summer 2021 data to check accuracy of \"Summer2020 Model\" predicting Summer2021 Data\n",
- "S2021_testing_labels = pd.read_csv(\"S2021_test_labels.csv\")\n",
- "\n",
- "# This turns our predictions (which is currently an array) into a dataframe \n",
- "S2021_labels_preds = pd.DataFrame(S2021_labels_pred, columns=[\"Predicted\"])\n",
- "\n",
- "# This puts into a single dataframe our predictions with our original test labels \n",
- "pred_vs_test_2021 = pd.concat([S2021_testing_labels.reset_index(drop=True),S2021_labels_preds.reset_index(drop=True)],axis=1)\n",
- "\n",
- "# Visualize performance for Summer 2021 predictions\n",
- "pred_vs_test_plot = pred_vs_test_2021.plot.barh(color={\"Predicted\": \"orange\", \"cases_per_100000\": \"teal\"},x=\"county\",figsize=(15,15), yticks=np.arange(0,4000,500))"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "5cc98d6b-7118-4d3f-a144-a8b8e55d2cde",
- "metadata": {},
- "source": [
- "**C)** Another way to look at the difference in performance between predictions made by the model for 2020 vs 2021 data is to observe their difference in errors.\n",
- "\n",
- "We can see that for 2020 the histogram of errors (blue) is closer overall to 0, ranging from -500 to 500, whereas the histogram of errors for 2021 (orange) is much more spread out, ranging from -1000 to 2500."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "332a67fd-30ec-4acd-bba3-7ce9855a3786",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# Create columns holding error between actual rate vs. predicted rate\n",
- "pred_vs_test_2020['residual'] = pred_vs_test_2020['cases_per_100000'] - pred_vs_test_2020['Predicted']\n",
- "pred_vs_test_2021['residual'] = pred_vs_test_2021['cases_per_100000'] - pred_vs_test_2021['Predicted']\n",
- "\n",
- "# Plot errors on histogram\n",
- "plt.title('Cases per 100k Prediction Errors')\n",
- "plt.hist(pred_vs_test_2020['residual'], alpha=0.5, label='2020 data')\n",
- "plt.hist(pred_vs_test_2021['residual'], alpha=0.5, label='2021 data')\n",
- "plt.legend(loc='upper right')\n",
- "plt.show()"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "78323de1-ec64-43d3-a21d-cf93d97db995",
- "metadata": {},
- "source": [
- "**D)** A more formal way to calculate the performance for the model is to calculate the Root Mean Square Error (RMSE). Feel free to browse the **(Optional) Quant. Comparison of 2020 DT Model Performance for (2020 vs 2021) Data** for more details about this particular metric."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "374a7f45-8bea-4b97-9ba5-12eec94e34a5",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# This prints the RMSE value for the performance of the model using 2020 Data\n",
- "print(f\"RMSE on 2020 test set: {mean_squared_error(pred_vs_test_2020['cases_per_100000'], pred_vs_test_2020['Predicted'], squared=False)}\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "2a4fe185-6b1c-47e0-ba16-4a181d47661c",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# This prints the RMSE value for the performance of the model using 2021 Data\n",
- "print(f\"RMSE on 2021 test set: {mean_squared_error(pred_vs_test_2021['cases_per_100000'], pred_vs_test_2021['Predicted'], squared=False)}\")"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "6ef11332-1228-4fe5-97f0-a4fb5ae924bf",
- "metadata": {},
- "source": [
- "#### Please run the cell below to save a CSV copy of the actual and predicted values from our 2020 model for both years (2020 and 2021)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "e2ba12ba-dfb4-4b44-98c0-e2fdb0226a86",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# Let's save our comparison dataframes as CSV files for a more quantitative analysis\n",
- "# We will revisit these in the next notebook.\n",
- "pred_vs_test_2020.to_csv('Model2020pred_vs_test_2020.csv', encoding='utf-8',index=False)\n",
- "pred_vs_test_2021.to_csv('Model2020pred_vs_test_2021.csv', encoding='utf-8',index=False)"
- ]
- }
- ],
- "metadata": {
- "environment": {
- "kernel": "python3",
- "name": "common-cpu.m114",
- "type": "gcloud",
- "uri": "gcr.io/deeplearning-platform-release/base-cpu:m114"
- },
- "kernelspec": {
- "display_name": "Python 3 (Local)",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.10.13"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
diff --git a/3- Practice.ipynb b/3- Practice.ipynb
deleted file mode 100644
index 13a394b..0000000
--- a/3- Practice.ipynb
+++ /dev/null
@@ -1,203 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "id": "423e9402-a82b-4834-a3a0-931a2c685ba7",
- "metadata": {},
- "source": [
- "# **Let's make a NEW Decision Tree for Summer 2021 and improve our predictions!**\n",
- "\n",
- "In order to expedite the making of the NEW Decision Tree, we can skip a few steps, and only copy-paste the required lines of code.\n",
- "\n",
- "* You DON'T need to copy-paste the comments from the original code (the green text that is preceded by \"#\"). \n",
- "* Instead, follow the instructions written as comments in the exercise below to create a NEW Decision Tree for Summer 2021 data.\n",
- "\n",
- "### **Walkthrough Solution:**\n",
- "If you feel stuck on this exercise, feel free to follow the video walkthrough below by **Florentine**."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "536f933f-a925-4fe9-945b-87c48fb98ecb",
- "metadata": {},
- "outputs": [],
- "source": [
- "#Run the command below to watch the video\n",
- "from IPython.display import YouTubeVideo\n",
- "\n",
- "YouTubeVideo('eHI4wMjSGuU', width=800, height=400)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "b1e1a46d-6e6c-4386-a0ca-0fcb19009e26",
- "metadata": {},
- "source": [
- "## **1) Repeat Step 1 (Importing Necessary Packages)**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "ce366147-fc2c-416d-8f28-325bacbb28e3",
- "metadata": {},
- "outputs": [],
- "source": []
- },
- {
- "cell_type": "markdown",
- "id": "54218111-757c-4b75-80aa-2754ffd916df",
- "metadata": {},
- "source": [
- "## **2) Repeat Step 2A (Loading 2021 Training Data)**\n",
- "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021 every time you see it, including in the links!!** "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "5812834c-28ec-492e-bebf-0e5cc34d4270",
- "metadata": {},
- "outputs": [],
- "source": []
- },
- {
- "cell_type": "markdown",
- "id": "d98ca761-0f7c-4f5e-acf4-05331b993b22",
- "metadata": {},
- "source": [
- "## **3) Repeat Step 3A (Separate Training Data into LABEL and FEATURES)**\n",
- "SKIP:\n",
- "- Steps 3B and 3C, since these steps were only done to let you see what the labels look like once we separated them from our main training data.\n",
- "\n",
- "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "966fee45-4f0d-4f98-8819-a9543a6ab0bc",
- "metadata": {},
- "outputs": [],
- "source": []
- },
- {
- "cell_type": "markdown",
- "id": "26d25871-caa2-41b5-b800-6224932ec759",
- "metadata": {},
- "source": [
- "## **4) Repeat steps 4A and 4B (Create your Decision Tree and Train it!)**\n",
- "\n",
- "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "14883b4f-d547-4de2-b15e-1602f1c279ae",
- "metadata": {},
- "outputs": [],
- "source": []
- },
- {
- "cell_type": "markdown",
- "id": "d73e8261-7466-4845-b847-1088e2610828",
- "metadata": {},
- "source": [
- "## **5) Repeat step 5 (Visualize your 2021 Decision Tree)**\n",
- "\n",
- "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "b4eb80ad-bc43-4405-9dd4-c671933d6597",
- "metadata": {},
- "outputs": [],
- "source": []
- },
- {
- "cell_type": "markdown",
- "id": "f7c646d7-46c5-475c-a446-5decacb0f6df",
- "metadata": {},
- "source": [
- "## **6) Repeat steps 6A, 6B, 6C (Load Testing Data and make your Predictions)**\n",
- "\n",
- "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "8dcb9a83-f172-4194-b2aa-0edcc534a191",
- "metadata": {},
- "outputs": [],
- "source": []
- },
- {
- "cell_type": "markdown",
- "id": "fbefce65-467c-46d0-abde-d02662c1d071",
- "metadata": {},
- "source": [
- "## **7) Repeat steps 7A, 7B (Check the Accuracy of the Predictions of the new Model Created)**\n",
- "\n",
- "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "c3c6d248-22da-4ddf-b270-bdb5092863fd",
- "metadata": {},
- "outputs": [],
- "source": []
- },
- {
- "cell_type": "markdown",
- "id": "5b90d2e6-520b-4dd7-810d-85ae7586b009",
- "metadata": {},
- "source": [
- "## **8) Extra: (Calculate RMSE and create Aggregate error histograms)** \n",
- "\n",
- "Compare the performance of the model you just created in this practice session with that of the old model by calculating the RMSE for both and plotting a combined histogram of their errors."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "2d5b870e-9d3d-4a29-a237-0a7e0155d6c6",
- "metadata": {},
- "outputs": [],
- "source": []
- }
- ],
- "metadata": {
- "environment": {
- "kernel": "python3",
- "name": "common-cpu.m108",
- "type": "gcloud",
- "uri": "gcr.io/deeplearning-platform-release/base-cpu:m108"
- },
- "kernelspec": {
- "display_name": "Python 3",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.7.12"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
diff --git a/4- Practice - Answer Key.ipynb b/4- Practice - Answer Key.ipynb
deleted file mode 100644
index 96dc65e..0000000
--- a/4- Practice - Answer Key.ipynb
+++ /dev/null
@@ -1,367 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "id": "328b4b3b-e1b3-4490-8f76-0241ed6c3b5c",
- "metadata": {},
- "source": [
- "# **Let's make a NEW Decision Tree for Summer 2021 and improve our predictions!**\n",
- "\n",
- "In order to expedite the making of the NEW Decision Tree, we can skip a few steps, and only copy-paste the required lines of code.\n",
- "\n",
- "* You DON'T need to copy-paste the comments from the original code (The green text that is preceded by \"#\"). \n",
- "* Instead, follow the instructions written as comments in the exercise below to create a NEW Decision Tree for Summer 2021 data."
- ]
- },
- {
- "cell_type": "markdown",
- "id": "b8848232-eb1d-430b-b854-2077de1862fd",
- "metadata": {},
- "source": [
- "## **1) Repeat Step 1 (Importing Necessary Packages)**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "fa4b42e6-ac15-4f05-be80-43593f07b5d2",
- "metadata": {},
- "outputs": [],
- "source": [
- "# Data Wrangling Imports\n",
- "import pandas as pd\n",
- "import numpy as np\n",
- "\n",
- "# Machine Learning Models Imports\n",
- "from sklearn import tree\n",
- "from sklearn.tree import DecisionTreeRegressor\n",
- "\n",
- "# Model Evaluation Imports and Visualization\n",
- "from matplotlib import pyplot as plt\n",
- "!pip install graphviz\n",
- "import graphviz\n",
- "\n",
- "# Quantitative metrics of Model performance\n",
- "from sklearn.metrics import mean_squared_error"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "6d623983-10a3-4f42-bebd-4045e2c29e98",
- "metadata": {},
- "source": [
- "## **2) Repeat Step 2A (Loading 2021 Training Data)**\n",
- "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021 every time you see it, including in the links!!** "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "de7771a0-9d83-4d94-8976-9595b83de3a2",
- "metadata": {},
- "outputs": [],
- "source": [
- "# Copy-paste the code from Step 2A that will load our Summer 2021 training data\n",
- "S2021_training= pd.read_csv(\"S2021_training.csv\")"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "497c33d3-6611-476b-99e1-1e9309105962",
- "metadata": {},
- "source": [
- "## **3) Repeat Step 3A (Separate Training Data into LABEL and FEATURES)**\n",
- "SKIP:\n",
- "- Steps 3B and 3C, since these steps were only done to let you see what the labels look like once we separated them from our main training data.\n",
- "\n",
- "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "e12b600c-c282-4d2f-99b4-d3a6f78ab4fe",
- "metadata": {},
- "outputs": [],
- "source": [
- "# Copy-paste the code from Step 3A that separates the FEATURES & LABEL from the training data \n",
- "S2021_training_labels = S2021_training[\"cases_per_100000\"]\n",
- "S2021_training_features = S2021_training.drop(columns=[\"county\",\"cases_per_100000\"])"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "b44cea28-b4c6-4aa0-b82a-9f4c6edecf3d",
- "metadata": {},
- "source": [
- "## **4) Repeat steps 4A and 4B (Create your Decision Tree and Train it!)**\n",
- "\n",
- "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "b2ab294d-1f15-4adb-9e96-f3122fd146bb",
- "metadata": {},
- "outputs": [],
- "source": [
- "# Copy-paste the code from Step 4A that will allow us to create our NEW Decision Tree\n",
- "dtr_summer2021 = DecisionTreeRegressor(random_state = 1, max_depth= 3)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "85b2db25-ee63-4ab7-8452-fcb52a013160",
- "metadata": {},
- "outputs": [],
- "source": [
- "# Copy-paste the code from step 4B that will train our NEW Decision Tree\n",
- "dtr_summer2021 = dtr_summer2021.fit(S2021_training_features,S2021_training_labels)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "070c4b35-1119-4e9b-ae6d-bd00c1734f5b",
- "metadata": {},
- "source": [
- "## **5) Repeat step 5 (Visualize your 2021 Decision Tree)**\n",
- "\n",
- "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "f9e07c94-5d4d-45b5-96d2-625fe9693c82",
- "metadata": {},
- "outputs": [],
- "source": [
- "# Copy-paste the code from step 5 that will let you see the NEW 2021 Decision Tree\n",
- "dtr_summer2021_dot = tree.export_graphviz(dtr_summer2021, out_file=None, \n",
- " feature_names=S2021_training_features.columns, \n",
- " filled=False, rounded=True, impurity=False)\n",
- "\n",
- "# Draw graph\n",
- "dtr_graph = graphviz.Source(dtr_summer2021_dot, format=\"png\") \n",
- "dtr_graph"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "e23a2bbb-591c-47f6-ad24-c413ca634cba",
- "metadata": {},
- "source": [
- "## **6) Repeat steps 6A, 6B, 6C.1 (Load Testing Data and make your Predictions)**\n",
- "\n",
- "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "d4ce9399-0aa8-48d8-a540-3e8cbcdf7af6",
- "metadata": {},
- "outputs": [],
- "source": [
- "# Copy-paste the code from step 6A to load and see your Summer 2021 testing data\n",
- "S2021_testing_features = pd.read_csv(\"S2021_test_features.csv\")\n",
- "S2021_testing_features"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "541994e6-e8a8-4c89-b8a7-7baae502bb8a",
- "metadata": {},
- "outputs": [],
- "source": [
- "# Copy-paste the code from step 6B to drop the county out of the testing data and make your predictions!\n",
- "S2021_features_test_nocounty = S2021_testing_features.drop(columns=[\"county\"])\n",
- "S2021_labels_pred = dtr_summer2021.predict(S2021_features_test_nocounty)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "66fceb09-7728-48ef-912f-4d6161d96f61",
- "metadata": {},
- "outputs": [],
- "source": [
- "# Copy-paste the code from step 6C.1 to look at the labels that our new model has predicted\n",
- "S2021_labels_preds_df = pd.DataFrame(S2021_labels_pred, columns=[\"Predicted\"])\n",
- "S2021_labels_preds_df = pd.concat([S2021_testing_features[\"county\"].reset_index(drop=True),S2021_labels_preds_df.reset_index(drop=True)],axis=1)\n",
- "S2021_labels_preds_df.round(3)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "188fd7b3-ad8c-4aae-a835-7dbb2be8bfe8",
- "metadata": {},
- "source": [
- "## **7) Repeat steps 7A, 7B (Check the Accuracy of the Predictions of the new Model Created)**\n",
- "\n",
- "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "3e355d76-dabe-489a-8c81-0d839235d183",
- "metadata": {},
- "outputs": [],
- "source": [
- "# Copy-paste the code from Step 7A to load our ACTUAL 2021 labels and drop the county since it's not part of the labels per se\n",
- "S2021_testing_labels = pd.read_csv(\"S2021_test_labels.csv\")\n",
- "S2021_testing_labels = S2021_testing_labels.drop(columns=[\"county\"])"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "ffd2bed9-c1bd-4c30-b869-cd8cf3c18870",
- "metadata": {},
- "outputs": [],
- "source": [
- "# Copy-paste the code from Step 7B to make a bar graph and inspect the Accuracy of your new 2021 Decision Tree model\n",
- "pred_vs_test_2021 = pd.concat([S2021_testing_labels.reset_index(drop=True),S2021_labels_preds_df.reset_index(drop=True)],axis=1)\n",
- "pred_vs_test_2021 = pred_vs_test_2021.loc[:,[\"county\", \"cases_per_100000\",\"Predicted\"]]\n",
- "pred_vs_test_plot = pred_vs_test_2021.plot.barh(color={\"Predicted\": \"hotpink\", \"cases_per_100000\": \"teal\"},x=\"county\",figsize=(15,15), yticks=np.arange(0,4000,500))"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "4f751ca0-3db5-4f55-8acd-74afe0be5d9b",
- "metadata": {},
- "source": [
- "### **Walkthrough Solution:**\n",
- "If you feel stuck on this exercise, feel free to follow the video walkthrough below by **Florentine**."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "951a16d6-d07a-4a83-8e6c-61f256ace1d7",
- "metadata": {},
- "outputs": [],
- "source": [
- "#Run the command below to watch the video\n",
- "from IPython.display import YouTubeVideo\n",
- "\n",
- "YouTubeVideo('eHI4wMjSGuU', width=800, height=400)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "45d7f1cf-2f56-4a9f-a3e2-5b1923c47066",
- "metadata": {
- "tags": []
- },
- "source": [
- "## **8) Extra: (Calculate RMSE and create Aggregate errors histograms)** \n",
- "\n",
- "Compare the performance of the model you just created in this practice session with that of the old model by calculating the RMSE for both and plotting a combined histogram of their errors."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "623e4e38-9baf-416b-92d1-ffa1312fb20a",
- "metadata": {},
- "outputs": [],
- "source": [
- "# Creating residual for our new 2021 model\n",
- "pred_vs_test_2021['residual'] = pred_vs_test_2021['cases_per_100000'] - pred_vs_test_2021['Predicted']\n",
- "\n",
- "# Take a look at the new model's results, including the new residual column\n",
- "New_model = pred_vs_test_2021\n",
- "New_model"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "745a6b98-f86f-4e3f-908e-b5565871c6c2",
- "metadata": {},
- "outputs": [],
- "source": [
- "# Load the old (2020) model's predictions on the 2021 test data\n",
- "Old_model = pd.read_csv(\"Model2020pred_vs_test_2021.csv\")\n",
- "Old_model"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "da31b101-37f8-400b-ac4e-631d6b6af428",
- "metadata": {},
- "outputs": [],
- "source": [
- "# Plot histogram of error aggregates for both the old and new model\n",
- "plt.title('Cases per 100k Prediction Errors')\n",
- "plt.hist(New_model['residual'], alpha=0.5, label='Model 2021')\n",
- "plt.hist(Old_model['residual'], alpha=0.5, label='Model 2020')\n",
- "plt.legend(loc='upper right')\n",
- "plt.show()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "82bf9a78-4062-4640-8561-76c623ede7bf",
- "metadata": {},
- "outputs": [],
- "source": [
- "# This calculates the RMSE for Model 2020 (OLD MODEL)\n",
- "print(f\"RMSE for Model 2020: {mean_squared_error(Old_model['cases_per_100000'], Old_model['Predicted'], squared=False)}\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "abeda9a5-5fa0-4544-8586-14ec531448ad",
- "metadata": {},
- "outputs": [],
- "source": [
- "# This calculates the RMSE for Model 2021 (NEW MODEL)\n",
- "print(f\"RMSE for Model 2021: {mean_squared_error(New_model['cases_per_100000'], New_model['Predicted'], squared=False)}\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "f5d9a3cf-c8c2-4fc0-acfd-f4c1aa7705d2",
- "metadata": {},
- "outputs": [],
- "source": []
- }
- ],
- "metadata": {
- "environment": {
- "kernel": "python3",
- "name": "common-cpu.m108",
- "type": "gcloud",
- "uri": "gcr.io/deeplearning-platform-release/base-cpu:m108"
- },
- "kernelspec": {
- "display_name": "Python 3",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.7.12"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
diff --git a/AWS/1- Intro to Machine Learning Decision Trees.ipynb b/AWS/1- Intro to Machine Learning Decision Trees.ipynb
new file mode 100644
index 0000000..06a29cc
--- /dev/null
+++ b/AWS/1- Intro to Machine Learning Decision Trees.ipynb
@@ -0,0 +1,1398 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "85be35e8-c134-4ba4-ac3d-fe95bc106ff4",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "# **Introduction to Machine Learning: Decision Trees**\n",
+ "\n",
+ "# Overview\n",
+ "***Introduction Video***\n",
+ "\n",
+ "There are many machine learning techniques, but this notebook aims to give you a basic understanding of one of the fundamental ones: the Decision Tree. This is a good starting point because Decision Trees are the basis for more complex models such as Boosted Trees and Random Forests. Below is a general introduction video to machine learning by Lorena Benitez.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "bfb2d237-44af-43c8-8968-eaa340d7ecc3",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDBoYFhsaGRoeHRsfHx8gICAgICUlJR8lLicxMC0nLS01PVBCNThLOS0tRWFFS1NWW1xbMkFlbWRYbFBZW1cBERISGRYZLxsbMFc2NT1XV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXXVdXV1dXV11XV//AABEIAWgB4AMBIgACEQEDEQH/xAAbAAEAAgMBAQAAAAAAAAAAAAAAAQIDBAUGB//EAEoQAAIBAgMFAgsFBAkDBAMBAAABAgMRBBIhBTFBUZFh0QYTFBUiMlJxgaGxQlNyksEHFjPSIzRUYpOisuHwJEOCF3OD8WN04jX/xAAYAQEBAQEBAAAAAAAAAAAAAAAAAQIDBP/EACMRAQACAQUBAQADAQEAAAAAAAABAhEDEhMhMVFBImFxoTL/2gAMAwEAAhEDEQA/APn4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAOlDYdWTsnDq+4mpsOrF2cqfwb7jey3xndDmA6Pmar7UOr7iY7Eqt2zQ+LfcOO3w3Q5oOvPwcrxV702uxvuMPmar7UOr7hx2+G+v1zgdHzNV9qHV9xkWwKzV1Km/i+4bLfDfDlA6T2LV9qHV9xHmar7UOr7ibLfDdDnA6Pmar7UOr7h5mq+1Dq+4bLfDdDnA6Pmar7UOr7h5lq+1Dq+4bLfDdDnA6PmWr7UOr7jKvB6ta7lTS7W+4uy3w31+uSDqeYqtr5qfWXcV8yVfah1fcTZb4b6/XNB1I7ArPjDq+4S2FVX2qfV9xdlvhvr9csHUjsKq/tQ6y7iHsOr7UOr7hst8N9frmA6XmSr7UOr7ifMdX2odX3DZb4b6/XMB114OVrXzU7e+XcR+71b2qfWXcNlviclfrkg6fmOr7UOr7h5jq+1Dq+4my3xd9frmA6fmKr7UOr7iXsGr7VPrLuLst8TfX65YOp5iq+1Dq+4eYqvtQ6vuGy3xd9frlg6fmOr7UOr7jI/BytlzZqdvfLuJst8TfX65AOpHYNV/ah1fcR5jq+1Dq+4bLfF31+uYDprYdX2odX3EvYVX2qfWXcNlvhvr9csHTWwqrfrQ6vuI8yVfah1fcNlvhvr9c0HUjsGq/tU+su4S2DWW9w6vuGy3w31+uWDpeZKvtQ6vuJjsOq361PrLuGy3w3w5gOtU8Hq0dc1NrmnLuMXmWr7UOr7i7LfDfX65wOj5lq+1Dq+4eZqvtQ6vuJst8N0OcDqz2BWSTzU9e2XcUWxavtQ6vuLst8N9frmg7n7rVvvKP5p/ylf3ZrfeUfzT/AJRst8Tkr9cUHa/dqr95R/NL+Uj926v3lHrL+UcdvhyU+uMDsfu7V+8pdZfykfu9V+8pdZfyjjt8Tlp9cgHX/d6r95S6y/lH7u1fvKXWX8o47fDlp9cgHY/dyr95S6y/lIl4O1Um/GUnbtl/KOO3w5afXoV/RQ/vP5I1b3E5uTuyD2RDjEJCMGJm405tb1FtdDiedK3tfJGLakRPbUVmXp6OIlB9nIzOnCrrH0ZcjyXnSt7fyQW1K3t/JGeaF4pekqQcXZoU5tO6PPy23iGrOaf/AIx7ii2rW9v/ACoc1Tjl6mSU9d0vqa8k07M8951r+3/lXcTLbFd75/5Y9xJ1anHLvg8950re38kPOdb2/kicsLxy9EWhTcnZI8350re38kXjtnEJWU0v/GPcOWE45ep9Gn/el8kYZ1HJ6nmvOtb2/wDKh51re3/lQ5YOKXp4QbWiJtGPazzfnzEWtnX5Y9xj87V/b/yocsJxS9NKbZU8352r+3/lQ861/b/yocsYXjl6inF70UkrM84
ts4hfb/yx7ju0arlShJu8nGLb7bG4vFuoYtSa9yzAxZ2M7Lhlv4e7g0uBjk2pK5rQryjudhKvJ738jWemcdss1qyDE6rfEjOzPbTZpK8kZqy9HdxNFVHzLOtJ72WJ6YmO1yY7zDnZKmzMS0yveblaMlBLS1tTnKb5mSWJm+PyRqJSYZYtKLdjHZczH4x7iMxMqyZGQVUmS6jHSrRlYtaL7H8jDcXESMkqbW8mNRrR6rtKRqNcSHIdDNkUvV38i2Gp+k771wNeNRp3W8sqrve+pYlJ7jDaq1HFJ896MTpxmrx0fIwzqOW93Kp21RZlIjEEk07MJX0LTqOW8rGTTuiN5Z6tOeVJ6pciYwjTWaesuETFHEzW5/JGOUm3dv4lzDERM9NieOir6SbXC2pV46Fru63KzTvqYlgoWbdVcE2mtGT4ulDKrTk777SsuPI8/JZ0jSqzzqxSvq3yUZXXv0NZYtS9WLe+65F51JrRUmuSyt2fx+BjlVq2u8yfFJLdzatfeZ5rNcVfjawWHeJqQp07wcm7ylH1Ulduxs7S2YsM1ecqkLpN2jmV9ztpdGbwTSqVWp5lenPi0/WXPUzeEDhnqX9mKWr3rRfOxztq3znLrGhTbjDj1Zwg9VVtweSOr5L0iFVhpanX+NNafM16dRVKmSo27tOKvfK7dGi8qklNxkoyssvquTS7bWR05pco0afG/TpRlFSSkr+1ZPoSqdNJtu6W8ihGnKEZNRUlG7SSsmr2W7d9DJKCUPSSbSunHt7Sxq2mVnRp8coNkMg9U/GGLGfwqn4X9DzR6TGfwqn4X9DzZ5tX120/AAHF0AAAAAAAAAAAAAAAAAAAPUYX+BT/AAR+h5c9Rhf4NP8ABH6HXS9ctXxcAHd5wAAACUAAJQCwJAUBJNgKk2JsTYCthYsLBEWFi1iLBUWFibCwFbAsGgKkFrCwRUE2IsFQCQgM0VJJZd7k3msmlwb5vh1Ili21HT0U7Sb3z013bjPGhB+lHetG4y0VuCd+HaY1CTs4OHDMk9FrdW4XueLL2Jlh7u1rJqze+75Liv8AYipVyQUnFNtLsulez92u46mG8WnmrX8Ve+izLdfle2ljhbZxUZVP6NNRulFPfaxYjLdYzPbY2VVqSxFGUE7qaTt7Lumb3hph5vLNu0cy+n/2ZvAbDXqVakvsxSXvk3+iRteHithqfbN/6WZmvbc4zh5DDSpzyxcFF3XpJyu0t6S5v9TryqNyjGnBU8ylms8yl2rfbl3Hlo1ry6HbwNTxtJQd7Rlo4R1S5t7muaLPTnaHQwlaybcXFLN6EsrceG61nvZNao9Y5W1LTO9Errl3XNrC7Pp1YRhKtOMtHlipWtwatw4Fa2zbb56pW36u/O/N3ZaxmYYmenDAB7nmYcZ/Cn+F/Q82ekxn8Kf4X9DzZ59b2HbT8AAcXQAAGXC4adapClTWac2oxXNs+i4D9nWHVNePqTnUa1yNRiuxaanB/Zvh1PHyk1fJSk1720r9Gz6kB872t+zqcbywtXOvYqaP4SWj+R5WvsLGU5ZZ4asn2Qk11Wh9uAHxKnsHGS3YWv8AGnJfVGSXgzjkr+S1fhG59pAHwfEYSrSdqtOdN8pxcfqYT73WowqRcZxjKL3qSTR4/b/gDSqpzwn9FU35G/Ql7vZ+gHzQGXE4adGpKnUi4Ti7OL3oxAAAAPVYa3k9Lnkj9Dyp6vDpLD0ueSP0Oul646vkAAO7gAEgCUQSAJMlOhOSbjCUkt7UW0iiC4LEpAlARYmxNiQIsTY0qm0IrOrWlHRX4sUtoxajf1pOzSW4mYXbLdsLEgrJYWJFgqLCxNi1OnKclGKbk9ElxApYix0K+x8RTScqbtp6tpfQ0ZJptNWadmnwZImJMSpYWLE5PgUY2iLF3EiwFLCK1Rv4LAKazSbtwS4kY7AKEc0b24p8DjOvTds/XaNC+zfjphxNe6UYrc5rRNS36Zl8EU8qlplcY2i9Prx195WGInHSM5K++zauJYi
o99ST4es9xY04iF5YYpbRnWjKDm4UlZtRvHxm9Jb9VdbzmTquda992vvZ1ZzcvWd9La8uRh8TFbopfBEjTw1GvEfjY2LtGMJVI+lmdnZNq6sT4U4xunSvfTMrOTZrwgottJJve1xKzoxlvinx1SNbE5u8uROhFUqVWM03NzjKOl4NPTqjv4KblRioycLJZ16KVr2evbo7GtHDwW6EeiNudNeLU4JJrSVufBmeGZ/UnWj43aUqqtN7016XH39i32T5G3W2jUyScmnePLLp28+ZxaWIqJ6Slrv13kVK83dOTtyuWNKY7JvE9MNwiGSd4cmHGfwp/hf0POHosZ/Cn+F/Q86efW9dtPwABxdAAAet/ZvTqPHuUF6CpyVR8k2rfG6+p9SPPeA+zFh8BTbVp1f6SXPX1V0sehAA8p4ReG1HCSdKilWqrR6+hB8m+L7EeOreGu0Kk1asqavuhGKXzuB9cIPk9Dw3x9KbvUVRX3Tit1+yx7Xwc8MKGNapyXiq3st6S/C/0A9IQSAPN+GPg3HG0XUgrYimm4tfbXsP9D5K007NWa3o+/Hyfw/2WsPjnOKtCss6S4S3S+evxA8wAAB6rDfwKX4I/Q8qerofwKP4I/Q66frjreQAA7uASgSgCNrBYGpXllpxbtbM9EkvezVPQeDW0qVKM4TkotyzJvc9LWv8PmZtOIarGZerw9GNOChBWjFWSRr43ZVCtG0oJWu046NN72YPO9D76nvS9Zb3uRk84Q9uPU83b09PH43B1KFSUJRurvK3H1lfemjXdPik/dyPUbapwxVOEc+W0sycZW3Jp/U5VLwaztqNWo2t9qj7jtGo5TRzMr5Poam0m1DRuMk00lvep6P905e3V/xH3EfunL26v+I+4byKPLbP2Z41OpUzNNuyWjfNtkbQ2d4m1Snmtfc9XF8GmeoqeC0optzq2X/5H3GqtjRf/cq/nJHazOJc3Z2eULzd23ddi7Tb0W7qbGz9nN4irSTlKMfFNuUtVGWbNZ/BHVqbCjlmoSWZyTi39mPJmouxNc+ODcWMuJo+LqShdSyu11xMaOjCD0/g/sqVF+NqWvKNkuMb8zzB6Sl4Q08qzKSlbWyurmNTOOm6Yz29CcvbOyI16bcIxVXenuvzTZ5naEvK8XN+MqKCp03FRllte99DGtkRf/drfnOUVn11tavksUqE4TcZRaadmrFZQa4PobfmKP39Tj/3Hw38DPHwWlJJqpVae5+Mfcb3sbIcxxfJ9CuR8n0Ov+6c/bq/4n+w/dOft1f8T/Yu842LBT/o8uqaa6NjH1LUnHVuTt0er+Rr7R2HVw8b051XPgs+9cTDR2bUlSlUrVKqmlJpZ+C3Hl4c6m56ufGls/fGoC61jGTd24Qb97gm2yLHth4JUDJKlQIJIAkz4aolJxfqyVmYCr3moMZbGTK5R4pmKrvNip6cYz4+rI1JbzU+Fe1SbNmVTg98be4yU4Rd7S6khqZc/F/wp/hf0POnp8dQkqVR71llu9x5g82t676fgADi6BubIwTxGKo0V9ucU/dfX5XNM9n+zbZjniZYlr0KUXFPnOXcr9UB9KhFRSS0SSSXYeY8OtvPCYdU6TtWq3Sa3wjxfv4HqT5F4e4p1NpVU91NRgvda/1bA862dDYOBliMZRpRV7zi3+FO8n0OebeytpVMJXjWpWzxvvV009GmBueFeAeGx9aDjaMpOcOTjJ6fqvgcmEnFpptNNNNb0+Z6rZLe2tpp4v1VTby0/RSUdy912bXhv4K0sNCFbCwajqqkb3stLS11426Aer8DtueW4VOb/pqfoVO3lL4953z5b+zfFOGPlT4VKclbtjqv1PqQA8Z+03D5sJRqW1hVtfslF/qkezPLftF//wA1/wDu0/1A+UgAAesoL/p6P4I/Q8meso/1ej+CP0OmnPbhreQgAHocUkoglACUQSgq79Rf+7R/1HdOE/U/+Wj/AKzvGP2Wp8hv4NLxa05mxHTdp7jXwl/FaWvra+6
5r6v+sOS7FpT6r9ThaO3eLYrDfq4hxSbctWkrPiyynJ8X1MOROdOOlopz03brL6voY6mRSfi3LPfVQ1V+2+hMLluSvZ3bej4nLnOMVeTUV2tJfM6NNyyenbNZ7jze1KUZ4jDxmrxy1dOhunTF+8NjC4lQrYuanBZqVJQeeOrWa9te0wYbEZKsJqpFarN6cdVfW+o810Pul8yfNdD7qPzN7Zc91W1tutCcouFSi4pP1ZxvfjfU5el/Wj+ePebPmuh91H5jzXQ+6j8yxuhJmstZ29qP5o9405x/NHvNnzXQ+6j8wtlUPul8y5sfxYsJJKtNuUUnTppPNHer3W86VN3cbaq61W7easdj0ONKPzI2PFKDS3KvUSXJKW4ncdSs4nuHVnibNrI+PAt5ZK2ikuCdnZfC5r1Kksz14vl3j+k/5b+Y82J+f9ejdX7/AMZZY6ok25aLe8su8wx2vUfqqpJe1GlJp+55tSs6bq1Mn2YqLadvSk9y37lZv4o3PJZPfZe9J/qWIn9SbR+dscsSqitmvJb04uMku1PUwYn+HP8ADL6GfaOFtBOGk43cXz5p9jNapUUqMpLdKm2vjE718crevLUsUs0adm3lp68EvFRNg5+HpzliEqe9wo6au68XDRJHpKOwqs4xlpaUXL3W+y1zZqs9dpevfTlSKsz4nDzptRnFxeVOz7TCzbmggkgIGdYSUlmbUVzbMBWUm97NQYn8b0XThCUc2ZvkjRlvL09xaVCT3Jmp7grGJYWrGSEFbUyLF+1FMyQqwa9S3uEYkmZc/GtqnUV/sy+h5s9Vj403SqtSd8ktH7jyp5tb16NLwABwdXS2DsWrjq6pU1aK1nPhCPN9vJH2LZmz6eFoxo0laMV8W+LfaeI/ZxtijTU8LO0Kk55oSf29Est+enzPoIEnx3w2puO08Rdb3GS9zij7EeC/aVsdyUMXBXyrJUtwV/Rl87dAPE7IrU6eKozqwz04zi5R5q53PDvYsMLiI1KMVGjWjdJboyW9L5M8uer8Hdu4zE4ijhpyp1oNpWrU4zyxS1a43sgOd4JvFLGQ8k/iWaldXiove5dh9X2jg/HYarTdnOdGUL24tP8AUx4nDvDQc8JhqcpNpzirU3KK5aWb7GbWKbdCb1TdOXvXogfMP2e0W9px/uQqN9LfqfVzxv7OdkOlQliZq0q1sv4Fx+LPZADyH7S6jWBglulWjf4RZ68pVpRnFxnFSi9GmrpgfAweh8NNhRwWKXi9KVVOUF7LW+P06nngB62l/VqH4F9DyR66n/VqH4I/Q3T1w1vIUAB6HFJKIJRRgqa1YRdRU4u95N6L3mCDbjUfjrOGW0W9al3bT3bzfUU76cDe2bgYVnJSussM2lt97cjFo/XWs/jg1JyUINVszlduKesHF6X+o84Vvvp/mZ7OHg5Rkms01aS19HhryNpbEh7b/JS/lOWXXDxuzNpV3XpRdaplc1dZnZntKcm2rSbV7NX/AEIWxorVVJJ/gp/ymo8NUlOpTpzmpq2WU4w8W21felc5XpumJiXSurFIxMOtGmlJy4tJe5IucXzPtL7+h+WXcPM+0vv6H5X3Gume3YnufuZ5zH/1rD/hq/obfmfaX39D8r7jQxWFr0sVh44icJycarTgmla0ew3WYYtE+s2Pm40Ksk7NQk0+TsYqE4UYU4+lKU1msryk9Fd68CNr0VKhOTveEZSWul0r6rc93EtaeaFWMc96ajJJpNbmmr6HZwjxkqYyMbK023HNlUW2o82uAeMh6OXNPMsyyK/o8yFSqKo6qhfNCMXHMrxabfTX5E4fCyo5Msc/9GoOzSs02768NWMriDDYi9KEpXcpZrKK1dm+HusMY4ukqsbqUZRV7tW9JJprvIjhJqNNuLbipqUYzyv0pXunfsLYmGTDTSjlu1dN5nrJJ3epMrEQ3zn7JXoy/wD2Kv8AqN2lSUEoq9lzbb6s0Nm4DFVYTdCpThFV6yamndvN7mZtOFrGXZlRnd6/KPc
T4ie7N8o9xo+Z9o/f0Okv5R5n2j9/Q6S/lOOIdmxh6Mo16sVKzahLctVaz4c18zb8VU9v6dxynsPaDlGTrULxvZrMtHvXqmtPD7Si2n8Gqbkn7nFDEfVz/TtYp5IJyd7XbfYcuEWsLZ7/ABT/ANJqYqli4Um8TNLNpSpqKc6k96TXLmbNDZuOr0lKNfDuMk07a24NXStdbjcTEMWrMzloeDuCpvE06jz51TpuLjuv4uO/stp8TobR2nXVeSjJxUZWS4ac+ZoxU8JXVNy1gqcJuHFKMbpX9x2Hj8HVcalSKUnJx14JbpSXI1j9YmWv4QU81KnW8X6UlHNK+7TSNvj8jz7OltfHqvJWWXLdes2pfA5rOlfHO3qCCSDTIbEZUYpXUpMwIyRwNWeqjp2tFhcZZo4xL1acV79RiK0p01K9rOzsWp7KqcXFfEzw2daMoua1tuW43lnZ3049jYp7katzLRimmImGrMOO9Sp+GX0PNnocY/6Kf4WeePLreu+l4AA4urZ2b4vyil45tU88c7W9K+p9zpTjKKlFpxaTTWqaPgtJxUouSbjdXS0bXFH2zYGLoVsLTlhk40kssYtWy20sB0SlWlGcZQmlKMk009zXIuQB8y8I/AarRk6mETq0nd5Ptw7O1fMt+zzCeLxGIxFZOEaNNp5lbK29Xr2J9T6YUnTjJNSimnvTSafvA5Pg5tWniaCkqsZTlKcnByWaF5NqNuxWOxJJpp6p6Mx0sPCHqQjH8MUvoZQIjFJJJWS0S5Eg834Z+EKwWHyU5WxFTSFt8FxkB6QHlNheHGGr04rETVGslaWbSEnzT4fE3toeF2BoQcvHxqPhGk1JvpoviB579qNWOXDQ+3ecvhoj58dHbu16mOxEq1TThGPCEVuRzgB66mv+mofgj9EeRPX0/wCq4f8ABH6G6euGv+MYAPQ4pJRUlFF4vnuZenVlDc5RfOL4dxiLRk1/urhXd8HcalWkqlSTTi7Z5aXuj03j6Xtw/NE+fqXC/RENHO2nmXSt8PoPj6Xtw/NEtCVN+q4O3Jp2Pnlj1Pg7s6UIVHUi4uTja/FJf7mJ08frUXzLpYjBueIo1lWnFU1K9NP0al1x9xtyqRSu5JLm2kYvJI82am1MBnw9SMbuVk0udmmc3XLd8pp/eQ/MjzfhDNSx2FcWmvF1dzT5HGnTcZOMlZp2a5M6GztieVQzqrKlKEpRWVLW6idNm3tz37umxKlmTTV0000+KJw2EjSjljmt/ek5W7NeBl/dWf8Aba//AD4k/urP+21/+fEvJCcc/QD91p/26v8A8+JVeDErteXVrrerq6+Y5DjlcxVsPGbTld24XdviuJf91Z/22v8A8+I/dWf9tr/8+I3nHK5fwWqxjRrZpJf9TW3tLijF+60/7bX/AOfE5m0dmrCuNPO53vNyktfSet+hM75wuNnb2XlEPbh+ZF1Nc11PBUMNKpJRhFye+y5cz2tHBxUIp30il8jNq7Wq33MezsK6KqKVeVXPUlNZ36if2V2GzKvBb5xXvkkU8ljzZ5/wkwMlJVIxbgoWk+Vnp9SRGZatOIy7OPhRxFGdOVSKzJpSUleLfFHlPB7aUsFiZYas14uUrXTvGMuEk+T7jSW5mvi6OZXW9fNG503ONR0tsVYzxVWcXeLkrNcbJL9DUU+1r3bjXw0m4JtO25PnYyHWvjlb1MmVYDKyggkgqB2sPN+Lj7kcU6uHfoR9yDdWwpPiRxMdybh0cAyUb62VzHJ6mzSforXcbj1xtPTRxn8Of4WefPRY93hU/Czzp5tb130vAAHF1D694COPmujlto539+dnyE2cHtGvQd6NWdO+/LJq/wAAPuwPjlLww2hDdiZP8Si/qj3Pgh4SVsXKdHE03GpFZoyUHFSXFPkwPVAAAAAPHeFHht5LVnh6FNSqxtecn6MW1fRcd6PnGNxlTEVJVasnOct7Z6Hw52NVo4meJlZ061SWVp6rTc/+cDy4AAAAAAPXU/6rQ/BH6I8
ievp/1XD/AII/RG6evPr/AJ/rGAD0OQSQSUSTHeVRIF4xfBF5P6sop80n1IcgrNSozn6kZStvsmzJ5JW9ip0Zt7Ox1ONLxc6k6es75Y3U80bK/G6OEti0v7TF/wDwzMTM/G4iPrp+R1vu6nRmzgNk1asrSzU4re3fojh+ZqX9oj/gzPS+CVKnRjUpRqKcpSz+q46WStrv3fMzNp+NRWPqMdsyjTTUa6zrVxm1r3Gfwf2lSownCpLLeWZOzaeiVtPca20dk1fGVKmjhdyzOSWm85Vy4zDPkvbee8N96uku4ee8N96uku48TcvSWaSjdK7Su3ZL3k4oa5Ze7oY2nUjmhJNbr6mvRwlCGIqYiN/GVFFSd3ay7PgauzJ4elQhCdalmV72mrats2vLML97T/OjlMdusW6Za+0KNO2eeW97aMw+e8N96uku45+2/EVqcfF1qWaMr6zWqt/9Hmrm60zDFr4l7bz1hvvVz3S7jzu3sbCvWTpu8Yxy3ta7u3oc5O9rb1u7ewe6Lvy4HSKYnLE3mYwRTe5O9+G8tUvGV2nrrb9C+HxkqLvCzb9a6umZXtis27tJPlGOnu0LOWIactIyl9mO+XBe81442m1fOvi7G/i8dOrRqUpyi4zVr5UpRs76WOKtlQS1nJvssl8yd/GumxDG05ZrP1bN6cLpL5yMzNKjgMqqJyvmUUtN1pxl+huNljP6k4/Bvd2EXBDNMlyAAIAAQOpQ9SPuRyzqUX6Efcg3RcXCZDH46OCZ6MU077zAZaKlq0jcONvGti/4c/ws4B38X/Dn+F/Q4B5tb130/AAHF1AAB1vBfZnleOpUmvRTzz/DHV/ovifaErHiv2abMyUKmJktajyR/DHf1f0PbAAAAAAHM8IdlrGYSpRfrNXg+U1qv+dp8VnBxbi1ZptNPg0ffT5X+0HZHiMX46K/o6/pe6a9Zfr8WB5QAAAAAPX0/wCq4f8AAvojyB6+kv8ApaH4I/Q3T159f8/1jAB6HMAAyJFyCSibggAWBAAsTGTTum01uadmioA2KuLqTVp1JyXJybRjuUFyC9xcrcXKi1wVuLjKrArcXGUWYzPdd2K3FxlUi5BFxkS2RcgDKFyABlS5ABAIJIAAAuUDoUn6Efcjnm3SrwypZldLjoHSjZiybmOM09zT+KLah0cSb1ZtUZLKtd281DZpUo5U3rc6V9ee/jUx7vGpb2X9Dz56DHRtCov7rPPnl1vXo0vAAHF1b+Bq4SK/p6NWb5wqKK/0s6NPF7J+1hMR/jJ9x58AfRsF4fYKhShSp4evGEEoxXoOy9+Y2F+0fCfdV+kP5j5iAPqC/aLg/u635Y95P/qHgvZrflXefLgB9S/9Q8F7Nb8q7x/6h4L2a35V3ny0AfUX+0TB+xW/LHvNDanhls3F01Tr4evOKeZL0VZ87qR89AHqp4/Yj3YPEJ/j/wD7NSri9lfZwmI/xkv0ZwABu46rhpfwKVSH46il+iNIAAeupf1ah+CP0PInrqX9Wofgj9DdPXn1/wA/1QAHpcwAAAAESCABJJAAm4uQAqwIuLgSLkXFwibi5AYVNxc2cfh4U5pU6iqRcVK64cGm9z1T3GsSJyTGC4IBRIuQLgTci5ACJIAAEAAAAAAAUMEt5nKvxb4tGogzhhJUnwbMni4PdPqiY0Vdekma2yboQ8PzaRkhUjFWcrmviPWb56lEieSYzHa+KyyhNK7k00r8zkU9l1JOyy9TrSozyt5Xom7nP8dJbpNfE46uM/yddPOOmCWzKidvR6lobKqy3Zepk8dL2n1LRrzW6Ul8Wcc0dMWYZbKqp2eXqXpbFrT3Zfi33GTymfty6sjyifty6suaf2n8/wClamxKsd7gv/L/AGL4fwfr1E3HJZcW33EPETe+cn8WXhjKsVZVJpclJjNFiLfq78F8Tzp/mfcR+7OI50/zPuI8vrfe1PzMjy+r97P8zJ/Bf5LPwZxHOn+Z9xH7tYjnT/M+4jy6t97P8zHl1b72f5m
P4naf3bxHOH5n3Efu5X5w/M+4eW1fvZ/mY8tq/ez/ADMZqdp/drEc6f5n3FP3fr84fmfcW8tq/ez/ADMmWIrL1p1FdXV3JXXNdgzQ7V/d2vv9D83+xXzDW5w/M+4v5ZV+8n+Zjyyp95P8zE7fxJ3/AIQ8HMQ92T8z7j0EMDNUacHa8YpPXsOB5dW+9n+ZnptnzcqFNtttxV297N0w53rM/wDpr+b6nZ1Hm6p2dTpxLNnVzw5Xm6p2dR5uqdnU6qYuZ3Sy5Xm6p2dQ9nVFy6nWMVSd/cSb4hi99sOYsFPs6k+Qz7OpvkszF7S48tnO8hn2dR5DPs6nRJLFplOWzm+RT7OpPkM+zqb5LJF5Xls0FgKnZ1IeBn2dToKdiJNstbzLXN05/kcuzqPJJdnU3LkXEXnPbPLZq+RT7OpHkc+zqb1OfBkyZvL01mJjLnrDS7Oo8ml2GzCe/wB5MJekixOWqxmMtKdNreVsZajzSfvKqSb3By3Sqo+5e8Om99tCa1nZJ9ScsoxVr/A1hcsLYzFpVeaT+pX0Xzj80XBEylalcyLNZU3dO+mhhExhYnLNwvwIi7uwr6WjyXzK0tLsTBHaXJEeMRjbLRoyauloMNLKa5mSML7pR6ms6bXAqTBhvPDS5J+5mvUw87+qzFGbW5tGSOLqL7T+JrEJiWNxa4MqbSx0uKi/ei7rwlFtwWnI1jpcz+wr46kkvRzNcyHjrerFL4Gok724mRUJe73kzMpiP1NfFzlFpvSzNnwaoUp0sZ42ySjQtLIpuN6nBdu41KlOKi/Su7Pcc+FacVJRnKKlbMk2lK2qvzPNr+xl30sY6e02t4OUJ1MRV9ONo15WhZQTpwhl0tpvZTF+CuFjGqoeOzxjWytzTTlGMWtLf3rHlHtGu1JOvVtJtyXjJeldWd9ddCJbQrO969V3ve9SWt7J8eNl0ODs9TW8F8PBpyVRJUq8pxjUv6VNwslJxW9SfA1treD9ClSxLpKpKpRk280sqhDSzXo2mtXfW5wJ7Srz9avVejWtSW5pJrf2LoJ7RrSg4SrVJQk7uLnJpvtuwPSYjZ1KrTws60nGMcJhI6PL67leXqu9rbuLe81MNsCnKNHO5JyqU4yyyveE4TkpaxWV+itLy36nFjj6yUUq1RKMXGNpyVo8lZ7iFjaqUUqtRKPqrPK0fcr6AdF4ChLC+NiqkZOlUqq800lGsoZWsqvdPf2GxShSlgqKlGajGjia8lGcVnlCplV3lfP4I4Sryy5VOWWzjbM7Wbu1blfUKtNKylK1nG13azd2vc3qB3/MdHOo3qWjPJN5o/0l8PKrmhpprG3HRmptHZ9GNGE6WeLlOinnkpJKpSVTglu3dpzfK6norxk/QTUfTl6Kas0uWmhWVaUlllKTWmjba0Vl0WnuA9hU8FsJ4+NFTqKVpqTzJ2tFSVR3VktbW7UYY+DWHlTlZVoTy4pxlOSyw8VUyrMrcV2nm57RrySjKvVcVHLZ1JWy8t+42MTtutVw8KEpyyxz5nnk3VzO/p87Ad7E+DmCp1Jpzq2pUq06kVJOUvF5XmTcbK6b07UZ57CoVVGrWq1XDxOHUc0taakpNbo6paJL36nkqm0K80lKtUklFw1nJ+i98d+52XQU9o14O8a1SLyqGk5L0Vujv3ID09DwWw04Ukp1FOUcNOU8ycWqma6iraerp70YYbDwksPUrPxtHWpCKnK+Rwje7WXW/wANDznltWyXjalllss8rLL6vHhw5GSe1MRLPevVedWnepL0lyeoGmj1mzH/ANPS/Cjyh6bZs/6Cn+FHTT9Yv46EZkuRhuSpHWbPPLNGROYweNsys6jZM9ONtSIZKlS+nAi5juEzhMTLz2mbTmWQkpcXN4iKsLXJZjuTczXAlsZiobNLMDkTmMdwK9EQs2QUcib2LDeEMiUr8dSkplW7lhqOl1O109Cs6qWiIlPSzKuNtVqdHWLzEYUk7K3FkR01KrUrmu7IsEQnJcRlKO5uwqyssqM
am0a/Wmd1Yv1o/FEKlGXqS15MxuSe/QmMLekteXvL7IidOS4FaK1u9y1JVSS4mWVWOVKS1ersX9GvKV3fmXm7RS+JaNKG9S05MwVJXZJlqBK7M2JlZKK4EYaOrk9yMNWeaTZfw/Uqs1xLePvvSZhBGsM94PmiHRT9WS+JhBehldCS4X9wmnGKVt+pSnJ3STZmq4l3to1u3Go8Sc5YVO079pbEXzdm9GG5mfpU+2P0JHhMY7YjJh6Tb9FL4mOnBydopt9h1sFs7K803/4r9TP+rOcdNCjhJVJNJcdXwR2MLg6dNbk3xbRm0SslZdgiyNR1Cs4xv6seiCpRa9WPRFagpTswv4rOlFfZj0RVwj7MeiMlZmOL1JMLCFST+yuiLeIjfVLoiak+hSMupcQZlapTjbSMeiMHi17KS9yMs6umhSWq9wxCRMqNR4RT+CIUV7K6Iq6sfaRV4iHP5EXLadNON1GPRGNQXJdEUhjIpPRh4hNXSfUuGdzPFQV/Rj0RjcI70lb3ItCtFpXW8xSxGV6ITBvj6zwoxavaPRFZSUdPkVjXutF8OJidS71JbGHK2r+QzeOfDQlTvvMBaJxnLjMzLYiTIx0pb29xLqXLiMOcwuSmY7oGZ6MMt9Q2Y02S2WPEmFrk5jGmWZYjowtcgxkuaJVcJzEPtKOQbuu1CFwlzK5yjZUsRiVws2VbsQ5cirXbryNRDUQvmtvKSk1qmIu+j68ilTTT/jNY6agnUuSnlXb9CF6Or3/QxN3ZYbTJlbhkIfqrJXdhOeum5biZaK3Hj3FIq7sjUjPTlxlqkUklJ3T1fBlastMq3L5ikuL3Is/CPqkk09SBOV3cy4aH2nuRlper6EFHi9WaperPM2zGakgAIMqkAg1+DLS0u+RjuXnokupjZqfCBK+46OCwEt8/RTW7izdw+GhSXorXm95lcjGWsIpUowVoq31MsdxhciYN8CLMJqPUrBvgUm3fUtTlvH6fhUbvqKe8irIpHeP0/GxU1XaazZabsyK0ko5hJHQnfQw1aqhvevJbzUqY1/Z07eJR+mr/AGuPaEmzYqYhyjmhpb1lx95qqbve+pFObi7rfy5mTERSaa0vvjyL/bnmfJJxTWaPxXL/AGMZMJuLui8oprNH4rl/sT1PGdRVkrcDApWehZOWW99DFctmaw3I3cbxMGfgzNSkkk29xiUXUm7fF8hPiQ2KMdLr4spN5p6EJNaJ2S3donJRWX7T3vl2CY6Zx2lJrdquzcXunoYYyy7t5khJWu1rzRzJhlc7WSIk97+BiirvRicmtH8ys4WzFlIx5l2k6czntMMsZu5aczDHfwLT/wCaliJwmGRy0QnNmOT9FCUlzRqI6DMWcjFdcxmViRBhOYtmtqYsze4lPg2IhrC09NVuKPt0LRnw6GJxfErUQlz5Ebt5VzS3BRb1e4NRCXLNot/1L5smj1/QxSit8S8ZKej0lwfM3C4Y6i471zKXL6wdn05hxvrHpyLjPgo2W9X3/QN5ff8AQx3HipZMJ2ZbSK11b4FVHN6u/kXGJF401J+j0ZWs7ejyJm8qst/EoqnBq5Z7EQi5OxlxE0koLct5kUFCOnrPhxRpt8yYwsTlJVk3KsS0kEAgm5amtb8iiLzdlbqajwVcrsgAfivQyZFyrZVsw6LXLwkrGK5ek9bFhJ8RUldlbkT3srczKwtmKp6kXIlUyW5iEmcGJxGXheXBGoqji3Kbu3wIxlb0tOpqXuaxhyzmFm9TJRi27rRLe3uMtLCpLNVdlwXFlcTfS3qcLbi4TOeoZJSVm4b+L7jWinJ2WrZfDwlJ6aJb3wRsTqKzVPfxfP3BnOOoYqmHaV007b0uBijJp3TEJuLunqZsqqax0lxjz9xP8O49Wi1KLS0k+HP3GOMbavobEYKK1Rgz53aW/g1+pZhIRFOcrIyVZqKyR+L5kzkqayxd297/AEMdGF7yl6q+b5GV/tmoSyLM9W9y/UxvLJ8mI1XKTbML3lnxIjtkcWi
3jLpJmONRriWzxe9dDDS8d+jLSnJMrTguDFXMjWOmJjtPjOaX0JbVrrTXmYvGF5SVo6cLmTayUlfjxJmlz+RSM0op9rKTqFxEQmMs7tl3/Ix3XN9CiqLLYjMiLtXcl29SfGckjFnXIt412LBtX1TuveJpJ7+1FFeS04Fsqy6vVcuRcGFJT10GVvf8yPGJbl8SkptmW2S6ju1ZSU295S4uUWjKxaS4oxl4aavcBlhJTWWW/gzFKLiyam70d3zLUp5vRl8HyNDCy/q6vfy5FqsfF9r58DCk2+bLPR6at9pk0gv730DagtNZfQwtmWl3O/rdeJnoUdHPfbcjA6UrXtoVp1ZRd07Gv9THwnNt3e8nxl/WXxNjNCrv9GXPgzXrUJQeu7mXuDpDhxWqKBNovmT39R1KsYLShbtIirkwq0FZXfwKNkyl0Ks1OMEBMUVMkdESIWXbbIJcbbyN5na3lDZME3qY2zJTlwJBPhNX96+ZjhFsyTsnds1cViHKDdN7vWtvLjLO7CcRi409I6y+SOdOo5O7d2Yzdo4OyzVXljy4ssQza31hpUZVHZdeCNnNCj6vpT58vcYq+M0ywWWPYayTb5sMYmfV6lVyd27m1ho2jeppB8Hx9xWNONJZp6y4R7zXrV3N3fTkXz1Peo8bOKk2lk0p8l+pqxlYtRrOPauKMk6Sks0PiiT21EY6LqfYyVHJq9/A19xOYyrfVRzVpvXg1+pSVPxSu9W9z4FqMko597S3GssRJNvffenuNzhyiJ/FXIzOanGMb5bcODK2hPc8kuT3P4lKlNx3qxnxrqUyi470VuTCo1u3cnuLZovenF9mq6EwqgLODuktXa9kVJgXgZJ6R95ipvUvXnrbkX8T9WpaJtq5NWor2tu0JitIrmzXlK92Pw/WxOUcsNObKzlForXfqrlFGK4kiGVOPJjPHkYgRWXxq4JE58yZhL0nqBkw0vStz0Kp5Za+5mO9n7jNiVdqS4osIxTVm0VuXl6UU1vWjK5ebsJqQi5KT/3GZLcr+8rKTe8eKvmS3av5FXK+8hK7sXbUdN7+hfRNPTV6ItN2Xo7mYW2zLTWVPN0LE/BMKqtllu+hWU0tI9TFIiKbdkSVwXM9OmorNP4IlRjT1esuRr1Kjk7sHrJLESbvw5Cynu0fIwi4XCWmt5no4pxVpelHkyiqKWkupSpTcfdzKe+tmeHjNXpv3xNWUWnZiM3F3TszajXjU0qLX2i+p3DUUrEud0Za+FcdVquZrjMw1GJSQwErj8VaCEmG+BfD0sz13LeXzpHbvmXajDdpnnVtystfQ6PvLPb1a97Q/K+88/LDtsl6V07rNuEZpRujzXn6vf7HR95jqbZrS9ldiWhrlqzNLOtj8ReVlovqYMM5KSyq5yZY6b326GWG1qsVZZV8By1SdO2HenOlSd1G8+XBGjWrym7tnKeOm+XQeWz7OgnWqkaMw6tGlKbskbMqkaKtHWfF8jkLa9VRypQS7E+8w+XS5R+feOasJOlaffHRlNt3buxc5vl0+UejHls+zoTlq3xy6RaE2ndM5fl0+zoPLp9nQnLU45dvNGpv0lz4MxVKbi9Tk+XT7OhlW1atrei12ovLVnis7EXa1ijcZb9HzOR5yqdnQr5fPs6CdaspGlZ1p02vcWp15R03rk9UcmO06i3W6ES2jN8I9CctV4pdq0JRcknFr4owHN85VMuX0bXvuK+Xz7OgnVqkaVnVTLqs+Ovv1OQtoT5R6MjzhPlHoOWF4pdyk4t3y2tyZWyk9JdUcdbSqJW9HoRHaNRO/o9By1Ths79XR71pF8eJqnLntKpJ3eXoV84T7OgnVqRpWdvEJ5no9yMVuxnL851ea+ZPnSrz+pZ1ayRpWdOz5EqL5M5fnSrzXzI851Oa+ZOShxWdbxcuRKjZ70vicfzhP+70I84T/u9By1OKzt1VFPf0ReMlKGVcNxw3tKo/Z6CG06kd2XoOWpxWdS5By/OM+UejLedKnDKvchy1Xis
6ni3x095Po7tfech7QqdnQjy+fZ0Ly1OKzqveZvF5ld6NfM462pU5R6Eec6l76dBy1OKzreMS9XqY3K5y3tCf93oFtCaf2ehOWpxS69Ok5di5l5VYwVodTkT2rVlvy27EY/Lp9nQvLU4rOm3cg5vl0+zoPLp9nQnLVeOXSIOd5dPs6Dy2fZ0HLVeOXRMlOs1pvXI5Xls+zoPLZ9nQvLU45deVJNXh0Me73nOhtCpF3VugqbRnJ3aj0LzVTjs6lHEyi+a5Gw6cKvq+jLkcHy2fZ0Cx01y6Dmqk6U/jqVKUouzRG40/PFVqzyv3owPHTfLoXmosadv104QcnZGevNRWSPxZyKe0qkb2y69hR46fZ0JzVOOzWAB5HoAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAH/9k=",
+ "text/html": [
+ "\n",
+ " \n",
+ " "
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "execution_count": 7,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('e3tGQykFC5M', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "79cfdd4f-a214-4c13-aba8-c40067c1faf5",
+ "metadata": {},
+ "source": [
+ "# Learning Objectives"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f18e5709-a21d-4fc0-913b-fa1ad90beec1",
+ "metadata": {},
+ "source": [
+ "By the end of this module, you will be able to: \n",
+ "- Understand decision trees\n",
+ " - Define what a decision tree is and explain its role as a supervised machine learning model. \n",
+    "- Prepare and explore data  \n",
+ " - Load, inspect, and preprocess datasets using Python libraries. \n",
+ " - Separate data into features and labels, understanding the significance of each. \n",
+ "- Train and visualize models\n",
+    "    - Create and train a decision tree model using the scikit-learn library.  \n",
+    "    - Visualize the structure of a trained decision tree and interpret its decision-making process.  \n",
+ "- Make predictions and evaluate performance \n",
+ " - Use the trained decision tree model to make predictions on new data. \n",
+ " - Compare predicted values to actual values using visual tools and calculate metrics such as Root Mean Square Error (RMSE) to assess model accuracy. \n",
+    "- Apply to real-world data\n",
+    "    - Implement decision trees on real-world datasets, such as COVID-19 data from California.  \n",
+    "    - Understand the practical implications and limitations of using decision trees for predictive modeling.  "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "17ef06ff-a9e9-4d29-8bd5-dc7c3fbf994f",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDBoYFhwaGRoeHRofIjAlIiIiIzAlJScyLikyMC0tLS81PFBCNzhLPSstRWFFS1NWW11bNUFlbWRYbFBZW1cBERISGBYZLRoaL1c2LTlXV1hXV1dXXldXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXXVdXV1ddV1dXV1dXXf/AABEIAWgB4AMBIgACEQEDEQH/xAAbAAEAAgMBAQAAAAAAAAAAAAAAAgMBBAYHBf/EAEoQAAIBAgIDCQsLAgYCAwEAAAABAgMRBCESMVEFE0FSYXGRktEUFSIyNFNzgaGx0gYHFhcjM0JyssHwYuE1Q1SCk6Ik8WOD4kT/xAAZAQEBAQEBAQAAAAAAAAAAAAAAAQIDBAX/xAAgEQEBAQEAAQQDAQAAAAAAAAAAEQECEgMhMUETImFR/9oADAMBAAIRAxEAPwDz8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAF/cr2x6R3K9sekFUAv7le2PSO5Xtj0gqgF/cr2x6R3K9sekFUAv7le2PSO5Xtj0gqgF/cr2x6R3K9sekFUAvWEk3ZNNvUi3vXW4kurLsEK0wbneutxJdWXYO9dbiS6suwFaYNzvXW4kurLsHeutxJdWXYCtMG53rrcSXVl2DvXW4kurLsBWmDc711uJLqy7B3rrcSXVl2CJWmDc711uJLqy7B3rrcSXVl2CLWmDc711uJLqy7B3rrcSXVl2CFaYNzvXW4kurLsHeutxJdWXYIVpg3O9dbiS6suwd663El1ZdgStMG53rrcSXVl2DvXW4kurLsBWmDc711uJLqy7B3rrcSXVl2ArTBud663El1Zdg711uJLqy7AtaYNzvXW83Lqy7B3rrebl1ZdgStMG53rrebl1Zdg711vNy6suwFaYNzvXW83Lqy7B3rrebl1ZdgK0wbneut5uXVl2DvXW83Lqy7AVpg3O9dbzcurLsHeut5uXVl2ArTBud663m5dWXYO9dbzcurLsC1pg3u8+I81U6k+wd58R5qp/xz7AlaIN3vTX81PqS7BLcqurXpzV3ZXhJXexZawVpA3u9GI81U6k+wd6MR5qp1JdgK0Qb3ejEeaqdSfYQq7m1YK84uCeScoyivagtagL+5Xtj0juV7Y9IKoBf3K9sekdyvbHpBVAL+5Xtj0juV7Y9IKoAAAAAAAAAAAAAb24fluG9PT/AFo9llJ5Wz9djxvcJf8AnYX09P8AWj2icIXUW83mkybZ7Ch10rXvmr7UO6Y7X0M2XTiuGxiroQV5NpdPuLgo39Wvna1/bYx3THa+hmyoRsmnk9RCM4NXUsr2vcqKXiEs87bbMd0KyeduZ/zhLlKD1SXShpQ4y6UBXOqkk883YzOrZNu+TsTU4cZbNaMaUOMulE0Vd0KyedmY7pjtfQy5Tg7Wks9WaMqUON7UMFLxCT4TCxKfBLoL1KHGXSgpQ43tRRVCsnt12I91R/q6GXtwvr9phzgnZvPnAhKqlFyzsunoLYtaLk72RHfIcZdJKNeCyuAjVg1dXtexlVYNXzy5GFiobVt1me6I7V0ky/aoutBXvdMzGrB6r58jRlYmO1dJjuqG1dIEd/ha+jLoM79DZIk8THatdtYeIiuFdIEsrXs9VyDacVJXz25Eo14vVYhOtB5NrpAqhWUnJK94uzya4Lk7jShlnr1co0oXtfPnKhcXJqKIzcYq7bQEW3tM3JNRWtmLxtpXy2gRu9ole2RZTUZK6d0QhOnKegm9KzdrNanZ2ds8wMRvbPWYbd9eRRHdGi628py0r2vbwW1rVzdnTVnr1Mambm/DXhjE4uSzSLI4pPh4Lmr
ufH7BHyp1Lpu9nOcoq7skk7N9B4+vU6zMXya3yk3enCMKNHyirK6fFjeyfO/2PiY/cHEaO/SrVJz1tuTbT2ouowlit13ZLwF4GlqsoK3tZ2U1V0N7tDSaavZ6N7dNj0+lx+t08nJfJH5S13Wjhq7lUi9UnnKHO+Fc53E3ot8p5jupTq4DHJw0d88ZKK8F3y0bbGelRrxrUoSX4oqXSrnP1M8fdrNqdKfi8qb9xyfzlTvhKa2Vl+iR0lOp4TvwQa9Zyfziu1Cmr3vOL/6zRjju7kNefgA9KAAAAAAAAAAAAAAAAAAA39wfLsL6en+tHtjpxclJx8JameJ7geX4X09P9aPZMRTq6UnG7WVkpaPOud7TPXXjlXMrakr60RqU1K10/U2n7DXSaoNVm1e6dndpN5K/M9ZG1Pe4y05aPi34cr8nOXNvum/LbayKqeFhGOjGKUb3stV3mUKnT3pNzm4Phu88mu0zvdJqU1OVm87P+3KUXdyx2B4WD1xNejSoytGFSba5WzEt4bbc/CbzsunLOwF6wcLW0VYRwVNaopMppYejo6UZSai87N/+yVCFJyWhK8ksr55cOz2gWPBQvfR1K3IFgoXvo5lMI0ajSU3pWtldZW1P1Et5pp23yd/zf25ALu5IcVDuWPFRToUmlFVJeNfXm75EqUaalGSnJu3K76wLe542tYPDp61fn/nII1YaWt3l0bC8DXWFjxUZ7mjsLwBQ8NHiodzR2F4A1+5Y8VBYWPFRsACjuaPFXQO5Y7EXgCjuaOxGFhI2too2ABrrCx2GVhY8VF4ArVO2ojOipK0ldcpnExm6clTajO3gt6k+hlGHjiFTSnKDqXd3wWv4OpLg5EBdKimrPUHRWjo2ytaxFqtwOn0PtJ1tPQloJOpbK7sr8vIBmnTUVZLIhDDxU9NJ6Vmr3b1u7yvtIx37RjfQUs9JZtcli2lp2eno34NG9tXKBpR3JpKtvtnfScrX8FN62b09rWpP+ew1cOsRvM98cd9z0dHVqy9t/VY+TuPTxCqT0980NB6WnfxrcHtNe+44Xn0+s5zPlfh8UtB21a17z4irOp4KWVrdGb9Z86jurVVCFqWWgs9L+lP3MhTx9aENKFOF8nF6XJrfWt6j5/h3uSNdXdjGHx8MHjKVaSavffOZpWkl/NR2+/Rtvu+pw1pHmeJwmJm3viUmlZPSVkln0Zo+tD5QTw2GjSnRTqaNotu6tmrvoPf6Xtk1rM3Mae62Mjit0lJ1N7p6cY75e2ik82nwZ8J3zqQhvdGm1mlaz1R29B5TON4tvXrNnc7detQknCbairJPNW2ciOPrc738NY9YhFZtauw4n5wb71T5Kn7SNjc/dluN03aebT4HwnxvldinUpwvq0r+x9pyzrN6zMTyzXKgA9TQAAAAAAAAAAAAAAAAAAPobgeX4X09P9aPbdJXtfPYeJbgeX4X09P9aPa97SblbN8ICcrX1vK+ohTqqTtaS4c42RYlm82VzU7O1r3yu8rcHBrAzOrorVJq9skTg7q+a5GUNVv6Olh79n4nSwNjRFjXkq22F9mexW/cnRjU/Hb/AGtgW2CiHHlY0eVgNEi6Udi6ES0eVjR5WBHeY7F0IbzHYuhEtHlY0eVgYVKOxdCJW5TGjysaPKwM25RblMaPKxo8rAzblFuUxo8rGjysDNuUW5TGjysaPKwM25RblMaPKxo8rAzblFuUxo8rGjysDNuUW5TGjysaPKwM25RblMaPKxo8rApliEuCfqjcnGpdXtLoz6A5eFa+Vtd0QdTJvO99Wks+UC2ckk23ZLNsq7oyT0Z5q9rZkJVpKcUleLWctJZa+Do6TMq9lJ55OyzzfLbYBbCrpX1q21WFV5SWfit3tkUyrtOmrNqd9J3yjle75ydaXgTs9UXw8gHhyxU7JXWXIuwjKtJu7fsRWAJabG+PaRAFjrzatfIiqj2kQBs0sfVgrRm0vURr4ypUVpycks+AoBPHLYkAAVQAAAAAAAAAAAA
AAAAAAfQ3A8vwvp6f60e172k5Su89a4DxTcDy/C+np/rR7Uoy0m3LwdlgMuaTd2lq1lc1StaUlrvnLhJ1FtjpL1fuVuEXrpfpAreHo28a3A3pZ35egk6NGTtdXWVtLZ/PYTcU73pa3d6u0aKvfes9uXaBX3PRs/CTX5tXN7RvFF8PBxv5sZPQj5n2R4P/AGNCPmtX5QKo0KMbrT15Zyvs7DLoUUr6WW3S/m1Fm9xvfec9vgmN6j5n3dvIgKU8Pm9NZq2szBYe91PNW/E3bUk7dBZvEPMLoiSjTitVFL1R5/2QFFGnQnL7OV3bUpP+cJesFC3D0inCMHeNHRfJZfuWb4+JLpXaBCeDhLXpdNuBL9kYeChnrz2Nlm+PiS6V2jfHxJdK7QIPBwe3VbW/5wiWCptttO72OxPfHxJdK7Rvj4kuldoE4QUVZEirfHxJdK7Rvj4kuldoFoKt8fEl0rtG+PiPpXaBaCrfXxH0rtG+viPpXaBaCrfXxH0rtG+viPpXaBaCrfHxJdK7Rvj4kvZ2gWgq318SXs7Rvr4kvZ2gWWFlsK99fEl7O0b6+JL2doFlhZbCvfXxJeztG+viS9naBZZFeIX2c/yv3DfHxH7O0hWm3CS0H4r2bOcDwgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAH0NwPL8L6en+tHtKlLSaa8Hgf8AGeLbgeX4X09P9aPbnqAor4nRhpaLeeja1m3ey17XZes1++as2oS/p/qyWj06SN6Uktbtd29b1FKxtLzkemz6AK1jfBlLRVk0l4Wu8dJXustaJQxkW6at499WpbL89mSWMpWupxe22dr5Z2JQxNOXizi9SyafMBqR3TvbwFqv43Ins5c9hKW6STtor1SvwJ5ZG061PjR6eWxjumnx49KA1obpJtJwyds07paTSSex5/y5ivui6bd6astLVJ/ht/Tl4xtd0U+NHpDxNPjx6QKIY1ybSgna2qT4basuU3SnuimvxRyy1oz3RC19ONucC0FPdNPjrp2ax3TT46y15gXAqWIg9U49Jnf4cZZ8v82gWAp7qp8ePSHiaa1zj0gXAp7qp8ePSZeIhn4Sy15gWmCrumnx49Jl4iHHj0gWAq7pp2vpxtzme6IcePSBYCt4iC1zivWO6IXtpK97dOoCwFfdEOPHpMLFU2r6cekC4GE7q6MgAAAAAAhW8SXM/cTIVvFlzP3AeBgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA+huB5fhfT0/wBaPbnqPEdwPL8L6en+tHtz1AQrXtlDTd9WS9eZrb3rXc8NX9O21tRulb07ZaN9L2X94GtaSvbDxz15xMqMksqEE2881wPX7yVqy/FDk5yVqtneUVssuXh9VwKpRdvJ4PZmtt9n8uS0W1nQjzXXYTg6nDKm808r6uEjavwSp29ewDFpWadCNuDNWeez2kfC/wBPHpiWVFVtlKCd/U1bhy13uVR7pd05UU9iTb19gGzCjHW4RUms8l6zKowz8GOfJsNeEcRfw509FcVO+rhvwE2q3BKn0MC1UIL8EehDeIcSOfIip79d2lTtwXuPtuF09XLr4ALlRjxY9CMKjBaoxy5EVPfbeNT9ttfYPtrK0qfDfXyW/cC3eIXvoRvzIbxDiR6EVQVbhlDVwbf5YxHfdK2nTyTulr1Zeq4F28Q4kehGd6jn4Kz15aym1a2unfPgfqGlUTznD3bLfzlAtVGHEj0IKhBK2jG1raiiE6t7adN+97cuknarneUErNLneoCzeIWtoRtzIbxDiR6EV/a2ylB58urtI2r210+hgXOjB64xfqRneo69GN1qyRX9pfNwtb2291yM99t40E7+q1gLVRgtUYr1IbzDix6EVPfeNT1Z6zM1V/DKF7cKy1cFgL0jJhas9ZkAAAAAAEK3iy5n7iZCt4kuZ+4DwMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAH0NwPL8L6en+tHtz1HiO4Hl+F9PT/Wj256gMlOKS0f
Ci5q+pe8uKpVJWuoNu9rXSutoGtTVO6tSmtFK2T4NXCVwhSurUp+tPK/rNqdapwU7/wC63CI1pu/2bWzPXmBRvdJxzpTyy1Zrh26szCVJtN0pq/g5p8Pg7dhsurU8K1PUrq8tbtqKZYuslfueT5pICuCpa1Rnmtj4ctvIJwpvXSqZbFr4Npb3TW8w9fHVydOtVabdLRd8lpK+rNgUyp07NbzPK+rhvLnM6NN6UnSnw3y13eeVy6FabdnTaW26Y3+pb7p6+MgNZxpeaneztk87Ln5DOjTsvsqng5cN/fnrNmVWd3anln+JZ7DCrT80+stgFChTjkqU7a9T7eUxamn91O715PbfaXuvU4KT9cltMyr1OCk3lfxl0AUb3Sye9Tz1ZN29uWsxWpUpq8qU2lz8F0nZPM2VVndfZvVm9JZZfxGN+qeatzyWy/vyA0+5aFrbzO2lqs9dtevkJyhSejejPJZZPKyNl15+ab/3LaFVqW+7/wCwFUdz6Mk5KDi5J3zafhKzMPcmhxPa+H1l8qs8rU75cZZchiFebavSaW26e3+wGZYOm7XjqVlm8kZ7khe+jnq1vZYiq1Sye9O+zSWRjuipe28vrIA8DSf4fayXckLt6Ou983w6zG/VL/dPn0kJVqi1Ur/7kBlYWHF2rpv2slSoRhfRVr6yDrVPNf8AYOtU81/2QF5kqozlK+lDR9dy0AAAAAAEK3iS5n7iZCt4kuZ+4DwMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAH0NwPL8L6en+tHtz1HiO4Hl+F9PT/AFo9ueoDJTiWtHObguMidSejbJu7SyV9Zq93xazp1OZwAjeNr90Ste2seDr7olsefCSWMg07Up5Z5wtraWXSSjiYNP7OSUeBwtwPVt1AQlo6V9/kr6lfLX/EWrDSs06s3fhyy6Bv8MvAeq/i8/RqHda4s+gDCwjV7Vaivy395l4WWVqs1678N/3MxxcXbwZZu3i83aJYtL8M7fl5Wv2ALDvK9Sbty6+cx3K/OT6TPdSvbRn1f5sMd1rizvst++oB3K7332fSZ7nd775LXqEcUnfwZZX4NmzpEcWm7aM1/t2gYeHlZ/ayz9nMO5pedlazVuflM91q19GfVtt7B3WrX0Z8nggI4Zq/2k3zvVlYj3LLz0/YSWKVr6Mlle1uVL9wsWuLO1r3sBl0Hn9pLN9HMYeGfnJ9JmOKTV1GWu2oxDFp28GavtiAWGeik6km0734dVhLDyf+bNZLZwGFjE/wz2+KI4yL1Rn1QMxw8k098lk9T9zI9xvztRrgVyTxSsnozz/pMRxibS0Z5uy8H+WAzLDyf+ZJZW/uYeGlwVJmY4pP8Ml6hLFJa4ytyK/BcBGhK6bqS5uAx3K8/tZ3ate/ttqJSxNvwyte10ufsMTxSWuMujm7QLzJCFTSbVnltRMAAAAAAEK3iS5n7iZCt4kuZ+4DwMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAH0NwPL8L6en+tHtz1HiO4Hl+F9PT/Wj256gIVlK3gyUXy8xSlVaf2kM1k0tXKVbtY+jhqO+V4ylBSStFXd3q4T4K+V+5/mqud/wLhVn+IDpLVM/tI5rZqdrZevMR3yz8OF3aztl0HNfS7c/zVXh/AuHX+LkMv5YYC997rareKtXWKOjW+PVUhda8h9pb7yGvZyHOr5Y4Bf5dbWn4q4NX4iK+Vu561Uq3V//AEB0l6jirVIatdrp/wAVhLTu/tYrPJZcuXu6DnH8r9z2rOnWtnlorh1/iH0u3P8AN1uqviA6TRqvVOPR0jQrcePQc/T+W2CirRhWS/IviJfTvB8Wt1F8QHQShU0m1NWvqaI6NbjR6Og+D9OsHxa3UXxD6dYPi1uoviIOgjGrwyjrXBwXz/YxCFVPOcWubM+B9OsHxa3UXxD6dYPi1uoviA+7oV+PDqmdCtbx43vrtwZ+3UfB+nWD4tbqL4h9OsHxa3UXxAfecK1/HjbP8PQZcat3aUb
cGWZ8D6dYPi1uoviH06wfFrdRfEB99Rq5+HG/BkFCrn4cb3y8HgPgfTrB8Wt1F8Q+nWD4tbqr4gPvONbK046lfLpMuFXK01y3XLn7D4H06wfFrdVfEPp1g+LW6i+ID7uhW48ej3k1Crl4a155HP8A06wfFrdRfEPp3g+LW6i+ID7yp1rfeLq8j1+uxbRUkvDab2pWOc+neD4tbqL4h9O8Hxa3UXxAdOD5W43ygo41zVFTTgk3pJLXe1s3sZ9UAAAAAAEK3iS5n7iZCt4kuZ+4DwMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAH0NwPL8L6en+tHtz1HiO4Hl+F9PT/Wj256gOc+Xv8Ah79JD3nnB6R8vPIH6SHvPNy4gACoAAAAABfgcLv1TQ0lHwZScmr2UYuTy5kUG1ubiI0qulO+i4Ti9FXa04Sjeza2gSnubNqDpPfo1G1Fxi07xtpJxerWnzEIbnV5SlBUajlDxlou6vqN/C7rU6MVRjGTpOM1KUoxcnKejnoO6stBK187sxX3RpVae9VHU0YzjOEoQhF2jDR0dFO0VsedijTludV1whOaUIzbUWraSvwmXuZVc3GnCdS0YydoNW0oppZ8/rN1bq0ZJKpGcoKnCLpuMXFyhT0dJSveLvwrg4DNbdSjVSjNVIqM6dROKi23CnGDTu1bxbp8uog+bHAVnTdRUpuCveWi7ZOz6LO5ZgcFGspt1lBwhKck4OXgxtd3XPqN2W7UZVaVSUJLR39yStb7ZzatzaSvzHz8DiFTVZNN75QnTVuBytZvkyAnLcutdaNKcoyzhLQa0la6dtlsyENzq8pSiqNRyh4y0XdchvU91oKpJuL0ZYaFFtxjNpwUc9GWTV46uUuju3FrRk5eDUjOE94pSeUFFLRbtG2irNMK+ECU5Xk29bbb9ZEIAAAAAAAA7D5ufvcV+Wn75ndnCfNz97ivy0/fM7smtAAIAAAEK3iS5n7iZCt4kuZ+4DwMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAH0NwPL8L6en+tHtz1HiO4Hl+F9PT/AFo9ueoDnPl55A/SR955wej/AC9/w9+kj7zzguJoACoAAAAAAAAA3dyKcZYiCkk1aWT1ZRbPr7zDiQ6qDn16mc7Nc2DraG5e+RvCEG29Wilqtw2twko7jt5b3C/CrRLE/Ln+OQB1tTcpxi5OlCyV9Uf5wFNbCRg0nCGaT8VcKvsEPy45gHSww9N3ThHxZfhXBFs5pEa47zr4AAGwAAAAAAAHYfNz97ify0/fM7s4T5ufvcV+Wn75ndk1cAARQAACFbxJcz9xMhW8SXM/cB4GAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD6G4Hl+F9PT/Wj256jxHcDy/C+np/rR7c9QHOfL3/AA9+kh7zzg9H+Xv+Hv0kfeecGsTQABAAAACWS4OAsEQZutntCa2e0QX4HE71VjO17Xy501+5v99ocWXsNfHYqlOnCMIWktf86DSus8vaTna59cZ18vrLdmOye3X/AHJR3Yi3+Nev+58e62e0RklbI0n4uX1nuzHhU+ldoe7EHrjN9HafJussvaYutgPxcvrrdiCvaEr2a4OFNfufHLsPUhGac4aUeFXtfLaQlJXbtr1chPuO3PpZzx5ZqAJXWWXtMJrZ7REYBl2t0mCAAAAAA7D5ufvcV+Wn75ndnCfNz97ify0/fM7smrgACKAAAQreJLmfuJkK3iS5n7gPAwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAfQ3A8vwvp6f60e3PUeI7geX4X09P9aPbnqA535eeQP0kfeebno/y88gfpI+884NYmgACPvUMHS3mi3CLcqak29rLaWApydlCC1vNZZK79xqUd06apU4NSvCCi7K97essjurTi7pzi1yWfvK8vXnW69xbNJ0Y5uydlbXYPcfJfYroWy/7mr39XnKv8d9pl7uJW+0q5r+cPIU/b+rXuXBad6cE4JNprbq95V3HS8
3HoIPdqGa0qmdr5a7ar5kO+lL+voXaE3z/rU3YoRpzhoJJOF3bbdr9jQNzdPFxrSg4p2jDRz52/3NMy9PFmUAAaAAAPtbn4WnLDxlKCcnKV2+S1j4p9LB7pRp0o05Rlk27q3Dbl5C4x6lns+jSwFOUoxVOF5NJZbWXy3EstJUoOPGyS1X4c9R87vtBWejUXCskv3Jd/Ftq7df9yuGef3X0XuHa63qLs8tW1r9iC3IjpaLpwT0XJXs7pc3DkzS79q171OnP38pHv3H/wCTVbWtWzWF/f8Aq/uSl5uPQaO6+HhCNJxiouWle3Jo295b32p8Wfs7TV3Rx0a0YRjFrQvr4b27CLx5+Xu0QAR6HYfNz97ify0/fM7s4T5ufvcT+Wn75ndk1cAARQAACFbxJcz9xMhW8SXM/cB4GAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD6G4Hl+F9PT/AFo9ueo8R3A8vwvp6f60e3PUBzny88gfpIe884PR/l7/AIe/SQ95zPyc3JoV6E51YuUlU0V4TWWinwc5rE1zxmKv7zsO8mCu1oO62zmllrs75hbi4F8C/wCSXaByeHqRhNSs3Zl+6OLjWqOSjo8isjpo7i4J6op3yyqPtJQ3AwctVN5O3jS4NfCT2tSONy5RdbHq4NpfulRjTxFWnHxYTlFcydka8Vdo1RZQqqE4y0VK2tSzTyI1JRlJu1ru9lqS2I7Wp8ncJFtb28v65dpr96MDZtwaS2ymnqvqvsJ9108t8fFyDtZ69Zg7F7kYFcC/5Jc20zHcXBS8WOk7Xspybt0luMRxoO0huDhJJSVN2auvClw+s5XdOhGliKtOPiwm4rhyTINUA7vE7k4Sm7dzxedkv7t8gRwhNNK2VzsFhcE7WoxzV9Tyvkr+szLCYLzMcsvFlwZirHN47dCNWMFoW0VbZ/P7GimssufPWdksDg3FSVCLTko3s1r5+DM+f8ptz6NKhTlTpqEnU0XbZotk59vZJHO6SyyyVzCa2e0wdJ8mNz6NahUlVpqclU0U3wLRT/c1R8KWITp6GhFeFfS/FzcxVk75WOxWCwVm3QistLU7tWvdWJvA4Jf5UVfLxZfzgfQyZHTvrepfpxIO1jgcE9VOGrS1NZbSnHbm4Z4OrVp0oxap6UXwrNcoYjPzc/e4n8tP3zO7OE+bn73Fflp++Z3ZNUABAAAAhW8SXM/cTIVvElzP3AeBgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA+huB5fhfT0/1o9ueo8R3A8vwvp6f60e3PUBzny9/w9+kh7zmPk9uxRw9CVOq2pOppK0W1bRS4OY6j5eeQP0kfeeblxNddLdvAttuUs/6Z+u2y/DtMd+cDxpbfFn/ADZ0LYckCldct28DdO7utT0J8+zlLKfyhwcb2nPN3fgTeb50caBCtndGvGriKtSN9Gc5SV8nZu6NdPNMwAjsp/KnDSbdqmf9K7Sjv7g+JU6vJbbsOUAWur7/AGD4lTq/3M9/8Hn4NTNWfg610nJgQrsF8pcMlZKpl/T/AHOY3RxCq4irUimozm5K+uzZrAAdPU+VkZX0sPe+taat7jmAEdK/lRB68Nr/AKl8I+lMP9P/ANl8JzQItdL9KYf6fl8dfCaO7O7axVOEFT0NGeldyvwWtq5T5AKUPtbhbtQw1KcJwm9KeknG3FStm1sPigI6j6R4XzE+rDh18JlfKTDXvvE7/lh2nLAi11S+U2HTuqVRPV4se0pxvyiozw9SlCnUTnHRV1FJZrYzmwCuw+bn73Fflp++Z3Zwnzc/e4r8tP3zO7GgACKAAAQreJLmfuJkK3iS5n7gPAwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAb+4Hl+F9PT/Wj296jxDcDy7C+np/rR7e9QHOfLzyB+kh7zzg9H+XvkD9JD3nnBcTQAFQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAdh83P3uK/LT98zuzhPm5+9xX5afvmd2TWgAEAAAC
FbxJcz9xMhW8SXM/cB4GAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADf3A8uwvp6f60e3vUeD4PESo1adWKTlTmppPVeLur8mR1P1jY3zWH6s/jA9Hx9OlKFq8IzhdZSjpK/BkaXe/AZf+PQz1fZLsOF+sbG+aw/Vn8Y+sbGeaw3Vn8YHddw7n5/YUMtf2S7OUytz8A9VChrt90uHVwchwn1i4zzWG6k/jH1jYzzWG6k/jA7lYLc/wD09Dl+yWXsM978Da6w1Fr0S4fVyHCfWLjPNYbqT+Mz9Y2M81hurP4wO6WB3Pa0t4oW270uTk5UYlgcAteHof8AEuw4b6xcZ5rDdSfxh/OLjH/lYbqT+MDu3udgFf8A8ehlr+yXYYWBwD1Yei//AKl2HC/WLjPNYbqT+MfWLjPNYbqT+MDuo4HAO/8A49HJXf2Sv7jHcW5/mKH/ABLsOG+sbGeaw3Un8Y+sXGeaw3Un8YHcVMJudHKVHDrK/wB0uTk5UYlhtzla9Ghnq+yWedthw6+cTGeZw3Un8Y+sTGa95w1/yT+MDuZYPAKeg8NR0s/8pcHq5CEaG5zv9hRSWu9JbUtnKcV9Y2M81hupP4zD+cTGeZw3Un8YHbbzubl9jQz1fZLsLnudgFrw9DXb7pcHqOD+sTGXvvOGv+SfxmfrGxnmsN1J/GB3TwO59r7xQte33S2X2GO4dz7N9z0LLX9ksvYcN9YuM81hupP4x9YuM81hupP4wO67gwFr9z0LXt90uD1CW5+AWvD0ddvulweo4X6xcZ5rDdSfxj6xsZ5rDdSfxgd08DgF/wDz0P8AiXYTp7mYGbajh6Dazf2Uew4L6xcZ5rDdSfxj6xsZ5rDdSfxgeg95MJ/paH/HHsHeTCf6Wh/xx7Dz76x8b5rD9Wfxj6x8b5rD9Wfxgej4XAUaLbpUqdNy16EVG9tV7c5snl/1j43zWH6s/jH1j43zWH6s/jA9QB5f9Y+N81h+rP4x9Y+N81h+rP4wPUAeX/WPjfNYfqz+MfWPjfNYfqz+MD1AhW8SXM/ceZfWPjfNYfqz+MxL5xcY01vWHzVvFn8YHIgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD//Z",
+ "text/html": [
+ "\n",
+ " \n",
+ " "
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "execution_count": 3,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('_kAjJ8rJwfU', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "faaa7b6b-4339-4f7f-ac74-ce3c67306fe9",
+ "metadata": {},
+ "source": [
+ "We are going to be working with **COVID data** from the 58 counties of California during Summer 2020 (July, August, and September). \n",
+ "\n",
+ "**Remember the complete dataset with 58 counties from the previous video of this workshop?** \n",
+ "\n",
+ "Let's now imagine that we did not know the **cases per 100,000 people** for the last 18 counties of the dataset. \n",
+ "\n",
+ "![Features-for-Prediction.jpg](images/Features-for-Prediction.jpg)\n",
+ "\n",
+ "**The objective** of this exercise will be to make predictions for these missing values in the column **cases per 100,000 people** based solely on the data that we do have available.\n",
+ "\n",
+    "The information that we still have available for these 18 counties is:\n",
+ "\n",
+ "* Population\n",
+    "* Vaccination Percentage (Partially and Fully Vaccinated)\n",
+ "* Unemployment Rates\n",
+ "* Partisan Voting Percentage (Democrat, Green, Republican, Libertarian, and Other)\n",
+ "\n",
+    "In order to do this, we will be creating a **DECISION TREE**."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "347373f9-cd3b-409e-87f6-2237f7ab6a5c",
+ "metadata": {},
+ "source": [
+ "## WHAT IS A DECISION TREE?\n",
+ "\n",
+    " A **Decision Tree** is a supervised machine learning model that allows us to make predictions by learning simple decision rules inferred from the information available in the dataset.  \n",
+ " \n",
+    "- A Decision Tree is called a **supervised** model because we know exactly what we want to figure out, and the model learns from examples where that answer is already known. For example, for our Decision Tree, we will specify that we want to figure out the missing values of the column **cases per 100,000 people**, and our model will try to find these values by making predictions for them using the information we do have available.\n",
+ "\n",
+    "- In contrast, in an **unsupervised** model, we do not know exactly what we want to predict. Instead, an unsupervised model finds hidden relationships between different types of information and can group them based on similarities. One example is Netflix surprising you with a new show you like.\n",
+ "\n",
+    "A **Decision Tree** can be pictured as a tree-like flowchart, where we start with a particular criterion and, based on whether it is True (Y for Yes) or False (N for No), we choose only one of the branches. This process is then repeated at every decision until we reach the bottom of the tree, where we end up with a specific prediction. \n",
+ "\n",
+ "![General-Decision-Tree.png](images/General-Decision-Tree.png)\n",
+ "\n",
+ "We will see how a Decision Tree can help us predict the missing **cases per 100,000 people** in more detail later on in this tutorial.\n",
+ "\n",
+ "You can find more information about different ways to classify machine learning models here: [Machine Learning Models](https://www.geeksforgeeks.org/introduction-machine-learning/?ref=lbp)\n",
+ "\n",
+ "You can find more information about Decision Trees here: [Scikit-learn](https://scikit-learn.org/stable/modules/tree.html)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "id": "d96c650c-4a96-486d-99db-16dba7ed9317",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "
"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "application/javascript": "var questionsMpQrXvjLDoZX=[\n {\n \"question\": \"Which of the following statements is the best description of the project you are working on here?\",\n \"type\": \"multiple_choice\",\n \"answers\": [\n {\n \"answer\": \"We are creating a model to better understand what determines the number of COVID cases in a county.\",\n \"correct\": false,\n \"feedback\": \"While we may gain insights into what is correlated with COVID cases in a county we will not know for certain what determines it. Please try again.\"\n },\n {\n \"answer\": \"We are creating a model to be able to predict the number of COVID cases in a county.\",\n \"correct\": true,\n \"feedback\": \"Correct. Given data such as population, vaccination rates, and other information, our Decision Tree will predict the number of COVID cases for that county.\"\n },\n {\n \"answer\": \"We are learning about the biology of COVID transmission.\",\n \"correct\": false,\n \"feedback\": \"There is no knowledge of disease transmission in this project. Please try again.\"\n },\n {\n \"answer\": \"We are trying to determine whether SF county had more or less COVID cases than LA county.\",\n \"correct\": false,\n \"feedback\": \"We can verify this from the datasets themselves but this is not our aim in this project. Please try again.\"\n }\n ]\n }\n];\n // Make a random ID\nfunction makeid(length) {\n var result = [];\n var characters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';\n var charactersLength = characters.length;\n for (var i = 0; i < length; i++) {\n result.push(characters.charAt(Math.floor(Math.random() * charactersLength)));\n }\n return result.join('');\n}\n\n// Choose a random subset of an array. 
Can also be used to shuffle the array\nfunction getRandomSubarray(arr, size) {\n var shuffled = arr.slice(0), i = arr.length, temp, index;\n while (i--) {\n index = Math.floor((i + 1) * Math.random());\n temp = shuffled[index];\n shuffled[index] = shuffled[i];\n shuffled[i] = temp;\n }\n return shuffled.slice(0, size);\n}\n\nfunction printResponses(responsesContainer) {\n var responses=JSON.parse(responsesContainer.dataset.responses);\n var stringResponses='IMPORTANT!To preserve this answer sequence for submission, when you have finalized your answers:
Copy the text in this cell below \"Answer String\"
Double click on the cell directly below the Answer String, labeled \"Replace Me\"
Select the whole \"Replace Me\" text
Paste in your answer string and press shift-Enter.
Save the notebook using the save icon or File->Save Notebook menu item
Answer String: ';\n console.log(responses);\n responses.forEach((response, index) => {\n if (response) {\n console.log(index + ': ' + response);\n stringResponses+= index + ': ' + response +\" \";\n }\n });\n responsesContainer.innerHTML=stringResponses;\n}\nfunction check_mc() {\n var id = this.id.split('-')[0];\n //var response = this.id.split('-')[1];\n //console.log(response);\n //console.log(\"In check_mc(), id=\"+id);\n //console.log(event.srcElement.id) \n //console.log(event.srcElement.dataset.correct) \n //console.log(event.srcElement.dataset.feedback)\n\n var label = event.srcElement;\n //console.log(label, label.nodeName);\n var depth = 0;\n while ((label.nodeName != \"LABEL\") && (depth < 20)) {\n label = label.parentElement;\n console.log(depth, label);\n depth++;\n }\n\n\n\n var answers = label.parentElement.children;\n\n //console.log(answers);\n\n\n // Split behavior based on multiple choice vs many choice:\n var fb = document.getElementById(\"fb\" + id);\n\n\n\n\n if (fb.dataset.numcorrect == 1) {\n // What follows is for the saved responses stuff\n var outerContainer = fb.parentElement.parentElement;\n var responsesContainer = document.getElementById(\"responses\" + outerContainer.id);\n if (responsesContainer) {\n //console.log(responsesContainer);\n var response = label.firstChild.innerText;\n if (label.querySelector(\".QuizCode\")){\n response+= label.querySelector(\".QuizCode\").firstChild.innerText;\n }\n console.log(response);\n //console.log(document.getElementById(\"quizWrap\"+id));\n var qnum = document.getElementById(\"quizWrap\"+id).dataset.qnum;\n console.log(\"Question \" + qnum);\n //console.log(id, \", got numcorrect=\",fb.dataset.numcorrect);\n var responses=JSON.parse(responsesContainer.dataset.responses);\n console.log(responses);\n responses[qnum]= response;\n responsesContainer.setAttribute('data-responses', JSON.stringify(responses));\n printResponses(responsesContainer);\n }\n // End code to preserve responses\n \n for (var i 
= 0; i < answers.length; i++) {\n var child = answers[i];\n //console.log(child);\n child.className = \"MCButton\";\n }\n\n\n\n if (label.dataset.correct == \"true\") {\n // console.log(\"Correct action\");\n if (\"feedback\" in label.dataset) {\n fb.textContent = jaxify(label.dataset.feedback);\n } else {\n fb.textContent = \"Correct!\";\n }\n label.classList.add(\"correctButton\");\n\n fb.className = \"Feedback\";\n fb.classList.add(\"correct\");\n\n } else {\n if (\"feedback\" in label.dataset) {\n fb.textContent = jaxify(label.dataset.feedback);\n } else {\n fb.textContent = \"Incorrect -- try again.\";\n }\n //console.log(\"Error action\");\n label.classList.add(\"incorrectButton\");\n fb.className = \"Feedback\";\n fb.classList.add(\"incorrect\");\n }\n }\n else {\n var reset = false;\n var feedback;\n if (label.dataset.correct == \"true\") {\n if (\"feedback\" in label.dataset) {\n feedback = jaxify(label.dataset.feedback);\n } else {\n feedback = \"Correct!\";\n }\n if (label.dataset.answered <= 0) {\n if (fb.dataset.answeredcorrect < 0) {\n fb.dataset.answeredcorrect = 1;\n reset = true;\n } else {\n fb.dataset.answeredcorrect++;\n }\n if (reset) {\n for (var i = 0; i < answers.length; i++) {\n var child = answers[i];\n child.className = \"MCButton\";\n child.dataset.answered = 0;\n }\n }\n label.classList.add(\"correctButton\");\n label.dataset.answered = 1;\n fb.className = \"Feedback\";\n fb.classList.add(\"correct\");\n\n }\n } else {\n if (\"feedback\" in label.dataset) {\n feedback = jaxify(label.dataset.feedback);\n } else {\n feedback = \"Incorrect -- try again.\";\n }\n if (fb.dataset.answeredcorrect > 0) {\n fb.dataset.answeredcorrect = -1;\n reset = true;\n } else {\n fb.dataset.answeredcorrect--;\n }\n\n if (reset) {\n for (var i = 0; i < answers.length; i++) {\n var child = answers[i];\n child.className = \"MCButton\";\n child.dataset.answered = 0;\n }\n }\n label.classList.add(\"incorrectButton\");\n fb.className = \"Feedback\";\n 
fb.classList.add(\"incorrect\");\n }\n // What follows is for the saved responses stuff\n var outerContainer = fb.parentElement.parentElement;\n var responsesContainer = document.getElementById(\"responses\" + outerContainer.id);\n if (responsesContainer) {\n //console.log(responsesContainer);\n var response = label.firstChild.innerText;\n if (label.querySelector(\".QuizCode\")){\n response+= label.querySelector(\".QuizCode\").firstChild.innerText;\n }\n console.log(response);\n //console.log(document.getElementById(\"quizWrap\"+id));\n var qnum = document.getElementById(\"quizWrap\"+id).dataset.qnum;\n console.log(\"Question \" + qnum);\n //console.log(id, \", got numcorrect=\",fb.dataset.numcorrect);\n var responses=JSON.parse(responsesContainer.dataset.responses);\n if (label.dataset.correct == \"true\") {\n if (typeof(responses[qnum]) == \"object\"){\n if (!responses[qnum].includes(response))\n responses[qnum].push(response);\n } else{\n responses[qnum]= [ response ];\n }\n } else {\n responses[qnum]= response;\n }\n console.log(responses);\n responsesContainer.setAttribute('data-responses', JSON.stringify(responses));\n printResponses(responsesContainer);\n }\n // End save responses stuff\n\n\n\n var numcorrect = fb.dataset.numcorrect;\n var answeredcorrect = fb.dataset.answeredcorrect;\n if (answeredcorrect >= 0) {\n fb.textContent = feedback + \" [\" + answeredcorrect + \"/\" + numcorrect + \"]\";\n } else {\n fb.textContent = feedback + \" [\" + 0 + \"/\" + numcorrect + \"]\";\n }\n\n\n }\n\n if (typeof MathJax != 'undefined') {\n var version = MathJax.version;\n console.log('MathJax version', version);\n if (version[0] == \"2\") {\n MathJax.Hub.Queue([\"Typeset\", MathJax.Hub]);\n } else if (version[0] == \"3\") {\n MathJax.typeset([fb]);\n }\n } else {\n console.log('MathJax not detected');\n }\n\n}\n\nfunction make_mc(qa, shuffle_answers, outerqDiv, qDiv, aDiv, id) {\n var shuffled;\n if (shuffle_answers == \"True\") {\n //console.log(shuffle_answers+\" 
read as true\");\n shuffled = getRandomSubarray(qa.answers, qa.answers.length);\n } else {\n //console.log(shuffle_answers+\" read as false\");\n shuffled = qa.answers;\n }\n\n\n var num_correct = 0;\n\n\n\n shuffled.forEach((item, index, ans_array) => {\n //console.log(answer);\n\n // Make input element\n var inp = document.createElement(\"input\");\n inp.type = \"radio\";\n inp.id = \"quizo\" + id + index;\n inp.style = \"display:none;\";\n aDiv.append(inp);\n\n //Make label for input element\n var lab = document.createElement(\"label\");\n lab.className = \"MCButton\";\n lab.id = id + '-' + index;\n lab.onclick = check_mc;\n var aSpan = document.createElement('span');\n aSpan.classsName = \"\";\n //qDiv.id=\"quizQn\"+id+index;\n if (\"answer\" in item) {\n aSpan.innerHTML = jaxify(item.answer);\n //aSpan.innerHTML=item.answer;\n }\n lab.append(aSpan);\n\n // Create div for code inside question\n var codeSpan;\n if (\"code\" in item) {\n codeSpan = document.createElement('span');\n codeSpan.id = \"code\" + id + index;\n codeSpan.className = \"QuizCode\";\n var codePre = document.createElement('pre');\n codeSpan.append(codePre);\n var codeCode = document.createElement('code');\n codePre.append(codeCode);\n codeCode.innerHTML = item.code;\n lab.append(codeSpan);\n //console.log(codeSpan);\n }\n\n //lab.textContent=item.answer;\n\n // Set the data attributes for the answer\n lab.setAttribute('data-correct', item.correct);\n if (item.correct) {\n num_correct++;\n }\n if (\"feedback\" in item) {\n lab.setAttribute('data-feedback', item.feedback);\n }\n lab.setAttribute('data-answered', 0);\n\n aDiv.append(lab);\n\n });\n\n if (num_correct > 1) {\n outerqDiv.className = \"ManyChoiceQn\";\n } else {\n outerqDiv.className = \"MultipleChoiceQn\";\n }\n\n return num_correct;\n\n}\nfunction check_numeric(ths, event) {\n\n if (event.keyCode === 13) {\n ths.blur();\n\n var id = ths.id.split('-')[0];\n\n var submission = ths.value;\n if (submission.indexOf('/') != -1) {\n var 
sub_parts = submission.split('/');\n //console.log(sub_parts);\n submission = sub_parts[0] / sub_parts[1];\n }\n //console.log(\"Reader entered\", submission);\n\n if (\"precision\" in ths.dataset) {\n var precision = ths.dataset.precision;\n // console.log(\"1:\", submission)\n submission = Math.round((1 * submission + Number.EPSILON) * 10 ** precision) / 10 ** precision;\n // console.log(\"Rounded to \", submission, \" precision=\", precision );\n }\n\n\n //console.log(\"In check_numeric(), id=\"+id);\n //console.log(event.srcElement.id) \n //console.log(event.srcElement.dataset.feedback)\n\n var fb = document.getElementById(\"fb\" + id);\n fb.style.display = \"none\";\n fb.textContent = \"Incorrect -- try again.\";\n\n var answers = JSON.parse(ths.dataset.answers);\n //console.log(answers);\n\n var defaultFB = \"\";\n var correct;\n var done = false;\n answers.every(answer => {\n //console.log(answer.type);\n\n correct = false;\n // if (answer.type==\"value\"){\n if ('value' in answer) {\n if (submission == answer.value) {\n fb.textContent = jaxify(answer.feedback);\n correct = answer.correct;\n //console.log(answer.correct);\n done = true;\n }\n // } else if (answer.type==\"range\") {\n } else if ('range' in answer) {\n //console.log(answer.range);\n if ((submission >= answer.range[0]) && (submission < answer.range[1])) {\n fb.textContent = jaxify(answer.feedback);\n correct = answer.correct;\n //console.log(answer.correct);\n done = true;\n }\n } else if (answer.type == \"default\") {\n defaultFB = answer.feedback;\n }\n if (done) {\n return false; // Break out of loop if this has been marked correct\n } else {\n return true; // Keep looking for case that includes this as a correct answer\n }\n });\n\n if ((!done) && (defaultFB != \"\")) {\n fb.innerHTML = jaxify(defaultFB);\n //console.log(\"Default feedback\", defaultFB);\n }\n\n fb.style.display = \"block\";\n if (correct) {\n ths.className = \"Input-text\";\n ths.classList.add(\"correctButton\");\n 
fb.className = \"Feedback\";\n fb.classList.add(\"correct\");\n } else {\n ths.className = \"Input-text\";\n ths.classList.add(\"incorrectButton\");\n fb.className = \"Feedback\";\n fb.classList.add(\"incorrect\");\n }\n\n // What follows is for the saved responses stuff\n var outerContainer = fb.parentElement.parentElement;\n var responsesContainer = document.getElementById(\"responses\" + outerContainer.id);\n if (responsesContainer) {\n console.log(submission);\n var qnum = document.getElementById(\"quizWrap\"+id).dataset.qnum;\n //console.log(\"Question \" + qnum);\n //console.log(id, \", got numcorrect=\",fb.dataset.numcorrect);\n var responses=JSON.parse(responsesContainer.dataset.responses);\n console.log(responses);\n if (submission == ths.value){\n responses[qnum]= submission;\n } else {\n responses[qnum]= ths.value + \"(\" + submission +\")\";\n }\n responsesContainer.setAttribute('data-responses', JSON.stringify(responses));\n printResponses(responsesContainer);\n }\n // End code to preserve responses\n\n if (typeof MathJax != 'undefined') {\n var version = MathJax.version;\n console.log('MathJax version', version);\n if (version[0] == \"2\") {\n MathJax.Hub.Queue([\"Typeset\", MathJax.Hub]);\n } else if (version[0] == \"3\") {\n MathJax.typeset([fb]);\n }\n } else {\n console.log('MathJax not detected');\n }\n return false;\n }\n\n}\n\nfunction isValid(el, charC) {\n //console.log(\"Input char: \", charC);\n if (charC == 46) {\n if (el.value.indexOf('.') === -1) {\n return true;\n } else if (el.value.indexOf('/') != -1) {\n var parts = el.value.split('/');\n if (parts[1].indexOf('.') === -1) {\n return true;\n }\n }\n else {\n return false;\n }\n } else if (charC == 47) {\n if (el.value.indexOf('/') === -1) {\n if ((el.value != \"\") && (el.value != \".\")) {\n return true;\n } else {\n return false;\n }\n } else {\n return false;\n }\n } else if (charC == 45) {\n var edex = el.value.indexOf('e');\n if (edex == -1) {\n edex = el.value.indexOf('E');\n 
}\n\n if (el.value == \"\") {\n return true;\n } else if (edex == (el.value.length - 1)) { // If just after e or E\n return true;\n } else {\n return false;\n }\n } else if (charC == 101) { // \"e\"\n if ((el.value.indexOf('e') === -1) && (el.value.indexOf('E') === -1) && (el.value.indexOf('/') == -1)) {\n // Prev symbol must be digit or decimal point:\n if (el.value.slice(-1).search(/\\d/) >= 0) {\n return true;\n } else if (el.value.slice(-1).search(/\\./) >= 0) {\n return true;\n } else {\n return false;\n }\n } else {\n return false;\n }\n } else {\n if (charC > 31 && (charC < 48 || charC > 57))\n return false;\n }\n return true;\n}\n\nfunction numeric_keypress(evnt) {\n var charC = (evnt.which) ? evnt.which : evnt.keyCode;\n\n if (charC == 13) {\n check_numeric(this, evnt);\n } else {\n return isValid(this, charC);\n }\n}\n\n\n\n\n\nfunction make_numeric(qa, outerqDiv, qDiv, aDiv, id) {\n\n\n\n //console.log(answer);\n\n\n outerqDiv.className = \"NumericQn\";\n aDiv.style.display = 'block';\n\n var lab = document.createElement(\"label\");\n lab.className = \"InpLabel\";\n lab.textContent = \"Type numeric answer here:\";\n aDiv.append(lab);\n\n var inp = document.createElement(\"input\");\n inp.type = \"text\";\n //inp.id=\"input-\"+id;\n inp.id = id + \"-0\";\n inp.className = \"Input-text\";\n inp.setAttribute('data-answers', JSON.stringify(qa.answers));\n if (\"precision\" in qa) {\n inp.setAttribute('data-precision', qa.precision);\n }\n aDiv.append(inp);\n //console.log(inp);\n\n //inp.addEventListener(\"keypress\", check_numeric);\n //inp.addEventListener(\"keypress\", numeric_keypress);\n /*\n inp.addEventListener(\"keypress\", function(event) {\n return numeric_keypress(this, event);\n }\n );\n */\n //inp.onkeypress=\"return numeric_keypress(this, event)\";\n inp.onkeypress = numeric_keypress;\n inp.onpaste = event => false;\n\n inp.addEventListener(\"focus\", function (event) {\n this.value = \"\";\n return false;\n }\n );\n\n\n}\nfunction 
jaxify(string) {\n var mystring = string;\n\n var count = 0;\n var loc = mystring.search(/([^\\\\]|^)(\\$)/);\n\n var count2 = 0;\n var loc2 = mystring.search(/([^\\\\]|^)(\\$\\$)/);\n\n //console.log(loc);\n\n while ((loc >= 0) || (loc2 >= 0)) {\n\n /* Have to replace all the double $$ first with current implementation */\n if (loc2 >= 0) {\n if (count2 % 2 == 0) {\n mystring = mystring.replace(/([^\\\\]|^)(\\$\\$)/, \"$1\\\\[\");\n } else {\n mystring = mystring.replace(/([^\\\\]|^)(\\$\\$)/, \"$1\\\\]\");\n }\n count2++;\n } else {\n if (count % 2 == 0) {\n mystring = mystring.replace(/([^\\\\]|^)(\\$)/, \"$1\\\\(\");\n } else {\n mystring = mystring.replace(/([^\\\\]|^)(\\$)/, \"$1\\\\)\");\n }\n count++;\n }\n loc = mystring.search(/([^\\\\]|^)(\\$)/);\n loc2 = mystring.search(/([^\\\\]|^)(\\$\\$)/);\n //console.log(mystring,\", loc:\",loc,\", loc2:\",loc2);\n }\n\n //console.log(mystring);\n return mystring;\n}\n\n\nfunction show_questions(json, mydiv) {\n console.log('show_questions');\n //var mydiv=document.getElementById(myid);\n var shuffle_questions = mydiv.dataset.shufflequestions;\n var num_questions = mydiv.dataset.numquestions;\n var shuffle_answers = mydiv.dataset.shuffleanswers;\n\n if (num_questions > json.length) {\n num_questions = json.length;\n }\n\n var questions;\n if ((num_questions < json.length) || (shuffle_questions == \"True\")) {\n //console.log(num_questions+\",\"+json.length);\n questions = getRandomSubarray(json, num_questions);\n } else {\n questions = json;\n }\n\n //console.log(\"SQ: \"+shuffle_questions+\", NQ: \" + num_questions + \", SA: \", shuffle_answers);\n\n // Iterate over questions\n questions.forEach((qa, index, array) => {\n //console.log(qa.question); \n\n var id = makeid(8);\n //console.log(id);\n\n\n // Create Div to contain question and answers\n var iDiv = document.createElement('div');\n //iDiv.id = 'quizWrap' + id + index;\n iDiv.id = 'quizWrap' + id;\n iDiv.className = 'Quiz';\n iDiv.setAttribute('data-qnum', 
index);\n mydiv.appendChild(iDiv);\n // iDiv.innerHTML=qa.question;\n\n var outerqDiv = document.createElement('div');\n outerqDiv.id = \"OuterquizQn\" + id + index;\n\n iDiv.append(outerqDiv);\n\n // Create div to contain question part\n var qDiv = document.createElement('div');\n qDiv.id = \"quizQn\" + id + index;\n //qDiv.textContent=qa.question;\n qDiv.innerHTML = jaxify(qa.question);\n\n outerqDiv.append(qDiv);\n\n // Create div for code inside question\n var codeDiv;\n if (\"code\" in qa) {\n codeDiv = document.createElement('div');\n codeDiv.id = \"code\" + id + index;\n codeDiv.className = \"QuizCode\";\n var codePre = document.createElement('pre');\n codeDiv.append(codePre);\n var codeCode = document.createElement('code');\n codePre.append(codeCode);\n codeCode.innerHTML = qa.code;\n outerqDiv.append(codeDiv);\n //console.log(codeDiv);\n }\n\n\n // Create div to contain answer part\n var aDiv = document.createElement('div');\n aDiv.id = \"quizAns\" + id + index;\n aDiv.className = 'Answer';\n iDiv.append(aDiv);\n\n //console.log(qa.type);\n\n var num_correct;\n if (qa.type == \"multiple_choice\") {\n num_correct = make_mc(qa, shuffle_answers, outerqDiv, qDiv, aDiv, id);\n } else if (qa.type == \"many_choice\") {\n num_correct = make_mc(qa, shuffle_answers, outerqDiv, qDiv, aDiv, id);\n } else if (qa.type == \"numeric\") {\n //console.log(\"numeric\");\n make_numeric(qa, outerqDiv, qDiv, aDiv, id);\n }\n\n\n //Make div for feedback\n var fb = document.createElement(\"div\");\n fb.id = \"fb\" + id;\n //fb.style=\"font-size: 20px;text-align:center;\";\n fb.className = \"Feedback\";\n fb.setAttribute(\"data-answeredcorrect\", 0);\n fb.setAttribute(\"data-numcorrect\", num_correct);\n iDiv.append(fb);\n\n\n });\n var preserveResponses = mydiv.dataset.preserveresponses;\n console.log(preserveResponses);\n console.log(preserveResponses == \"true\");\n if (preserveResponses == \"true\") {\n console.log(preserveResponses);\n // Create Div to contain record of 
answers\n var iDiv = document.createElement('div');\n iDiv.id = 'responses' + mydiv.id;\n iDiv.className = 'JCResponses';\n // Create a place to store responses as an empty array\n iDiv.setAttribute('data-responses', '[]');\n\n // Dummy Text\n iDiv.innerHTML=\"Select your answers and then follow the directions that will appear here.\"\n //iDiv.className = 'Quiz';\n mydiv.appendChild(iDiv);\n }\n//console.log(\"At end of show_questions\");\n if (typeof MathJax != 'undefined') {\n console.log(\"MathJax version\", MathJax.version);\n var version = MathJax.version;\n setTimeout(function(){\n var version = MathJax.version;\n console.log('After sleep, MathJax version', version);\n if (version[0] == \"2\") {\n MathJax.Hub.Queue([\"Typeset\", MathJax.Hub]);\n } else if (version[0] == \"3\") {\n MathJax.typeset([mydiv]);\n }\n }, 500);\nif (typeof version == 'undefined') {\n } else\n {\n if (version[0] == \"2\") {\n MathJax.Hub.Queue([\"Typeset\", MathJax.Hub]);\n } else if (version[0] == \"3\") {\n MathJax.typeset([mydiv]);\n } else {\n console.log(\"MathJax not found\");\n }\n }\n }\n return false;\n}\n\n {\n show_questions(questionsMpQrXvjLDoZX, MpQrXvjLDoZX);\n }\n ",
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "!pip install jupyterquiz==2.0.7 --quiet\n",
+ "from jupyterquiz import display_quiz\n",
+ "\n",
+ "display_quiz('quiz_files/quiz1.json')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "170813e7-81c7-4224-9539-bcc8c36f9961",
+ "metadata": {},
+ "source": [
+ "# Prerequisites\n",
+ "\n",
+ "***Data Sources***\n",
+ "\n",
+ "- [COVID cases data (California Health and Human Services Agency)](https://data.chhs.ca.gov/dataset/covid-19-time-series-metrics-by-county-and-state/resource/046cdd2b-31e5-4d34-9ed3-b48cdbc4be7a)\n",
+ "- [COVID vaccination data (Los Angeles Times)](https://github.com/datadesk/california-coronavirus-data)\n",
+ "- [Unemployment data (California Employment Development Dept.)](https://data.edd.ca.gov/Labor-Force-and-Unemployment-Rates/Local-Area-Unemployment-StatisticsdecisionLAUS-/e6gw-gvii)\n",
+ "- [Election data (Harvard University)](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ)\n",
+ "\n",
+ "***Libraries/Packages***\n",
+ "\n",
+ "- Pandas\n",
+ "- NumPy\n",
+ "- Matplotlib\n",
+ "- Seaborn\n",
+ "- Scikit-learn"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ad938220-3e5a-455f-a725-a8329672a0d8",
+ "metadata": {},
+ "source": [
+ "***\\* If the libraries/packages have not been installed:***\n",
+ "\n",
+ "Use `!pip install PACKAGE_NAME` or `!conda install -c anaconda -y PACKAGE_NAME`,\n",
+ "replacing `PACKAGE_NAME` with the name of the library or package you are trying to install."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "8e171f43-88c8-432e-8e57-f9f918f1beea",
+ "metadata": {},
+ "source": [
+ "# Get Started\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "70bbc537-63d0-44f9-8705-0f0fb3dc494d",
+ "metadata": {},
+ "source": [
+ "### Step 1) Importing necessary packages into the notebook\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "7e8257e4-ef07-4b9b-bd9b-d4afd3f91548",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDBoYFhwaGRoeHRsfIjAmIyIiIygnMCcyLjIyMy0tLy81PVBCNzhLOS0tRWFFS1NWW1xbNUFlbWRYbFBZW1cBERISGRYYLRoaMFc3NTZXV11XV1dXV1dXV1dXV1djV1dXV1dXV1dXV1dXWF9bV1dXV1dXV1dXV1dXV1dXY1dXV//AABEIAWgB4AMBIgACEQEDEQH/xAAbAAEAAgMBAQAAAAAAAAAAAAAAAQIDBgcEBf/EAEoQAAEDAQQDCQoOAgICAwEAAAABAhEDBBIhMUFRYQUHEyJTcZGS0RQWMmOBk6Gx4fAGFRcjMzVSVGJyc7LB0kJDgvEkoqPC8jT/xAAYAQEBAQEBAAAAAAAAAAAAAAAAAQIEA//EACERAQEBAQACAgIDAQAAAAAAAAARAQISIQMxQVEiMtEE/9oADAMBAAIRAxEAPwDn4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAJjanpEbU9IEAmNqekRtT0gQCY2p6RG1PSBAJjanpEbU9IEAmNqekRtT0gQD6u4/wfr21HrRuQyEW86M5jRsPpd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7d4rr+wd4du8V1/YBrANn7w7d4rr+wd4du8V1/YBrANn7wrd4rr+wd4Vu8V1/YBrANn7wrd4rr+wd4Vu8V1/YBrANn7w7d4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANob8Aras40Uj8a9hb5P7b9qj117ANVBtD/AIBW1sStHH8a6PIfNt+4LrOiLVr2dFX/AAR7ld0I3Anllg+SD6G5W5D7W5WU6lJHaGvddV3NhifX7w7bKJNHH8a9g3cwawDZ0+Ads+1R669h590/gfarLQdXqrSuMibrlVcVRE0bRcHwATd2p6RG1PSUQCY2p6RG1PSBAJjanpEbU9IEAmNqekRtT0gQAAAAAAAAAAAAA37e08C0/mZ6nG5JM/5ac4jZtNS3rk+btX5mepxu1nermy5isXUsGdyjyotTU3pUlFfpRMs5PY1ZVeLEGKjUc5zkVqIibF1rhikaJw1mxgVXzgiRziX6mp5TNVrKlRjUYqo6ZciYNjXzlXV1RfAXPQEY5fjgmzEs29pjyFlrr9h3QVS0umODdzx0AGXoS9E6YyKtWpCyjdn8k91LyT+hCXWpUzpuzhMMyT2qk1NTelSZfhg3b7CXWtU/1vnVCFltCzhTd0IVGNOE/CS2/pgycOv2HRzDhlzuL/OgDHL9TS0uurle0aizqrtDJQhLQ5f9a7NoGWnN1Yi8SxakYokz6P8Asw90PT/BffyE90P+wpJ7VmR1SMkRechVqYwjVxwxMfdDo8BxHdLp8BwGaamODdntK/O/hMSWh/JqT3Q7DiO2gZZqfhLvV2F2M8Z1dp51ru0MXykrXdE3XfyBavMrGZipzdS9F6MYJ7od9h0jh3cm7oKiQXp1JnCI2Zl5AwiCzaiq5UVsJoWCEqrfu3cNYEKSOFXhLt3CJvC21nU2XmtvLCrEKswkomGsCqNhZDmykKebdi3VKNNjmI1FdmrkwTCY8p6rHXc+ztqOZDlbMe+sT1WM7zevD8vE6s5LRcTSiKRVtrmKt7S
uEbMy9dItk6LiJ6/YeHdF6cKjtDWqvOqrCeo4vk3c3Zq7rHuzuutKlVrNiWMutn7S5elTV/gzuAlsa6tWl7lVc/SvSZPhHVVaFxHXYeiqir4a49qL5DbPg1YlZY6D21HQrEW5CRLsZyn0nv8A83P8fLfyZ16aB8JdwVsbmuaiok/9Kb38GrW+tYab3uvvaicbWmpdqZLzGL4S7nrWslSo+q5ODavFwuqqZLlJ8He9qvbw6SisWFicUVNm2V6FN/PznjWs1udJ2KbX4ev+D4/w3rTufWTWjfRUb2n02fSNVMpn+D4fw2cncD0iFlJ8rmqcXHXW7m7+/wDCuagA71AAAAAAAAAAAAAAAAAAAAAHRN65Pm7V+Znqcbqtop/bbnGaGlb16fNWpPxM9Tja+4Ho1Gte1LrbqSycOnM8++us/rla5zN+3srVWsarnLCduCFVtDdaxCLPPkUe1GU2suLUREiM8ETT0EOVtxruCnRdupKJzeQ3jK7bTTXJyLp7PWhTuunhisLMLGGCx/ApI24q8EqaFbGenIU3NesLSVMM3NSOnnKJW10pu3sebPRhrJS007quRZRM8NeRjvtauFF0zCKjE0YFnQ1Gu4Ncc0RMudEzAhbbTicYmEWM/eR3dSiZWFSUWM8+wjhm4rwD8vspMGR6o1EclOZTGESUAhlpY5yNSZVJy0BbVTRWoqxeRFTDOZj1E0XIsxTcyMcrs9BjZUa5UatF6T9pJT/oCyWykqxe9GXoLMtFNzkajuMsqic2ZV7qaOVFppzwnvpKpXpp/rjmanMBfuqlE3jJTe1zbyLhzGBtWkiSjM5whNMT77DLQqNXitbdRNiIBkbCpKE3SUSCQK3RdLACt0XSwArdF0sAK3RdLEARCC6fO3O3FZZkejHuVHNamKNTwZ1Ik5nr7l8ZU6wGa6Lpjo0Lizfc7D/JZMdqsjqj2uSq5iN0IiY4osr0IB6LpMGCrZbzlW+9qr9lYLpS4itvOWZxVcUkC6pJ5m7o0VqcElRL8xG3VOUjc2w9z01ZfV8umV0YImCeSedVU+LS3BqpWbKpca+9enFUmYiM8DWZm/bx+XvrmeOVk3VtEWtW6mNd03k/g+Tbay1KsN1J06PWef4aWhtO38Z7mzRb4Lo0vn+Og174wooquV1R2UQ/HTOnanQcfXxbvW7+2uud19ndKsymiuVzVliJdWMbyL6ZPofBDdnhaK2WpLVpQrXJpauUpsNLqWmyuWXNqq6c5Rde3mPbuTupZ6FZKjUqIqa3JDkxzOj4ePDJq5zGyfDDdTgrOtCnedwixeXLb6j5fwW3WpMoLZ2tu1nPlzljFNnMfH3Y3cqWysi1Fa1jfBa3JNu1T5VZUvShv5s8+dwzHX6Vvp3lZpa3BdZr/wANKzXWSpC4y31tNa3J3XhLtV+UXVX1GTdjdGnVoOa16KuGHlTsOPc68s5n1p7sa6ADrbAAAAAAAAAAAAAAAAAAAAAHQ9676K1fmZ6nG8oqLiiymxTRt69JpWpPxM9TjeG00aiI3BAD3xODlhfeClKveWLr0wzVIQ1/4X7rV7JTpLRfdVz3IstauCJtQ1fvwt/LJ5tnYWDpT6kIq3XLC9PMKNS/PFckfawOa9+Fv5ZPNs7B332/lk82zsESunQIOY999v5ZPNs7B332/lk82zsEK6dAg5j332/lk82zsHffb+WTzbOwkK6dAg5j332/lk82zsHffb+WTzbOwQrp0CDmPffb+WTzbOwd+Fv5ZPNs7Cwrp0CDmPffb+WTzbOwd99v5ZPNs7BCunQIOY999v5ZPNs7B332/lk82zsEK6dAg5j332/lk82zsHffb+WTzbOwQrp0CDmPffb+WTzbOwd99v5ZPNs7BFdOgQcx777fyyebZ2Dvvt/LJ5tnYIOnQIOY999v5ZPNs7B332/lk82zsESuk1aio1VTCNLpjPEwttPFVznshFSVRVw1nPF+F1u01m+bZ2H1Pg9urbbY+oxbQjEYy9hRprO
OWgkG6tqS281ZRUwXEyaYhcs9HMfGdYrWjUVbauMYdzU9PMS6yWxHI3u52OnuenAV9SvVuNmHOVcEamaquj2kPrRgrXKuzE+PRpW11RzFtjmxOK0GQsKifyTwNr4PhO7luzhNnppO3mA+zTqo5VSHIqY4llWIwVZ1aOc+ZRqVqDEdXrOr33ta27Ta27M44ZofSXOJXLZo/wCwOY75X1gz9Fv7nGpG174//wDe3T8y39zjVAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAOh716TStULC3m4+RxvDEVERHLK44mjb2LrtG1rqcz1ON5p1L2MKkYY+TtA1X4d2WpVpUEpU31FR7puNV0YaYNN+KbV92r+af2HWala5HFc6VXwUmCr7ZGVN6yiLlrLUjlPxTavu1fzT+wfFNq+7V/NP7DrHdXGu3X5xMYadPkIZa5VEuPRV/DgnlFI5R8U2r7tX80/sHxTavu1fzT+w6t3ZlxHwqavWO6/wP6BSOU/FNq+7V/NP7B8U2r7tX80/sOsOtMZsdEIuCTnrIda0SOI/GNGtJFI5R8U2r7tX80/sHxTavu1fzT+w6s+2XVxY/oLOtSIqpddgmr31ikcn+KbV92r+af2D4ptX3av5p/YdW7s/A/oDrXGdN8RoTaqfwKRyn4ptX3av5p/YPim1fdq/mn9h1d1qhVS4/OJghLaiqnFek5SnlFI5T8U2r7tX80/sHxTavu1fzT+w6l8ZNmODq9Tm7R8YtlESnVWYyblKTjqFI5b8U2r7tX80/sHxTavu1fzT+w6o62qj0ZwT8ViYwziStPdCXInBVExzVMNPv0ikct+KbV92r+af2D4ptX3av5p/YdQ+NWxPB1YWP8ADXEadpmo2y8ifNvSVjFMsvRj6BSOU/FNq+7V/NP7B8U2r7tX80/sOrd2J9ip1Q61wsXH6MYFI5T8U2r7tX80/sHxTavu1fzT+w6qttwng34zGGOEdvoLd1JE3H5xEYikco+KbV92r+af2F6VgttNZp0bSxVwljajV6UOqLac+I9Y2ZhLWkTcfnGQpHMOC3S1W7prDgt0tVu6ax01bZhNx/QelFlBSOU8Fulqt3TWHBbo6rd/8x1YkUan8CKdoiv3SlfNt3hb/wCKYveQ2ng01FwRXLN8j6wb+in7nmpm2b5H1g39FP3PNTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA6HvXfRWr8zPU43s0PexYjqNrauSq1PQ43mlSRmCacdH8AVfVup4KriuWgq61IiTcf1SzqqtTwVdiuWgq61Qk3Hrsu4gQtqhfo3xE5EraoWLj8tCELalRfo3xE5EralRY4N+WhAC2nKGPx2bYxIW1x/hUjm9pK2ldFN/RtghbXGPB1MpyAllplUS49PIVS2p9ip1SzLTKolx6eQq22zjwdRP8AiAda1wim5VifYT3Z4up0J2hbUuHzb8kyTLCcR3X4t/QBL7WiLCsfz3cCUtH4H4pq58PR6SHWlU/1uVNEY6JxC2pYReDfqyyy7QIW1+LfPN/JK2rCbj1zyScpT+CHWuG3lY5MYx9ZjTdFJi6vPKaizRk7r8XU6AtriOI/FEXBNZjTdJq/4u9Hae0bkHlfbUbMseiJsTtLOtUKqXH4bMzOSQeZLVP+t+U5bSzbTP8Ag9MFXFIyM4A8SbpM+y6dWBLd0WLocmCrjGjynsBq4PGu6LJiHLnlGjyhN0Wqi4OwSdGJ6wLg8fxmzDB3oPRZ66VG3kw0GUgmwSACAAAAAA5XvkfWDf0k/c41Q2vfI+sG/pJ+5xqgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHRN636O1fmZ6nG9Gi71v0dq/Mz1ON4eqokokqiLhr2AVVr9Ctjai9oiprZ0L2nnttpqsSmrKcq5eM2FcqbMME59mk81LdeormNdZ3orlamnTF5USMknSB9GKmtvQvaIqa29C9p5KlrqotRLqKqI9WpC43WtVq7cXKhhqbrVG52d
2KqiYuxiPw6dHMB9GKmtvQvaIqa29C9p4V3Rqqi3bO5IVJvXslciYIjcVRFXZhmpVm6771Nr6Dmq9YRVVdcZRnpjVpA+hFTW3oXtEVNbehe08C7qVEWO5n6ccYyRU0beySV3UqSv/AI1TBc8ccJ1e/oA90VNbehe0RU1t6F7T57t13pKrZ3ojW3lVVVEi6rlzbnhHlJrbqvasJQc5UpteqJKql5VwwTZ7NIHviprb0L2iKmtvQvafPdupVvJ/477sO0OwVIhFw045Sm0pT3ac511LO+Uuyk4tvIvhJGGXp0ZAfTiprb0L2iKmtnQvaeJ26FSKapQct9FVyY8WNsaYwwxlCKe6FZzHTQVr2tRcnKk4ThCKsSuCZwuWEh7oqa2dC9oiprb0L2nz27qVYSbPUVcMYVJw1YxOSY88ZlLPuw+o7Cg5WXkbhKwqyrlVY0YJGvSB9OKmtvQvaIqa29C9p4XW+q1XJwSu491ERrkhJVEWcnSkLhCIUs+6dZ1RjVoORFSHKrXJjxcUwy8LPpA+jFTWzoXtEVNbOhe0ygDFFTWzoXtEVNbOhe0ygDFFTWzoXtEVNbOhe0ygDFFTWzoXtEVNbOhe0ygDFFTWzoXtEVNbOhe0ygDFFTWzoXtEVNbOhe0ygDFFTWzoXtEVNbehe0ygDle+Qn/ntnPgW/ucaobbvlfWDf0W+txqQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHRN636O1fmZ6nG9KaLvW/R2r8zPU43pQMda0NZF5YmfQVS1U1hL2KrCYKZXNRc0RecwVLRSa29gqI67gk45RzgT3ZTzvIWS1U1/yyx0nn+MbP8AbbpxjVE+tCzLXZ3Owc1VhdGjNfUBm7pZE3kjsDrTTRcXImnGdJTuiiqTLVTDRry9QW10daL5F7AJS2U8eNlnsJW100RFV6QuRVlopKt1I6OaE9Q7po5SnQvYBbuqmv8AlguxffSO66c+F6FKd1UdbYzmOZOwJaKLl0KqToywxAu62U0RXXsOZfQEtdOYnHmXErw9KUbhK5JAbaqToxSVmEVOkCe7aWHHTGPTkWW1U0iXJjihiWvRlJidHFXsLLXpJnGzDbEdIFltdNM3RhOM6cg21MV11Fx5l0leGpYLxdSLGrQGWmkvgwsJOCaE0oBZLXTWONnlgoS2U4m96FKJaaKzllK4ZImsnuilCrgqZLh5QLOtdNsorojYvo1k90slUvJKTPkzKJaKTtXlbs5h3VR1p0L2AWS101njYJpCWymuTp5kUqlopTCKkrGEa8glelE4Rlg3m7UAu61U0zciDulmPGyx6f8AtDGtpo629Bda1OVTDDPD31egA2101WEdjzKT3VTw4yY5Z4wVbaKU4KkxOXSV7qozdlNOjpQC/dVOFW8kTHlXINtdNUm9htRSFtFJM1RPIV7qo6015AXda6aIiq7BdOPMZKdVr0lqymRjdVpIqot3iwmWU4p6irbXRRMHIiLjMQmoD0gx0qzXzdWYwUyAct3yvrBv6LfW41I23fK+sG/ot/c41IAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADom9b9HavzM9TjelNF3rfo7V+Znqcb1pAkpUcqJKJOJcwVKlS7LaeN6IVUy1gY2vW7PAxsw7CjXq3BtnhNER5SO660YWdU1S5DJTtNVVhaCtTHG8i6F/6Al1Vyf6ZTDJU98Bwjowo4zC4psy99BZtd6pjSVFwwva8+gqtoqaKK+V3sAqtV2iipKvWGxR0rqwjIltepeRFpKiZTJLq9SMKSzj/lz+/lAqtR2HzMpH882oMeqJhRiMETDLTo2GRar5WGYJljmUSvU5JekCFrO0UV2ZFkqOieCx99gSvU5L/29hKVn8muU56YyAq2qqtngoVFiF5uYjhn8gvShd1Z+inPlj+CvdFTkV6fYBLqrpjglXoKpUcuVGMFXHmy/gs6vUT/AFThPhadQdXqJHzSqqpOfSmQFeFdoo+/QWbVVVRFpKi
TmOGqT9Hhz7J/glK1SY4JdON5AM11NSEcG2ZupK6YMK16kIvBKutJDa1RXIi0oTXOW0DPcTUnQLqakPPw9XksOfEnhqnJa/8ALVMaNOAGe4mpOgm6mpDzJaai/wCpen2Fn1amMU8lwxzAzXE1J0C4mpOgwNrvhZpLKJgk54htapMLS14z0AZ7qakFxNSdB51r1eS/9vYFtFSPoV63sA9FxMMEwywFxNSdBhbWeqoi04Rc1nI9AFWMRqQiIibCwAHLd8r6wZ+i39zjUjbd8r6wZ+i39zjUgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAOib1v0dq/Mz1ON6XM0Xet+jtX5mepxvWkCSlVHKnFWF1xJWtTc6LrlaqFODqwnziT+VNYFeCrL/ALETyFm06ul6L5NmAbSqJE1Jx1JlqDaVRF+klNUIAbTqxi9FXDR0kJSq6aieROb398IWjV5X/wBULJSqyk1NP2UAOp1YSHomvDn9hV1KsuVRE8hbg6kfSafsplhh6yjaVaMaqT+VPXAF3U6sJD0mMcPSHU6mMPTPDBMCEpVNNT0JrMdWzVnKsVrqL+Hm9oGRtKrGNRJ0LHPoJSnVx46aI4uWvnMlFrkaiPdedpWInyGQDzqyrhx0zWcOiCG06uE1E0SkdJ6QB57lXHjoucJEcxRadaPpGzGUHrIAwOp1cYqJnKYaMcPV0EOp1c0emWrTpPSAPLwVblE6Cy06sIl9EWcVjP3xPQAIbMJOZIAAgkAAABBIAAAAAABy3fK+sGfot/c41I23fK+sGfot/c41IAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADom9b9HavzM9TjetJou9b9HavzM9TjetIFKjXKqXXQiZpGZibSqpPziLhqTMm1UqjlatOpciZwmco99pSjSqtRyPq3p8FbqJGfT7ALcHW5ROqQ2lVhZqJOEYZa+cs1lSFRXoupYjpIRlWfDbHN/IBtOrhNRNvF6SblXHjps4uWrnJVtTjKjkyW6midElLlbDjtyx4s69vMBKUqvKJMzloxwCUqsLNRJ2ImvsLPbUnivRE1KkkOZUhsPSUTHDMBwdXH5xNnF0z2B1OroeiYahdqXUS+29OKxhHMFbUw4zc1lY0aAHB1ZTjpGnDPX77S1x8Lx8ZwwyMasqx9I3q+0m5Vu+G1XTOWERl0gWRlSPDSZ1YImohtOrpekxq9JaHTi5IjRnOsqjKiNXjorlyVUwTHVzAQtKroqaNKJrX38hK06krFTTpRNfYSjXwsuTRGEc5RKdblGzzbNCc4EvpVVSEqIixise+wm5VhEvpMrjGaaA9lWOK9qLCYqk46SrKdbTUavM2P5AlaVWVipgq6kwTUWuVJXjpGMYdAVlRWxeRHa0T+CEbVh0ubMcVY07UAcHVvIt9I0pHv7+iODq8onRgTwdX7adHP7CXsqSsORE0JH8gUbSqys1EWdnN/Eksp1kzqIv/Hm9pe7U4vGTRew6YK3Kv2m56tGj+AIWlVmeE8kYGamionGWVlcTExlRHYvRW6o/kqjK2l7dGhUx0geoHl4OtyjctDdJdjak4uaqTkiAZwefg6l1eOl5csME8haiyoirfcipohI99IGYAAct3yvrBn6Lf3ONSNt3yvrBn6Lf3ONSAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA6JvW/R2n8zPU43pczRd636O1fmZ6nG9LmBjrUUfEqqRqVUKpZmxGOaLnqPHu1up3Kxrkajlc6OM661IRVxXyYHopWu/TpPhU4S6sLolJxAyNsrUmFdikeEoWzNVETGE2qU7qT7L8PwlnWhEVcHYamqKVLLK1FmXTEYuVQlmaiQk4rK44rhBRtqRckdqy5+wLa23UdxlRZyScljIUqzbI1Ecku4yQvGUMsrWrKK7yuVSO6UiYd0GWQMdSyNcqriirpRVT3yDrK1c1d4MeEvTz7TLIkDG+ytdne6VD7K1VVVVcY04YbPIZJEgYe4248Z2Om
Z1diFnWVqqiqrsIwvKmU9pkkSBjWytVETGEWYnXiHWVquvYzM4KqauxDJIkDFUsjXZq7yOVDM1sJBEiQLgpIkC4KSJAuCkiQLgxyTIFwUkSBcFJEgXBSRIHMN8r6wZ+i39zjUjbd8r6wZ+i39zjUgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAOib130dq/Mz1ON6NF3rfo7V+Znqcb0BSo1FwVEVNSpJitLobN+5tidCmdUIhAPEyrKfTzzN2kV66Nes2hGYxdupsg9qMRMkQKxFzhS77XXjY+61VWvM4NVWpguflwLuVVcjEqw5ExS7iuS+o9NxNhN1JnSEQCYEEEAmEEIBAJhBCAQCYEAQCYQQgEAmEEIBAJhBCAQCYQQgEAmEEIBAJgQBAJgQBAJgQBAJhBCAcw3yvrBn6Lf3ONSNt3yvrBn6Lf3ONSAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA6JvW/R2n8zPU43rSaLvW/R2n8zPU43rSBR8ymrSYnOVEbLkTHHTOz2mZxSoixhHlAwNeqzFViokasOcNrJpqsiMNGrHamPpLIx3GljMctvOY+CfydKf+tnP6BDcX4Rc+EZd5v5nUQlVZT5xios886C7mOX/FmfuvOVZTcjV4jEXCI/kTCYq6oqZ1mJEpjCGZzXqiQ5EXXEpsMV16pxmU/WZFWpewRt3DXO0bhBG1NLk8iGVMsczBeq6mdK7PaJq6mL5V99Qgzg87XVpRFazaqKpa9V+yzpUDMAAAAAAAAAAAAAAAAAAAAAAAAAAOY75X1gz9Fv7nGpG275P1gz9Fv7nGpAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAdE3rfo7V+Znqcb0aLvW/R2r8zPU43pQKudCwRwiYYpjltDkx5jG9sI2GosLhOjPLaBkSq3WnSEqJhimOKHldTXjLwDVVMuMmPowI46pC0Ey+0i6sPX0D2e3q4RM5TpJSomw8asdh8w2dCSm3T0GR0tdxaSLtlE99Iy6e9Z0qt1p0kpURclTpPKlKEVeBbMYJKYrqI4yz8ykLnKpjn/PrB7evhG606RwiZyhgRqpdTgkxxdCpgv8AJRtOXY0Wo2YmUnTj6ukJ7erhW606QlRFyhSnAtmbqShLKbW5IiAXkSQAqbwkgATIkgATIkgATIkgATIvEACZEkACZEkACZEkACZEkADmO+V9YM/Rb+5xqRtu+V9YM/Rb+5xqQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHRN636O1fmZ6nG9KaJvW/R2r8zPU43p7kRJVURESVVdABUMdWijkhctiwUr21lNGKqyj8lbihSnunQcqIlVsrGGnjZSmjMCzbG1EjFedVUh9ia5XKqu4yznEYaOgLugxL83kuI5VlPsoir6FkvUttJqqjqjUVImdEx2p0oXdq1VLGzUvSume0z3Tys3VoKqpfRIVElclnKNhZ+6NJGtdKqxyql5MkuoqrM5QjVXyER6LounndulRRYWommdkRn0hu6dBf8AY1F1AeiBdPNU3Totc5rnxdiedZwTWvFUv3fRuq7hERGxenCJynUBmui6edu6NJXNa16OVyxhohFdj5EIXdShdV3CIqbNMzHTCgem6IPMzdOgv+xMpTakTPQpk7upcXjpxstuMevADLdF086bpUVmKjVW6roRcYTNfQvQQm6dLiqroR16FVI8BUa70r6FA9N0XTzLupQR13hEyVfImfrTpMj7bSb4T0TCfIoGW6Lp53bp0EzqsTGMVgy0LSyorkYs3YRcFTNJ0gXui6XAFLoulwBS6LpcAUui6XAFLoulwBS6LpcAct3yvrBn6Lf3ONSNt3yvrBn6Lf3ONSAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA6HvW/R2r8zPU43tTQ969YpWpYnFmHkcb01Z0QBWrZ2Pi81Fu5SmXvCdBhTc2ijmqlNEu4pqlIhY8iGWtQR8SqoqZQse+RRbG2IVz1xnF3qAl
LDTx4uDr0poW94WG0q7c6iudNq8+JKWNIi+/OZvZ4R/AWxpM3npzOzwgCG7nUUypNTyashS3PpNppTu3moqrxsfClF9DlTylu5EhEvvwSPCzDrIiqvGck6EUCF3Poyq8G2VmcM5z6SlTcui5E4iJCzhgZHWVFREvvwSMHR0krZkvXrz85icM5Aq/c+i5yuWm1VVZmNOfrI+LaF1W8E2FxVIzie1eksllRFRbz8FnwvWV7iT7dTrAWbYaSORyU2oqZLGyPVgVdubQWJpNwiMNWRZtkRFm8/D8Q7kSFRXOWdazELIFV3OowqcG1JSME1kJubRutarEcjEhL2K689ck9xJPhv63lDrEl1ER70jbrjPXkBZtipJlTanFVuWhZVU9K9IdYaSoiLTaqJMYfaWV6VJo2dGLKOcvOsmcDy/F9Hkm6dGvMs6w0lzptyRMtCZIZwBgWxUlwWm2ObZHqMlKgxk3Wok57YMhAEggASAAAIJAAEASAQBIAA5bvlfWDP0W/ucakbbvlfWDP0W/ucakAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAB0Heyfdo2tUSVRWYeRxvFnqq9JVqtxyU4/uB8Ja+56VEotpu4RUVb6OXKYiFTWfW+Ua28lZ+q/wDuB0e1sYt1Xq5ImFSdPMYlZR8NXqqRdzlMf/1mc9XfFti50rMv/B/9yvyg2rkLL1H/ANwOiNZRR0JUWbqp4WSRiqEoylH0joR04uXQkdBzpN8G1JlQsqf8H/3HyhWvkLL1H/3A6IvAwnzi4JHhL6SrmUlbKVXJtny5eU558oFqmeAss/kf/cn5QrXyFl6j/wC4HRLtGVW+qXpwmPC2eUMSjj84qykYrOer30HO/lCtfIWXqP8A7kJvg2rkLL1H/wBwOivZSurx3IiKswu2VQU0pZpVcuGl2yP5Od/KFa+QsuvwH/3Cb4VrTKhZeo/+4HRGrR0Py25zpJppSTJ6wuGeGGPMc6+UK18hZeo/+4+UK1chZeo/+4HRHMpJM1H4LiiKq4qsaE14FHV7OqRwuSZouOO0598oVr5Cy4/gfz/bK9/1p+72THP5t39gOiNqUWw5Ky5xMynl6S1lZSeio2or+LC4zguEx5FOcJ8PLQk/+PZMVlfm35xH29RanvgWpuLaFlbzU3p/9gOldxtxxXHMllkaiKiTikLl2HN/lHtvJWfqv/uPlHtvJWfqv/uB0hLI3HjPxjGccCHWNqoiKrsNvP2nOPlHtvJWfqv/ALj5R7byVn6r/wC4HR+42zMuTLJYySCe5GxF5+SJnqWZ5zm/yj23krP1X/3Hyj23krP1X/3A6OljSIVzlSdfN2EtsiIs3nqsKmK6zm/yj23krP1X/wBx8o9t5Kz9V/8AcDpHcjYiXJzL5SO4m/af0nOPlHtvJWfqv/uPlHtvJWfqv/uB0dbG26jbz4SdOvAdxNu3ZdHPzZdBzj5R7byVn6r/AO4+Ue28lZ+q/wDuB0hbI2ESXJCRgse+ZelQRirCqs61k5p8o9t5Kz9V/wDcfKPbeSs/Vf8A3A6gQcw+Ue28lZ+q/wDuYvlBt/ieovaBffK+sGfot/c41I9+7O69W21Uq1rt5Go3ipCQiqv8ngAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD//2Q==",
+ "text/html": [
+ "\n",
+ " \n",
+ " "
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "execution_count": 7,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('jPIQbpdTkbM', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c7d3fb42-e3f7-4b3b-a14e-c100c4382491",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "Before working on our model, we need to import the packages and specific functions that we will use to work with our data.\n",
+ "\n",
+ "- **Packages** are collections of prewritten code that others have made available. A package is usually organized into files called modules; a package can contain many modules, and each module may contain several functions.\n",
+ "\n",
+ "- **Functions** are sets of instructions that tell the computer how to carry out a specific task, such as reading a particular type of file, calculating the equations behind our model, or displaying our graphs.\n",
+ "\n",
+ "The code in this notebook is organized in **cells**.\n",
+ "\n",
+ "In the example below we will learn how to execute, or \"run\", each of the three cells so that our code actually takes effect. To run the code in a cell, select the cell and press the \"play\" button in the menu bar at the top of the notebook.\n",
+ "\n",
+ "**Note**: The lines of green text that are preceded by a \"#\" are called comments. They exist only to explain what each line or chunk of code does; they are not executed as part of the code."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "5ddada1e-aae9-4b22-87ae-022e0b96a053",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Data Wrangling Imports\n",
+ "import pandas as pd\n",
+ "import numpy as np"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "a24d2226-35ca-451b-85d4-757c93eab78a",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Machine Learning Models Imports\n",
+ "from sklearn import tree\n",
+ "from sklearn.tree import DecisionTreeRegressor "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "6b29cad9-0cb2-40c8-959a-177b86019a3f",
+ "metadata": {
+ "scrolled": true,
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Model Evaluation Imports and Visualization\n",
+ "from matplotlib import pyplot as plt\n",
+ "!pip install graphviz\n",
+ "!conda install -c anaconda graphviz -y\n",
+ "import graphviz\n",
+ "# Quantitative metrics of Model performance\n",
+ "from sklearn.metrics import mean_squared_error"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ba92c612-e851-4c76-8551-4bf12daea579",
+ "metadata": {},
+ "source": [
+ "### Step 2) Loading training data and making sure it looks correct"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "4f7561be-fa34-479e-95d1-62ef4bcdfc29",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('z9dcLYg65uk', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "002bea86-95cb-4076-b05a-aa34bcbceb15",
+ "metadata": {},
+ "source": [
+    "Now that we have our tools, we can examine our dataset again. \n",
+ "\n",
+ "Recall that we are missing the last 18 values in the column \"cases per 100,000\", but we still have a big chunk of complete data (40 rows). This chunk of complete information is often referred to as **training data**.\n",
+ "\n",
+ "![Training-Data.jpg](images/Training-Data.jpg)\n",
+ "\n",
+ "**Training data** is a machine learning term that refers to the dataset used to teach our Decision Tree to make the predictions for our missing values using available data."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "874fc63a-d033-410b-a080-adce8dd7185c",
+ "metadata": {},
+ "source": [
+ "**A)** Let's start by loading our training data into the notebook:\n",
+ "\n",
+    "The data that you need for these lessons are stored in an AWS S3 bucket. Before you begin, these files need to be copied from the bucket to your notebook using the following command. For more information, see [NIH CloudLab's documentation](https://cloud.nih.gov/resources/cloudlab/aws-jumpstart/)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "3ea8f894-d160-40e1-b6a4-6f837a7ca963",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Download the data from AWS S3 with aws s3 cp\n",
+ "!aws s3 cp s3://nigms-sandbox/nosi-sfsu/data/ . --recursive"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "caab38bd-bbb1-478b-b17c-8bd15bd07802",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This opens the file that contains the training data, data used to train the algorithm \n",
+ "S2020_training = pd.read_csv(\"S2020_training.csv\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "cbaf2e30-2eaf-4a21-9932-99d7bf8c22ba",
+ "metadata": {},
+ "source": [
+    "**B)** Make sure that your dataset loaded correctly. It should contain the county names and all the data highlighted in green in our last picture:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "3eabc08f-2002-4adb-b380-08c26ef968a6",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This will display the entire dataset \n",
+ "S2020_training"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d2eb357a-859b-446d-bfed-514d0c20f819",
+ "metadata": {},
+ "source": [
+    "**C)** If your dataset is too big to be displayed in full, you can sneak a peek at just the first 5 rows."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "1d1b826c-2346-4452-80db-cc34a731856c",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This will display only the first 5 rows of our dataset\n",
+ "S2020_training.head()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c1fafa09-54f7-422b-9f77-07d816beb038",
+ "metadata": {},
+ "source": [
+ "**D)** Here we can see how many rows and columns the complete dataset actually has. In our example we should have (40 rows, 11 columns)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "af0570f1-4334-4be4-90fc-08aa4c425cb0",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+    "# This will display the number of rows (not including the column titles) and the number of columns of our dataset\n",
+ "S2020_training.shape"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "5e1e56f1-3454-4d23-8654-44d4a0d29f81",
+ "metadata": {},
+ "source": [
+ "### Step 3) Separate the training dataset into features and labels"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "b261c100-55d5-4d4d-b052-16d033395e30",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('qh8C0QRECWU', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "435f149e-a9fe-4b6b-be5d-3033fb6d84d0",
+ "metadata": {},
+ "source": [
+ "Recall that a Decision Tree is a **supervised** machine learning model, therefore we need to specify clearly what we are trying to predict.\n",
+ "\n",
+ "To do this we need to divide the training data into **labels** and **features**\n",
+ "\n",
+ "![Label-and-Features.jpg](images/Label-and-Features.jpg)\n",
+ "\n",
+ "- The RED outlined column is called a **LABEL**. This is a machine learning term that refers to the data that our model will learn to predict.\n",
+ "\n",
+ "- The BLUE outlined columns are called **FEATURES**, which is the term that refers to the columns we would like to use to predict our chosen LABEL. \n",
+ "\n",
+ "Because the **training data** is complete, we can clearly separate LABEL from FEATURES. Remember that the training data is only the red and blue shaded regions of our dataset. \n",
+ "\n",
+ "We can ignore the rest of the dataset for now.\n",
+ "\n",
+ "**A)** Separate the training data into features and labels:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "7feef3b4-b0b8-41d0-b4b1-c1294753bd3f",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+    "# The label is the column we want to predict: Summer 2020 cases per 100,000\n",
+    "S2020_training_labels = S2020_training[\"cases_per_100000\"]\n",
+    "\n",
+    "# The features are the remaining columns. We drop \"county\" because it does not contribute to our predictions, and \"cases_per_100000\" because that is our label\n",
+    "S2020_training_features = S2020_training.drop(columns=[\"county\",\"cases_per_100000\"])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "775e163b-ac56-40c8-a12c-89f13bfbce30",
+ "metadata": {},
+ "source": [
+ "**B)** Run the **LABEL** to check that the separation was correctly performed (you should see 40 rows and just 1 column):"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "49d72928-e153-4617-808d-9fb0140ef3dd",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This code allows you to see what the labels look like as a dataframe, after being separated from the training data\n",
+ "S2020_training_labels = pd.DataFrame(S2020_training_labels,columns = [\"cases_per_100000\"])\n",
+ "\n",
+ "# This code tells you how many rows and columns this dataset has\n",
+ "S2020_training_labels.shape"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "05b69eb8-d593-453b-9786-432dd425c612",
+ "metadata": {},
+ "source": [
+ "**C)** Run the **FEATURES** to check that the separation was correctly performed (you should see all 40 rows and 9 columns only since we dropped the columns of \"county\" and \"cases_per_100000\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "1a75f401-1966-4628-b4fc-ee285bc8a595",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+    "# This code shows how many rows and columns the features dataset has\n",
+ "S2020_training_features.shape"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "3c55a9ca-5d66-4bd2-aaa5-82ce6ee305a6",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "display_quiz('quiz_files/quiz2.json')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6a76b309-f312-425f-a4ac-e2ef7dacbfe0",
+ "metadata": {},
+ "source": [
+ "### Step 4) Create a Decision Tree object and train it"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d8b5abf2-94fe-4e2d-b24a-3b185b08c88e",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('M6gY_JywOys', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e82423a6-36cc-4898-aacb-727f98e9ffc1",
+ "metadata": {},
+ "source": [
+ "After separating our training data into features and labels, we can now create a Decision Tree. \n",
+ "\n",
+ "**A)** Create a Decision Tree object"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "1af78ddd-dc00-4298-9732-94684886a7b7",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+    "# This line creates the Decision Tree with your chosen specifications (what is written within the parentheses)\n",
+ "dtr_summer2020 = DecisionTreeRegressor(random_state = 1, max_depth= 3)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9a337b10-ecaf-4b2e-921f-5b830ac4f7a2",
+ "metadata": {},
+ "source": [
+ "**B)** Train our Decision Tree using the training data we separated in the previous step"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ac3b1e35-0609-4f64-bac6-c42537cb3694",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This line trains the decision tree using both the features and the label from our training data\n",
+ "dtr_summer2020 = dtr_summer2020.fit(S2020_training_features,S2020_training_labels)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "dbaec858-5a15-4727-b9fb-ac1d6bd08fd8",
+ "metadata": {},
+ "source": [
+ "### Step 5) Visualize our trained Decision Tree"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c8c03608-f9d7-43d0-abd3-757b6196023e",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('cFk6vmfU48w', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "96d80ff1-924c-4aef-ad9d-68634911deb9",
+ "metadata": {},
+ "source": [
+ "Visualize our Decision Tree by graphing it using the following code "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "68789480-9f14-43ab-b49a-145bf462b132",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Initialize tree data object \n",
+ "dtr_summer2020_dot = tree.export_graphviz(dtr_summer2020, out_file=None, \n",
+ " feature_names=S2020_training_features.columns, \n",
+ " filled=False, rounded=True, impurity=False)\n",
+ "\n",
+ "# Draw graph\n",
+ "dtr_graph = graphviz.Source(dtr_summer2020_dot, format=\"png\") \n",
+ "dtr_graph"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4b6af3a1-3ba0-4759-9883-6d03123548ae",
+ "metadata": {},
+ "source": [
+ "### Let's try to understand what our tree learned!\n",
+ "\n",
+    "- **NODES** contain the decision that must be made based on a particular criterion. You can see that nodes have 2 arrows pointing away from them. The arrow to the LEFT is taken when the criterion is satisfied, and the arrow to the RIGHT is taken when it is not.\n",
+    "\n",
+    "- **ROOT NODE** is the node at the top of the tree. It shows the feature that best splits the data, which our model determined to be the most important feature to consider when making predictions.\n",
+ "\n",
+ "- **LEAVES** contain the final outcome of the decision path. You can see that leaves do not have arrows pointing away from them."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9dd48904-36b7-4747-b9e9-8b6c985ff049",
+ "metadata": {},
+ "source": [
+ "### Step 6) Make predictions using Testing data with our trained Decision Tree"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "5d0d1906-bc34-4a59-b05f-99167175ac7c",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('LtD93dB5JzU', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c56def25-4c94-4de1-a0a8-02b5075988eb",
+ "metadata": {},
+ "source": [
+ "We are now ready to make predictions for the counties that had the missing labels.\n",
+ "\n",
+ "**Below is an image showing what constitutes the testing data in our example**\n",
+ "\n",
+ "![Testing-Data.jpg](images/Testing-Data.jpg)\n",
+ " \n",
+    "In machine learning we usually refer to the part of the dataset that only contains the FEATURE columns as **testing data**. \n",
+    "\n",
+    "The **testing data** is the dataset that is used to predict the missing values of the LABEL column, based on the rules learned during the training phase.\n",
+    "\n",
+    "Recall that our Decision Tree model has only been trained on the training data (40 counties) and has never seen any of the rows of the testing data (18 counties)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "20da60cd-48c3-4acb-be33-dc4565d3947a",
+ "metadata": {},
+ "source": [
+ "**A)** Let's load the testing data that correspond to the counties with the missing label and see what it looks like."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d5c75c05-745c-4c0c-bf31-c839e43c9ed5",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This opens the file that contains the testing data features, features = data used to make a prediction\n",
+ "S2020_testing_features = pd.read_csv(\"S2020_test_features.csv\")\n",
+ "\n",
+ "# This lets you see the loaded testing data \n",
+ "S2020_testing_features"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "046ef181-0158-4fb3-a726-42046fa4e713",
+ "metadata": {},
+ "source": [
+    "**B)** Let's drop the county names from the dataset and make our predictions!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c9062cad-cde4-4596-a7e5-08203fd77dd7",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This drops the \"county\" column from our test dataset\n",
+ "S2020_features_test_nocounty = S2020_testing_features.drop(columns=[\"county\"])\n",
+ "\n",
+ "# This uses the tree we created and makes the predictions\n",
+ "S2020_labels_pred = dtr_summer2020.predict(S2020_features_test_nocounty)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "33e5fb42-21b8-4983-ade9-83182ef070a4",
+ "metadata": {},
+ "source": [
+ "**C.1)** Let's look at what labels our model predicted and check how it relates to our Decision Tree:\n",
+ "\n",
+ "![COVID-Decision-Tree.PNG](images/COVID-Decision-Tree.PNG)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "2b43dd45-625b-4495-9cb8-2bd08e459d75",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This turns our predictions (which is currently an array) into a dataframe \n",
+ "S2020_labels_preds_df = pd.DataFrame(S2020_labels_pred, columns=[\"Predicted\"])\n",
+ "\n",
+ "# This line adds the county name back, so that you can see what was predicted for each county\n",
+ "S2020_labels_preds_df = pd.concat([S2020_testing_features[\"county\"].reset_index(drop=True),S2020_labels_preds_df.reset_index(drop=True)],axis=1)\n",
+ "\n",
+ "# This lets us see what was predicted\n",
+ "S2020_labels_preds_df.round(3)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9c5c9ddc-f972-424a-baaf-e5876ff0d847",
+ "metadata": {},
+ "source": [
+ "**C.2)** Why did the model predict 702.806 for San Francisco County?\n",
+ "\n",
+    "Run the cell below and look at the output. Then follow the tree as described in the video to see that this county has: \n",
+    "- Unemployment Rate <= 0.123\n",
+ "- Population > 28453.0\n",
+ "- Green_votes_percentage > 0.005\n",
+ "\n",
+ "Feel free to try another county and check for yourself that it follows these rules, by changing the county name in the code below:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ff7dd8fa-a9b3-4570-923a-882f198d53b3",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Loading the testing features for San Francisco County\n",
+ "S2020_testing_features[S2020_testing_features['county']=='San Francisco'] # change 'San Francisco' to any other county in the list above that you are interested in"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f295afb6-142f-4d8e-9ef3-5658916f3624",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "display_quiz('quiz_files/quiz3.json')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "aff22ea7-270b-47dd-ba7f-a2a62ba1cfe5",
+ "metadata": {},
+ "source": [
+ "### Step 7) Let's see how our Decision Tree model performed"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "9a11c706-b8ba-4b5c-bfde-eed545a00248",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('0VK4sLz2wrc', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "a9d8738a-f8b2-4679-9ed5-d7748aaf6bf9",
+ "metadata": {},
+ "source": [
+ "Now that we have predicted the missing labels for Summer 2020 cases, let's see how our model did by comparing it with the actual labels!\n",
+ "\n",
+    "**A)** Let's now reveal our ACTUAL labels by loading them into the notebook"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "3cd40255-1bd5-4d11-8f6f-1b9bd29718c0",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This opens the file that contains the testing data labels, label = what we want to predict\n",
+ "S2020_testing_labels = pd.read_csv(\"S2020_test_labels.csv\")\n",
+ "\n",
+    "# This drops the county column from our label data so that the dataframe has only one column of county names when it is joined with the predicted dataframe\n",
+ "S2020_testing_labels = S2020_testing_labels.drop(columns=[\"county\"])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4037a417-05e3-43e5-b63e-a3562da3c44b",
+ "metadata": {},
+ "source": [
+ "**B)** We can use a bar graph to help us visually inspect how our model performed"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e7cc9c31-b0cd-4bd6-a544-4c0b143d50c8",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This puts into a single dataframe our predictions with our original test labels \n",
+ "pred_vs_test_2020 = pd.concat([S2020_testing_labels.reset_index(drop=True),S2020_labels_preds_df.reset_index(drop=True)],axis=1)\n",
+ "\n",
+ "# Reorganize the order of columns\n",
+ "pred_vs_test_2020 = pred_vs_test_2020.loc[:,[\"county\", \"cases_per_100000\",\"Predicted\"]]\n",
+ "\n",
+ "# This plots the data in a barchart per county\n",
+ "pred_vs_test_plot = pred_vs_test_2020.plot.barh(color={\"Predicted\": \"orange\", \"cases_per_100000\": \"darkblue\"},x=\"county\",figsize=(15,15), yticks=np.arange(0,4000,500))\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0ae29213-1635-4a51-9593-20a59bf2a60e",
+ "metadata": {},
+ "source": [
+ "### Step 8) Let's try using our Summer 2020 tree model to predict 2021 data"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "fee74988-b620-4941-89f1-1d6ebb5ef0cc",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('2r3ZpwM6xDQ', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e6d838dc-e9fe-4721-b866-07068f5126b3",
+ "metadata": {},
+ "source": [
+ "**A)** Let's load the features information for the same 18 counties, but this time for Summer 2021."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "3712f5eb-75ad-4f96-9106-9b51a19c0831",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Importing Summer 2021 data to predict using \"Summer2020 Model\"\n",
+ "S2021_testing_features = pd.read_csv(\"S2021_test_features.csv\")\n",
+ "\n",
+ "# Make predictions for Summer 2021 Data\n",
+ "S2021_labels_pred = dtr_summer2020.predict(S2021_testing_features.drop(columns=[\"county\"]))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "1c8d062d-42b2-419c-a703-a3a4537fdfb0",
+ "metadata": {},
+ "source": [
+ "**B)** Let's now load the actual Summer 2021 data and see how our 2020 Decision Tree model performed this time."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "7212fb89-7ff6-488c-907e-0177ccacb6bf",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Importing labels of Summer 2021 data to check accuracy of \"Summer2020 Model\" predicting Summer2021 Data\n",
+ "S2021_testing_labels = pd.read_csv(\"S2021_test_labels.csv\")\n",
+ "\n",
+ "# This turns our predictions (which is currently an array) into a dataframe \n",
+ "S2021_labels_preds = pd.DataFrame(S2021_labels_pred, columns=[\"Predicted\"])\n",
+ "\n",
+ "# This puts into a single dataframe our predictions with our original test labels \n",
+ "pred_vs_test_2021 = pd.concat([S2021_testing_labels.reset_index(drop=True),S2021_labels_preds.reset_index(drop=True)],axis=1)\n",
+ "\n",
+ "# Visualize performance for Summer 2021 predictions\n",
+ "pred_vs_test_plot = pred_vs_test_2021.plot.barh(color={\"Predicted\": \"orange\", \"cases_per_100000\": \"teal\"},x=\"county\",figsize=(15,15), yticks=np.arange(0,4000,500))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "5cc98d6b-7118-4d3f-a144-a8b8e55d2cde",
+ "metadata": {},
+ "source": [
+ "**C)** Another way to look at the difference in performance between predictions made by the model for 2020 vs 2021 data is to observe their difference in errors.\n",
+ "\n",
+    "For 2020, the histogram of errors (blue) is concentrated near 0, ranging from about -500 to 500, whereas the errors for 2021 (orange) are spread much more widely, ranging from about -1000 to 2500."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "332a67fd-30ec-4acd-bba3-7ce9855a3786",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Create columns holding error between actual rate vs. predicted rate\n",
+ "pred_vs_test_2020['residual'] = pred_vs_test_2020['cases_per_100000'] - pred_vs_test_2020['Predicted']\n",
+ "pred_vs_test_2021['residual'] = pred_vs_test_2021['cases_per_100000'] - pred_vs_test_2021['Predicted']\n",
+ "\n",
+ "# Plot errors on histogram\n",
+ "plt.title('Cases per 100k Prediction Errors')\n",
+ "plt.hist(pred_vs_test_2020['residual'], alpha=0.5, label='2020 data')\n",
+ "plt.hist(pred_vs_test_2021['residual'], alpha=0.5, label='2021 data')\n",
+ "plt.legend(loc='upper right')\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "78323de1-ec64-43d3-a21d-cf93d97db995",
+ "metadata": {},
+ "source": [
+ "**D)** A more formal way to calculate the performance for the model is to calculate the Root Mean Square Error (RMSE). Feel free to browse the **(Optional) Quant. Comparison of 2020 DT Model Performance for (2020 vs 2021) Data** for more details about this particular metric."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "374a7f45-8bea-4b97-9ba5-12eec94e34a5",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This prints the RMSE value for the performance of the model using 2020 Data\n",
+ "print(f\"RMSE on 2020 test set: {mean_squared_error(pred_vs_test_2020['cases_per_100000'], pred_vs_test_2020['Predicted'], squared=False)}\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "2a4fe185-6b1c-47e0-ba16-4a181d47661c",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+    "# This prints the RMSE value for the performance of the model using 2021 Data\n",
+    "print(f\"RMSE on 2021 test set: {mean_squared_error(pred_vs_test_2021['cases_per_100000'], pred_vs_test_2021['Predicted'], squared=False)}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6ef11332-1228-4fe5-97f0-a4fb5ae924bf",
+ "metadata": {},
+ "source": [
+    "#### Please run the cell below to save CSV copies of the actual values and the predictions made by our 2020 model for both years (2020 and 2021)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e2ba12ba-dfb4-4b44-98c0-e2fdb0226a86",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+    "# Let's save our comparison dataframes as CSV files for a more quantitative analysis\n",
+ "# We will revisit these in the next notebook.\n",
+ "pred_vs_test_2020.to_csv('Model2020pred_vs_test_2020.csv', encoding='utf-8',index=False)\n",
+ "pred_vs_test_2021.to_csv('Model2020pred_vs_test_2021.csv', encoding='utf-8',index=False)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3232e1d1-6190-4dcc-8747-621fd78ccbfd",
+ "metadata": {},
+ "source": [
+ "# Conclusion\n",
+ "\n",
+    "Congratulations on completing the \"Introduction to Machine Learning: Decision Trees\" module! Throughout this notebook, you have gained a foundational understanding of Decision Trees, a fundamental machine learning technique. By working with real-world COVID-19 data from California, you have learned how to: \n",
+ "\n",
+ "1. **Understand Decision Trees:** \n",
+    "   Recognize how Decision Trees function as supervised machine learning models, making predictions based on learned decision rules.\n",
+ "2. **Prepare Data:** Load, inspect, and preprocess data to separate it into features and labels, crucial steps for training machine learning models. \n",
+ "3. **Train a Model:** \n",
+ " Create and train a Decision Tree model using the training dataset. \n",
+ "4. **Visualize and Interpret:**\n",
+ " Visualize the trained Decision Tree and understand the decision-making process at each node. \n",
+    "5. **Make Predictions:**\n",
+    "   Use the trained model to predict missing values in the dataset and evaluate its performance. \n",
+    "6. **Evaluate Performance:**\n",
+ " Compare the model's predictions to actual values using visualizations and quantitative metrics such as Root Mean Square Error (RMSE). \n",
+ " \n",
+ "This module has equipped you with the skills to apply Decision Trees to various datasets and understand their potential and limitations. Remember, Decision Trees are the foundation for more complex models like Boosted Trees and Random Forests. \n",
+ "\n",
+ "We hope you found this module informative and engaging. Keep experimenting with different datasets and machine learning techniques to further enhance your skills. Happy learning! "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "5f15de83-339b-4d21-94c2-9cd5a5be6a92",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "# Clean up \n",
+ "\n",
+    "To keep your workspace organized, remember to: \n",
+ "\n",
+ "1. Save your work.\n",
+ "2. Close any notebooks and active sessions to avoid extra charges.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b473022e-a62b-4f82-a7cf-565dd2691168",
+ "metadata": {},
+ "source": [
+ "## *Acknowledgments*\n",
+ "\n",
+ "This notebook was created by Lucy Moctezuma Tan, Florentine van Nouhuijs, Lorena Benitez-Rivera (SFSU master's students and CoDE lab members), and Pleuni Pennings (SFSU bio professor).\n",
+ "Special acknowledgment to Faye Orcales for pulling the COVID data tables from government websites.\n"
+ ]
+ }
+ ],
+ "metadata": {
+ "environment": {
+ "kernel": "python3",
+ "name": "common-cpu.m114",
+ "type": "gcloud",
+ "uri": "gcr.io/deeplearning-platform-release/base-cpu:m114"
+ },
+ "kernelspec": {
+ "display_name": "conda_python3",
+ "language": "python",
+ "name": "conda_python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.14"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/AWS/2- (Optional) Quant. Comparison of 2020 DT Model Performance for (2020 vs 2021) Data .ipynb b/AWS/2- (Optional) Quant. Comparison of 2020 DT Model Performance for (2020 vs 2021) Data .ipynb
new file mode 100644
index 0000000..9387afe
--- /dev/null
+++ b/AWS/2- (Optional) Quant. Comparison of 2020 DT Model Performance for (2020 vs 2021) Data .ipynb
@@ -0,0 +1,563 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "8807397b-a776-4baf-b739-178e87bfbfa4",
+ "metadata": {},
+ "source": [
+ "# **Quantitative Comparison of 2020 Decision Tree Model using (2020 vs 2021) Data Features**\n",
+ "\n",
+ "# Overview \n",
+    "Let's take a look at the bar graphs created in the previous notebook, which compare our Summer 2020 Decision Tree model's predictions against the actual data for 2020 and 2021.\n",
+    "At first glance we can see that the model generally made more accurate predictions for 2020 than for 2021: for 2020, more of the yellow bars (predictions) are similar to the blue bars (2020 data), whereas for 2021, most of the yellow bars (predictions) differ from the teal bars (2021 data). \n",
+    "\n",
+    "In this notebook we will explore these differences further, this time using metrics and other graphs to show why and how our Summer 2020 model differed in its performance.\n",
+ "\n",
+ "![Model-performance-comparison.jpg](images/Model-performance-comparison.jpg)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "796c3b6f-919e-4225-bcf6-b8a6648e4e62",
+ "metadata": {},
+ "source": [
+ "# Learning Objectives\n",
+ "\n",
+ "- Compare Model Performance\n",
+    "    - Evaluate the performance of a decision tree model using data from different years (2020 vs. 2021).\n",
+ " - Interpret bar graphs and other visualizations to understand model accuracy. \n",
+ "- Calculate and Interpret RMSE\n",
+ " - Calculate the Root Mean Square Error (RMSE) to quantify model performance. \n",
+ " - Compare RMSE values to determine which dataset the model performs better on. \n",
+ "- Analyze Data Differences \n",
+    "    - Examine summary statistics and distributions of training data from different years.\n",
+ " - Identify how changes in data features (e.g., vaccination rates, unemployment rates) impact model performance.\n",
+ "- Understand Correlations and Trends \n",
+ " - Analyze correlations between features and the target variable (cases per 100K) for different years. \n",
+ " - Use scatterplots and trend lines to visually inspect relationships between variables. \n",
+ "- Identify Causes of Model Performance Changes\n",
+ " - Understand the concept of data drift and how it affects model performance over time. \n",
+    "    - Recognize the importance of retraining models to adapt to changes in data distributions and relationships. \n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "dcfb00d9-4ee7-4a9c-8e0b-2b1c02420b5e",
+ "metadata": {},
+ "source": [
+ "# Prerequisites "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "364c2871-9d2b-4ee5-9b32-c1c0ab4d1bfa",
+ "metadata": {},
+ "source": [
+ "***Modules*** \n",
+ "- Learning Module ***Introduction to Machine Learning: Decision Trees***\n",
+ "\n",
+ "***Data Sources***\n",
+ "- Model2020pred_vs_test_2020.csv (module 1)\n",
+ "- Model2020pred_vs_test_2021.csv (module 1)\n",
+ "\n",
+ "***Libraries/Packages***\n",
+ "- Pandas\n",
+ "- NumPy\n",
+ "- Matplotlib\n",
+ "- Seaborn\n",
+ "- Scikit-learn"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "65a13b55-467d-4c48-93d2-2f9e71c5752d",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "# Get Started"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "dd0b4ee2-8014-433f-bf5e-4c6bb820000b",
+ "metadata": {},
+ "source": [
+ "### Step 1) Import libraries needed to examine the differences in performance of Summer 2020 model"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "id": "4dfbffc7-aa04-4703-aacf-c8f7e0da70dd",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "Matplotlib is building the font cache; this may take a moment.\n",
+ "/home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages/seaborn/_statistics.py:32: UserWarning: A NumPy version >=1.23.5 and <2.3.0 is required for this version of SciPy (detected version 1.22.4)\n",
+ " from scipy.stats import gaussian_kde\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Data Wrangling Imports\n",
+ "import pandas as pd\n",
+ "import numpy as np\n",
+ "\n",
+ "# Model Evaluation Imports and Visualization\n",
+ "from matplotlib import pyplot as plt\n",
+ "from matplotlib.lines import Line2D\n",
+ "import seaborn as sns\n",
+ "\n",
+ "# Quantitative metrics of Model performance\n",
+ "from sklearn.metrics import mean_squared_error"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3a60d7c5-ae36-4dc6-b3d9-abef5e2f44a3",
+ "metadata": {},
+ "source": [
+    "### Step 2) Let's load the dataframes showing the differences between actual data and predictions for both years\n",
+ "\n",
+ "**A)** Actual vs Predictions for 2020 data made by our model \n",
+ "\n",
+ "By loading our CSV files, we can see what our bar charts were actually plotting. Recall that the **BLUE** bars were the Actual cases_per_100K in 2020 and the **YELLOW** bars were the values predicted by our Summer 2020 Decision Tree model."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c01f90c0-13b3-4bda-ac1a-bfb246626037",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+    "# This opens the file that contains the actual values and the model's predictions for the 2020 test data\n",
+ "pred_vs_test_2020 = pd.read_csv(\"Model2020pred_vs_test_2020.csv\")\n",
+ "pred_vs_test_2020 = pred_vs_test_2020[[\"county\", \"cases_per_100000\",\"Predicted\"]]\n",
+ "pred_vs_test_2020"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "5a7732d9-13ad-47fc-bc8f-84b20c385ea3",
+ "metadata": {},
+ "source": [
+ "**B)** Actual vs Predictions for 2021 data made by our model \n",
+ "\n",
+    "Similarly, the **TEAL** bars were the Actual cases_per_100K in 2021 and the **YELLOW** bars were the values predicted by our Summer 2020 Decision Tree model."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "6b0b9e6a-e3bd-4d6a-a0eb-a44938ba052a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+    "# This opens the file that contains the actual values and the model's predictions for the 2021 test data\n",
+ "pred_vs_test_2021 = pd.read_csv(\"Model2020pred_vs_test_2021.csv\")\n",
+ "pred_vs_test_2021 = pred_vs_test_2021[[\"county\", \"cases_per_100000\",\"Predicted\"]]\n",
+ "pred_vs_test_2021"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2dcf3bc7-9336-4dd6-ac8f-0178b3fa55ff",
+ "metadata": {},
+ "source": [
+    "### Step 3) Calculating the Root Mean Square Error (RMSE)\n",
+    "\n",
+    "**Root Mean Square Error (RMSE):** a measurement of how far apart, on average, our Predicted values are from our Actual values. The lower the RMSE, the better the performance of the model. Because the RMSE is expressed in the units of our label (Cases_per_100K), it can be read directly as a typical prediction error.\n",
+ "\n",
+ "$$\n",
+ "RMSE = \\sqrt{\\frac{\\sum\\limits _{i=1} ^{N}(Predicted_{i} - Actual_{i})^{2}}{N}}\n",
+ "$$\n",
+ "N = Number of observations\n",
+ "\n",
+    "We will calculate this metric for both sets of predictions made by our Summer 2020 Decision Tree model:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "13dc4c89-87c4-4796-92d4-d7228d2e2364",
+ "metadata": {},
+ "outputs": [],
+ "source": [
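+    "# This prints the RMSE value for the performance of the model using 2020 data\n",
+    "# (np.sqrt of the MSE is used because the squared=False argument was deprecated in scikit-learn 1.4 and later removed)\n",
+    "print(f\"RMSE on 2020 test set: {np.sqrt(mean_squared_error(pred_vs_test_2020['cases_per_100000'], pred_vs_test_2020['Predicted']))}\")"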
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "03d21a5e-3242-4de3-899c-3871e51dcabf",
+ "metadata": {},
+ "outputs": [],
+ "source": [
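+    "# This prints the RMSE value for the performance of the model using 2021 data\n",
+    "# (np.sqrt of the MSE is used because the squared=False argument was deprecated in scikit-learn 1.4 and later removed)\n",
+    "print(f\"RMSE on 2021 test set: {np.sqrt(mean_squared_error(pred_vs_test_2021['cases_per_100000'], pred_vs_test_2021['Predicted']))}\")"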
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d95d26f4-b1de-4fd2-81ca-552907dfb9ac",
+ "metadata": {},
+ "source": [
+    "As the two RMSE values show, our Summer 2020 Decision Tree model performed better (lower RMSE) on the 2020 data than on the 2021 data."
+ ]
+ },
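+  {
+   "cell_type": "markdown",
+   "id": "rmse-by-hand-note",
+   "metadata": {},
+   "source": [
+    "The RMSE formula above can also be computed step by step, which may help to see what the scikit-learn call is doing. The cell below is a minimal sketch rather than part of the original lesson: the helper name `rmse_by_hand` is made up for illustration, and it assumes the `pred_vs_test_2020` and `pred_vs_test_2021` dataframes from Step 2 are still loaded. If everything is consistent, the two numbers should match the RMSE values printed above."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "rmse-by-hand-code",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "\n",
+    "# Hypothetical helper (not from the original lesson): the RMSE formula written out term by term\n",
+    "def rmse_by_hand(actual, predicted):\n",
+    "    errors = np.asarray(predicted, dtype=float) - np.asarray(actual, dtype=float)  # Predicted_i - Actual_i\n",
+    "    squared_errors = errors ** 2                                                   # square each error\n",
+    "    return np.sqrt(squared_errors.mean())                                          # average over N, then take the root\n",
+    "\n",
+    "# Assumes the dataframes loaded in Step 2 are still in memory\n",
+    "rmse_2020 = rmse_by_hand(pred_vs_test_2020['cases_per_100000'], pred_vs_test_2020['Predicted'])\n",
+    "rmse_2021 = rmse_by_hand(pred_vs_test_2021['cases_per_100000'], pred_vs_test_2021['Predicted'])\n",
+    "print(f\"RMSE by hand on 2020 test set: {rmse_2020}\")\n",
+    "print(f\"RMSE by hand on 2021 test set: {rmse_2021}\")"
+   ]
+  },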
+ {
+ "cell_type": "markdown",
+ "id": "74e7e755-cfe7-42be-87ee-aeeadc7f67e6",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+    "### Step 4) Why do you think the accuracy of the model decreased from year 2020 to year 2021?\n",
+    "\n",
+    "Machine learning models depend heavily on the data they were trained on. To understand why the performance decreased, we can look at the differences between the training data from 2020 and the training data from 2021.\n",
+    "\n",
+    "**A)** Let's load the training data for both years once again to take a look at their metrics"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "eea88323-ac6a-48ec-b629-11871ad913c2",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+    "# This opens the files that contain the training data for each year (the data used to train the algorithm)\n",
+ "S2020_training= pd.read_csv(\"S2020_training.csv\")\n",
+ "S2021_training= pd.read_csv(\"S2021_training.csv\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e3748946-da76-429e-8553-ac85e4feba50",
+ "metadata": {},
+ "source": [
+ "**B)** Taking a look at their summary statistics"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "67cbd88d-2d4b-4063-8002-19b3e9cad51c",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# View basic summary statistics of training set from 2020\n",
+ "S2020_training.describe()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "0a680f78-4f6d-43fc-aeff-30db0cc16d47",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# View basic summary statistics of training set from 2021\n",
+ "S2021_training.describe()"
+ ]
+ },
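+  {
+   "cell_type": "markdown",
+   "id": "mean-comparison-note",
+   "metadata": {},
+   "source": [
+    "The two summary tables above can be hard to compare by eye. As a rough sketch (not part of the original lesson), the cell below places the mean of every numeric column for both years side by side. The column labels `mean_2020`, `mean_2021` and `difference` are names chosen here for illustration, and the cell assumes `S2020_training` and `S2021_training` are loaded as above."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "mean-comparison-code",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "\n",
+    "# Keep only numeric columns, then take the mean of each column for each year\n",
+    "means_2020 = S2020_training.select_dtypes(include=\"number\").mean()\n",
+    "means_2021 = S2021_training.select_dtypes(include=\"number\").mean()\n",
+    "\n",
+    "# Put both years next to each other (illustrative column labels)\n",
+    "mean_comparison = pd.concat([means_2020, means_2021], axis=1, keys=[\"mean_2020\", \"mean_2021\"])\n",
+    "mean_comparison[\"difference\"] = mean_comparison[\"mean_2021\"] - mean_comparison[\"mean_2020\"]\n",
+    "mean_comparison.round(2)"
+   ]
+  },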
+ {
+ "cell_type": "markdown",
+ "id": "b017de0c-e1ee-4fa3-9ca0-a62fea83833f",
+ "metadata": {},
+ "source": [
+    "**C)** Taking a look at the distributions of our label (cases_per_100K) and two predictors (Unemployment_Rate and Fully_vaccinated_percent) for both years\n",
+    "\n",
+    "By running the code below we can observe that the **Unemployment Rate** is the same in 2020 and 2021 (both training sets use the `2020_Unemployment_Rate` column), whereas the **Fully Vaccinated percentage** is very different. In 2020 vaccination rates were basically non-existent, whereas in 2021 many more people had been vaccinated, as Dr. Pleuni mentioned in her last video. We can also observe that cases_per_100K is higher overall in 2021; this could be due to cases being reported more often and/or being easier to identify given new public health measures."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "a8e5da8d-b994-4891-a3da-05aeb1ef830b",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+    "# The following code creates a 2x3 grid of histograms comparing unemployment rate, fully vaccinated percent and cases per 100K (target label) for 2020 (top row) and 2021 (bottom row)\n",
+ "fig, ax = plt.subplots(nrows=2, ncols=3,figsize=(30,15)) \n",
+ "\n",
+ "# graphs the histogram for upperLeft corner\n",
+ "ax[0,0].hist(S2020_training['2020_Unemployment_Rate'], bins=20, color=\"darkblue\")\n",
+ "ax[0,0].set_title(\"2020 Unemployment Rate\", size=25)\n",
+ "\n",
+ "# graphs the histogram for upperMiddle area\n",
+ "ax[0,1].hist(S2020_training['fully_vaccinated_percent'], bins=20, color=\"darkblue\")\n",
+ "ax[0,1].set_title(\"2020 Fully Vaccinated Percent\", size=25)\n",
+ "\n",
+ "# graphs the histogram for upperRight corner\n",
+ "ax[0,2].hist(S2020_training['cases_per_100000'], bins=20, color=\"darkblue\")\n",
+ "ax[0,2].set_title(\"2020 Cases Per 100K\", size=25)\n",
+ "\n",
+ "# graphs the histogram for lowerLeft corner\n",
+ "ax[1,0].hist(S2021_training['2020_Unemployment_Rate'], bins= 20, color=\"teal\")\n",
+ "ax[1,0].set_title(\"2021 Unemployment Rate\", size=25)\n",
+ "\n",
+ "# graphs the histogram for lowerMiddle area\n",
+ "ax[1,1].hist(S2021_training['fully_vaccinated_percent'], bins=20,color=\"teal\")\n",
+ "ax[1,1].set_title(\"2021 Fully Vaccinated Percent\", size=25)\n",
+ "\n",
+ "# graphs the histogram for lowerRight corner\n",
+ "ax[1,2].hist(S2021_training['cases_per_100000'], bins=20, color=\"teal\")\n",
+ "ax[1,2].set_title(\"2021 Cases Per 100K\", size=25)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "911b1a31-77a1-46af-9dfa-32c59bd2b656",
+ "metadata": {},
+ "source": [
+ "**D)** Taking a look at the variable correlations between 2020 vs 2021\n",
+ "\n",
+    "**Correlation:** a metric that shows how strongly two variables are related. Correlation values range from -1 to 1. There are two main components to judge when we look at correlations:\n",
+    "\n",
+    "General guidelines to judge the **direction or slope** of the relationship:\n",
+    "- If a correlation value is close to 0, the variables are not very related to each other. \n",
+    "- If a correlation value is close to 1, the correlation is positive: the higher one variable is, the higher the other variable tends to be.\n",
+    "- If a correlation value is close to -1, the correlation is negative: the higher one variable is, the lower the other variable tends to be.\n",
+    "\n",
+    "General guidelines to judge the **strength** of the relationship:\n",
+    "- values between -0.3 and +0.3 are generally considered a weak correlation\n",
+    "- values from -0.5 to -0.3 or from 0.3 to 0.5 are generally considered a moderate correlation\n",
+    "- values from -1.00 to -0.5 or from 0.5 to 1.00 are generally considered a strong correlation\n"
+ ]
+ },
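+  {
+   "cell_type": "markdown",
+   "id": "correlation-label-note",
+   "metadata": {},
+   "source": [
+    "The guidelines above can be turned into a small helper that labels a correlation value. This is only an illustrative sketch: the function name `describe_correlation` is made up, and because the guidelines do not say which label applies exactly at 0.3 or 0.5, the helper assumes those boundary values fall into the stronger category."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "correlation-label-code",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Hypothetical helper (not from the original lesson): labels a correlation value using the guidelines above\n",
+    "def describe_correlation(r):\n",
+    "    size = abs(r)  # strength depends only on the distance from 0\n",
+    "    if size < 0.3:\n",
+    "        strength = \"weak\"\n",
+    "    elif size < 0.5:\n",
+    "        strength = \"moderate\"\n",
+    "    else:\n",
+    "        strength = \"strong\"\n",
+    "    # direction depends only on the sign\n",
+    "    direction = \"positive\" if r > 0 else \"negative\" if r < 0 else \"no direction\"\n",
+    "    return f\"{strength}, {direction}\"\n",
+    "\n",
+    "# A few illustrative values\n",
+    "for r in [0.1, 0.48, -0.61]:\n",
+    "    print(r, \"->\", describe_correlation(r))"
+   ]
+  },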
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c46d0e66-f226-4eac-b484-a2f1db554db8",
+ "metadata": {},
+ "outputs": [],
+ "source": [
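+    "# This code calculates the correlations between the numeric variables within the 2020 and 2021 data\n",
+    "# (select_dtypes keeps only numeric columns, since newer pandas versions raise an error if text columns are passed to corr)\n",
+    "correlations_2020 = S2020_training.select_dtypes(include=\"number\").corr().round(2)\n",
+    "matrix2020 = np.triu(np.ones_like(correlations_2020))\n",
+    "correlations_2021 = S2021_training.select_dtypes(include=\"number\").corr().round(2)\n",
+    "matrix2021 = np.triu(np.ones_like(correlations_2021))\n",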
+ "\n",
+ "# The following creates a composite graph showcasing the correlation charts between both years\n",
+ "fig, (ax1,ax2) = plt.subplots(ncols=2,figsize=(20,10), sharey=True, sharex=True)\n",
+ "g1 = sns.heatmap(correlations_2020,cmap=sns.light_palette(\"darkblue\",n_colors=7),annot=True,annot_kws={\"fontsize\":14},cbar=True,ax=ax1, mask=matrix2020, linewidths=4)\n",
+ "g1.set_title(\"2020 Training Data Correlations\", size=25)\n",
+ "g2 = sns.heatmap(correlations_2021,cmap=sns.light_palette(\"teal\",n_colors=7),annot=True,annot_kws={\"fontsize\":14},cbar=True,ax=ax2, mask=matrix2021, linewidths=4)\n",
+ "g2.set_title(\"2021 Training Data Correlations\", size=25)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "bbfd5d04-6748-41d9-abd0-325e6ac0ad3e",
+ "metadata": {},
+ "source": [
+    "- We can see that in 2020 we obtained moderate correlations between cases_per_100k and **Unemployment Rate (0.48)**, **Green Votes Percent (-0.49)**, **Libertarian Votes Percent (-0.42)** and **Other Votes Percent (0.42)**. The rest were weakly related to our target label.\n",
+    "\n",
+    "- We can see that in 2021 we obtained stronger correlations: strong ones between cases_per_100k and **Fully Vaccinated Percentage (-0.61)**, **Democrat Votes Percent (-0.54)** and **Republican Votes Percent (0.53)**, and a moderate one with **Libertarian Votes Percent (0.34)**.\n",
+    "\n",
+    "\n",
+    "**IMPORTANT NOTE:** Recall that Dr. Pleuni reminded us in her video that these variables can only give us a correlation with Cases per 100K. They **DO NOT indicate Causality**; they only show us, in general, which variables are better at predicting our target label. Many factors lie beyond the variables in our exercise: for example, we have not accounted for mask mandates in each county, the average age of residents in each county, public transportation use, at-home versus on-site working conditions, level of education, etc. Also note that no single variable will be a perfect predictor; instead, it is likely that several variables contribute to the overall prediction success of the model.\n"
+ ]
+ },
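+  {
+   "cell_type": "markdown",
+   "id": "correlation-change-note",
+   "metadata": {},
+   "source": [
+    "To read the change between the two heatmaps more directly, one option is to tabulate each feature's correlation with the target for both years. The cell below is a hedged sketch, not part of the original lesson: the labels `corr_2020`, `corr_2021` and `change` are chosen here for illustration, and it assumes both training dataframes contain a `cases_per_100000` column, as used in the plots above."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "correlation-change-code",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "\n",
+    "target = \"cases_per_100000\"\n",
+    "\n",
+    "# Correlation of every numeric feature with the target, one Series per year\n",
+    "corr_2020 = S2020_training.select_dtypes(include=\"number\").corr()[target]\n",
+    "corr_2021 = S2021_training.select_dtypes(include=\"number\").corr()[target]\n",
+    "\n",
+    "# Side-by-side table (illustrative column labels); drop the target's correlation with itself\n",
+    "corr_with_target = pd.concat([corr_2020, corr_2021], axis=1, keys=[\"corr_2020\", \"corr_2021\"]).drop(index=target)\n",
+    "corr_with_target[\"change\"] = corr_with_target[\"corr_2021\"] - corr_with_target[\"corr_2020\"]\n",
+    "\n",
+    "# Features at the top and bottom of this table changed the most between years\n",
+    "corr_with_target.sort_values(\"change\").round(2)"
+   ]
+  },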
+ {
+ "cell_type": "markdown",
+ "id": "6e903dbf-03b3-47c2-b652-825d2a296e0e",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "**E)** Creating scatterplots and trend lines to visualize relationships\n",
+ "\n",
+    "We can run the code below to see these relationships graphically, by plotting each feature against cases_per_100K with a trend line for each year.\n",
+    "\n",
+    "**1)** Below we can see the scatterplots and trend lines formed using the variables we found worth noting after looking at our 2020 correlation heatmap"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e3c90373-e11a-4f05-9fc1-f6805c576cf9",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Visualize relationship between some columns and cases_per_100k with scatterplots, with best fit line for 2020\n",
+ "fig, (ax1,ax2,ax3,ax4)= plt.subplots(ncols=4, figsize=(24, 6), sharey = True) # Plots will be in 1 row and 4 columns\n",
+ "fig.suptitle('Relationships to cases_per_100k for relevant correlations on 2020 for both years', fontsize=16)\n",
+ "\n",
+ "# 2020 data relationship plots\n",
+ "\n",
+ "# Graphing the plot showing relationship between cases_per_100k and Unemployment Rate\n",
+ "g1 = sns.regplot(data=S2020_training, x='2020_Unemployment_Rate',y='cases_per_100000', ax=ax1, color=\"darkblue\")\n",
+ "# Graphing the plot showing relationship between cases_per_100k and green_votes_percent\n",
+ "g2 = sns.regplot(data=S2020_training, x='green_votes_percent', y='cases_per_100000', ax=ax2, color=\"darkblue\")\n",
+ "# Graphing the plot showing relationship between cases_per_100k and libertarian_votes_percent\n",
+ "g3 = sns.regplot(data=S2020_training, x='libertarian_votes_percent', y='cases_per_100000', ax=ax3, color=\"darkblue\")\n",
+ "# Graphing the plot showing relationship between cases_per_100k and other_votes_percent\n",
+ "g4 = sns.regplot(data=S2020_training, x='other_votes_percent', y='cases_per_100000', ax=ax4, color=\"darkblue\")\n",
+ "\n",
+ "\n",
+ "# 2021 data relationship plots\n",
+ "\n",
+ "# Graphing the plot showing relationship between cases_per_100k and Unemployment Rate\n",
+ "h1 = sns.regplot(data=S2021_training, x='2020_Unemployment_Rate',y='cases_per_100000', ax=ax1, color=\"teal\")\n",
+ "# Graphing the plot showing relationship between cases_per_100k and green_votes_percent\n",
+ "h2 = sns.regplot(data=S2021_training, x='green_votes_percent', y='cases_per_100000', ax=ax2, color=\"teal\")\n",
+ "# Graphing the plot showing relationship between cases_per_100k and libertarian_votes_percent\n",
+ "h3 = sns.regplot(data=S2021_training, x='libertarian_votes_percent', y='cases_per_100000', ax=ax3, color=\"teal\")\n",
+ "# Graphing the plot showing relationship between cases_per_100k and other_votes_percent\n",
+ "h4 = sns.regplot(data=S2021_training, x='other_votes_percent', y='cases_per_100000', ax=ax4, color=\"teal\")\n",
+ "\n",
+ "# Creating legend\n",
+ "custom_lines = [Line2D([0], [0], color=\"teal\", lw=4), Line2D([0], [0], color=\"darkblue\", lw=4)]\n",
+ "ax1.legend(custom_lines, ['In 2021', 'In 2020'], loc=\"upper left\",prop={'size': 14})"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "197ae0bb-6af2-4771-8729-050594b3dd2a",
+ "metadata": {},
+ "source": [
+ "**We can see that the relationships that were strong or in a particular direction in 2020 are not the same in 2021**"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c162f1ff-2b96-43a8-8858-14544fc431c7",
+ "metadata": {},
+ "source": [
+    "**2)** Below we can see the scatterplots and trend lines formed using the variables we found worth noting after looking at our 2021 correlation heatmap. We will not plot the Libertarian vote percentage, as it was already plotted in the example above."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "34a00e13-095e-4512-b5b2-edb8d4501283",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Visualize relationship between some columns and cases_per_100k with scatterplots, with best fit line for 2021\n",
+    "fig, (ax1,ax2,ax3)= plt.subplots(ncols=3, figsize=(24, 7), sharey = True) # Plots will be in 1 row and 3 columns\n",
+ "fig.suptitle('Relationships to cases_per_100k for relevant correlations on 2021 for both years', fontsize=16)\n",
+ "\n",
+ "# 2020 data relationship plots\n",
+ "\n",
+ "# Graphing the plot showing relationship between cases_per_100k and fully_vaccinated_percent\n",
+ "g1 = sns.regplot(data=S2020_training, x='fully_vaccinated_percent',y='cases_per_100000', ax=ax1, color=\"darkblue\")\n",
+ "# Graphing the plot showing relationship between cases_per_100k and democrat_votes_percent\n",
+ "g2 = sns.regplot(data=S2020_training, x='democrat_votes_percent', y='cases_per_100000', ax=ax2, color=\"darkblue\")\n",
+ "# Graphing the plot showing relationship between cases_per_100k and republican_votes_percent\n",
+ "g3 = sns.regplot(data=S2020_training, x='republican_votes_percent', y='cases_per_100000', ax=ax3, color=\"darkblue\")\n",
+ "\n",
+ "\n",
+ "# 2021 data relationship plots\n",
+ "\n",
+ "# Graphing the plot showing relationship between cases_per_100k and fully_vaccinated_percent\n",
+ "h1 = sns.regplot(data=S2021_training, x='fully_vaccinated_percent',y='cases_per_100000', ax=ax1, color=\"teal\")\n",
+ "# Graphing the plot showing relationship between cases_per_100k and democrat_votes_percent\n",
+ "h2 = sns.regplot(data=S2021_training, x='democrat_votes_percent', y='cases_per_100000', ax=ax2, color=\"teal\")\n",
+ "# Graphing the plot showing relationship between cases_per_100k and republican_votes_percent\n",
+ "h3 = sns.regplot(data=S2021_training, x='republican_votes_percent', y='cases_per_100000', ax=ax3, color=\"teal\")\n",
+ "\n",
+ "# Creating legend\n",
+ "custom_lines = [Line2D([0], [0], color=\"teal\", lw=4), Line2D([0], [0], color=\"darkblue\", lw=4)]\n",
+ "ax1.legend(custom_lines, ['In 2021', 'In 2020'], loc=\"upper left\",prop={'size': 14})"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3e5f931f-52d2-46a4-be19-d8456fd0be9e",
+ "metadata": {},
+ "source": [
+ "# Conclusion\n",
+ "### Reasons that our 2020 model performed worse when predicting 2021 Data\n",
+ "\n",
+    "All in all, we can see that the training data variables in 2020 and 2021 do not have the same relationship with our target variable (cases_per_100k), as showcased by the trend lines in our scatterplots. Different kinds of changes can happen in real life once a model is deployed, which is why it's important to retrain our model from time to time. For more information about the different kinds of changes (or drifts) that could occur over time, check out this useful [website](https://arize.com/model-drift/?utm_source=google&utm_medium=cpc&utm_campaign=18216725893&utm_content=139136719885&utm_term=data%20drift&utm_term=data%20drift&utm_campaign=Monitor+ML+-+Search&utm_source=adwords&utm_medium=ppc&hsa_acc=9379871348&hsa_cam=18216725893&hsa_grp=139136719885&hsa_ad=620214790996&hsa_src=g&hsa_tgt=kwd-328660210229&hsa_kw=data%20drift&hsa_mt=e&hsa_net=adwords&hsa_ver=3&gclid=CjwKCAjw-L-ZBhB4EiwA76YzOet2ULiqRzKwwxgXsCJhh7NgueokMbk9sBee2XAX4WtP4aaEMxPrIxoCHRsQAvD_BwE).\n",
+ "\n",
+ "We will focus on just one type of drift:\n",
+ "\n",
+    "**Data Drift:** Also called Feature Drift or Covariate Drift. This happens when the input features no longer behave the way they did when the model was first trained. The drift is due to changes in the statistical properties, correlations, and/or data distributions of the features since training. \n",
+    "\n",
+    "\n",
+    "- For instance, in our first set of scatterplots the unemployment rate had a more pronounced positive slope in 2020 than in 2021. Similar differences appear for other variables shown in both sets of scatterplots (libertarian_votes_percent, democrat_votes_percent, etc.). Dr. Pleuni mentioned that in Summer 2020 most people were not vaccinated because vaccines were not yet widely available, whereas in 2021 they were. We can see this stark contrast in the first scatterplot of the second set: all the 2020 points are clustered near 0 percent fully vaccinated, whereas in 2021 even the lowest percentage is above 30% of people fully vaccinated.\n",
+    "\n",
+    "### Now that we have identified the flaws of our 2020 model, let's retrain our Decision Tree model using 2021 data in the next notebook!!\n"
+ ]
+ },
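+  {
+   "cell_type": "markdown",
+   "id": "drift-check-note",
+   "metadata": {},
+   "source": [
+    "One possible way to put a number on this kind of drift is a two-sample Kolmogorov-Smirnov (KS) test, which compares the distribution of a feature in one dataset with its distribution in another. The original lesson does not use this test; the cell below is only a sketch under the assumption that SciPy is installed and that both training dataframes are still loaded. A KS statistic near 0 suggests the two distributions are similar, while a value near 1 suggests they barely overlap."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "drift-check-code",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from scipy.stats import ks_2samp\n",
+    "\n",
+    "# Features to compare between the 2020 and 2021 training data (column names as used in the plots above)\n",
+    "features_to_check = [\"2020_Unemployment_Rate\", \"fully_vaccinated_percent\"]\n",
+    "\n",
+    "for feature in features_to_check:\n",
+    "    # Compare the 2020 distribution of this feature with its 2021 distribution\n",
+    "    result = ks_2samp(S2020_training[feature], S2021_training[feature])\n",
+    "    print(f\"{feature}: KS statistic = {result.statistic:.2f}, p-value = {result.pvalue:.4f}\")"
+   ]
+  },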
+ {
+ "cell_type": "markdown",
+ "id": "275552df-52a9-4416-b2ca-065db3bd251d",
+ "metadata": {},
+ "source": [
+ "# Clean up\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "57f1df8d-310a-4a37-b5b0-29a97300239a",
+ "metadata": {},
+ "source": [
+ "To keep your workspace organized remember to: \n",
+ "\n",
+ "1. Save your work. \n",
+ "2. Close any notebooks and active sessions to avoid extra charges. "
+ ]
+ }
+ ],
+ "metadata": {
+ "environment": {
+ "kernel": "python3",
+ "name": "common-cpu.m108",
+ "type": "gcloud",
+ "uri": "gcr.io/deeplearning-platform-release/base-cpu:m108"
+ },
+ "kernelspec": {
+ "display_name": "conda_python3",
+ "language": "python",
+ "name": "conda_python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.14"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/AWS/3- Practice.ipynb b/AWS/3- Practice.ipynb
new file mode 100644
index 0000000..d3867e4
--- /dev/null
+++ b/AWS/3- Practice.ipynb
@@ -0,0 +1,318 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "423e9402-a82b-4834-a3a0-931a2c685ba7",
+ "metadata": {},
+ "source": [
+ "# **Practice: Let's make a NEW Decision Tree for Summer 2021 and improve our predictions!**\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "302b6319-f7a7-4d82-a9d9-8d1ffb16858f",
+ "metadata": {},
+ "source": [
+ "# Overview \n",
+    "In this module, you will put into practice what you have learned in ***Introduction to Machine Learning: Decision Trees*** by creating, training, and evaluating a decision tree model using Summer 2021 data, enhancing your ability to adapt machine learning models to new datasets and assess their performance. \n",
+ "\n",
+ "In order to expedite the making of the NEW Decision Tree, we can skip a few steps, and only copy-paste the required lines of code.\n",
+ "\n",
+ "* You DON'T need to copy-paste the comments from the original code (the green text that is preceded by \"#\"). \n",
+    "* Instead, follow the instructions written as comments in the exercise below to create a NEW Decision Tree for Summer 2021 data.\n",
+ "\n",
+ "### **Walkthrough Solution:**\n",
+    "If you feel stuck on this exercise, feel free to follow the video walkthrough below by **Florentine**"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "536f933f-a925-4fe9-945b-87c48fb98ecb",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDBoYFhsaGBoeHRsfIiomIyIhIiolJigoMCkyMC0oLS01PVBCNThLOS0tRWFFS1NWW1xbMkJlbWRYbFBZW1cBERISGRYYLhsbLVc/NT1XV1dXV1dXV1dXV2NXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV11XXVdXV1deXVddV1dXV//AABEIAWgB4AMBIgACEQEDEQH/xAAbAAEAAwEBAQEAAAAAAAAAAAAAAgMEAQUGB//EAEAQAAICAAMFBAcFBwQCAwEAAAABAhEDEiEEEzFBUQVhcZEUIlKBodHwFSMyU7EzQnKSweHxBkNiohayY3PSJP/EABcBAQEBAQAAAAAAAAAAAAAAAAABAgP/xAAfEQEBAQADAQACAwAAAAAAAAAAARECEiExIkEDMmH/2gAMAwEAAhEDEQA/APz8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHtYfZMZZqy+rxuT+BN9iKr9Tg3+N8iMYt8FZLCwnJ1FfFLnXFgRxOx1Gry6utJPz8Dq7FtJqMdVf4tfAji+paejTpoi27rI76VqEWy7Eq7jGkm/x9PeRl2OlVpK+HrHHCaV7uVacuvA5Uvy5eTCox7Ni1N5V6it+t+nUv7I7A9LxJwhKMMqu5X1Kmp6fdy1dLR8UIRxM3qRxIt+zcScpbPCIfZDvESyvJNxfrU20+KT5HfsWfJRfhNfXMjuZSf4Jt+DbvyI7r/jLQsCPZjbrS7a/FwaaWr95P7Hl7K/mRBYd/usLD/wCLCJR7JbWijxaazrkc+ynTdR0bT9bXQ48L/iwsPnTCq47EnCU6VRaT1116dTX2b2J6S5qMoxyK3mdX4FO6fsv4HYxktUpL30WZvqX/AByHZbcM6y1dfi14tcPd8UTfYsk69T3TRDdf8X8BuX7LIK8fYd20pLirVO9CrcR6GlYT9mRxw7mBn3Eehds3Z+9bUa0TerrgT3Tq8rORhfCMgqv0JZM9LLmy8dbq+BLA7P3jko16sXJ3JR0XiyW7/wCMju7fsyAzbiPQbiPQ0ShXFNEdO8Ip3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvCqdxHoNxHoXad407wKdxHoNxHoXad407wincR6DcR6GhQ0unS5nVh2nJRlS4sDP6PGr/rqFgR6fE14ez5vw29L9yLcLs+c8RYUYt4jdKKriMN9x572eK/yWw2KLhKd1XLXXhz9/wADT6E87g01JXalpVau78GSxOzpRxN3JVKk+KejVp2u5gtx56wI9DRgbDD0jCw5VJSxIJ03TTkk6fvL9o2F4cssuJZsWz1tOCnx3mG/+yaCTlLNj6P/AMY2P8p/zz+Y/wDGNj/Kf88/me1QoK8X/wAY2P8AKf8APP5nH/pjY/yn/PP5nt0KCvDf+mdj/Kf88/mRf+mtj/K/7z+Z7ckeVt/auR5cOOdri+QXGd/6b2T8r/vL5kH/AKd2T8p/zy+Zs2HtFYukllkacQmpZjyf/H9k/K/7y+Z1f6f2Pnhf95fM0vFZxMD5pacGdhNxdp17j0NtwcOMZKMFFJLLNSbblzjV+PkZNk2jdybyqVqqfCuaM8OfebF58bxZpwT4slKU3LO8STl
VX3G3B2+McNQeEm1+9pfHjw4rTyLPtKNVuk9Kt8b9rhVm2WDfYlJb2VLgFjYi4YsjdLtGDb+6StNXfrJutVy0ohPbVJweWKUHdVblrrb4fAKywxsRO94746rnws7HaMVXWLLx5+7zfmzRhbThxgoOCxK/e4PV38ifpuDX7CN1T18Pl4gYli4i4YsiDzPXOze9rwadYC104/H66EfS8PJW6WfKkpXw9WuHmwMDg3+8ztP22ejHbcOvWwszXBOqWnBacNDj2zC/IT0p6/2/wBgSftsisP8A5M9B7XhZ29yqpJK6rjb99kpbZgv/AGF42EedkftsNP22bcPacJRSeDbpK759eHv+B3aNrw5p5cLLbt17/r3hWHK/afkMj9tnoT2vB1rBV6Vfhr46lO2Y8MSScMPJpw+vADM03xmzmV+0y7ExE0lSVfXz8ysIjlftsKDXCbJADnre2xT9tnQBCWHfGTZHcLq/ItAFW4XV+Q3C6vyLQBVuF1+A3C6/AtAFW4XX4DcLq/ItAVVuF1fkNwur8i0AVbhdX5DcLr8C0BFW4XX4DcLr8C0AVbhdfgNwuvwLQBVuF1+A3C6/A9TD2TBez53ifeZZPLnjxTdLLx4alMdli0msWFtK09KdXQGTK8uXM66UFBqLjmdPiq+uiNktjSdPFha5amaSpvnQDDbg04umuDJ4ONKE1OEmpq6kuOvMux9kUIRlni26tKUXVrub7zr2OOv30PMaYpljzeI8Rybm2220nd8bvxOz2mcp55SuVJXS4JUlS7tDmFhKWJGDmoxclFz5JXWbw5l/amywwcXJCTksqeri3F+y8rav5gsZ8bGeI7k9S3YXe0YLb/3If+yNsNjw91ema67+C1v3mLYdNpwv/th/7Irj/F/Jx5bx4z4+6oULOh2co4yRxkV5nbO17rCdOm9EeAsRR4mn/U2Pei/dMOkldcjNdeL1cFp0v3qtMuwu08Obyt5ZdHz8GU7IllhNxpxT17jxNs1jDE9pP9WSX05x7z4nUzHsGJeDDwo0ZjTi+dNGFtTjHLkjJXfrKzOmX4e04azXCLTdpXwXS+JVWen/APxYfH2e6ivB2lwzerGWartWhtO0Yc6yQjCujuyOHjxTi8sXSaa43x11+tALVtzX+3h/ylePj569SMa9lUTxtrwpJqOFGLeiea6M6d8NQNj7RbdvDhxvhr5lONtOdVkiudpa8SjMuqGZdV5gaI7VSSyQdKtVqWenafssO+uX+hjzLqhmXVAXYO0OGb1U1Limiz01/lw/lMuZdRa6gWvHedzSS7louFE57XcXHdwV9EZ8y6oZl1QF+FtTikssXXC1rxsjLGuedRS7lwKsy6oZl1QFzx7d5ESe0r2EZ8y6rzGddUBf6Qq/AvgceMqSyIpzLqvMZl1XmBo9IX5cR6Svy0Z8y6oZl1QEpyt3VHDmZdUMy6oDoOZl1QzLqB0HM66oZl1QHQRc11QzrqgJA5nXVDOuqAWdOZl1XmMy6oDoOZl1QzLqgOg5mXVDMuqAWdOZl1R2M4rjT94HVFvhFvwQyPo/I17LtMIQS30oO22kk1fC/IlHHwZSxN5jSpuOqSealVvTyAxZH7L8iNlqxIdV8P8A8le8jlrS743rXToBwWTwJYeeO8twv1lFq2uissntcZSk3CCTqorRRS4Jc/eBQC1Y8PYj/MRnixfBJe8AsaVVbroW7A/v8H/7If8AsirBxYxdyUZdzbX6DBl68WnqpKvMJJJ8feRxLJZjxZShq47TJJXo9W6b/sWJxuvS5fXvKj18x5va23ShBqDrv5maWJrXpEmqbtWtU+HvV/Aw7dg4dxUtqbi+LXJ8upL8ajy8STlGTk7t8wk4JRnajyly8C70LBcH/wD1puvVVVb79eHz7qM0ZSw5OLedV1zL4mZPGu2Xx6UsdQwko3K078Cc9k3mzUuMdV79TzXtKk9E11s9zYHSrk0vNGOXla3fry+zMSsOujZreJ3lfa2wywJ+qrhiXJV
8UY8PF5VWnWzWsYorRnqdi9m7Fi7JOePNLETlmbxMrw0l6jjH962eWWYeFFxbbprw/wAs2jDgSivxxb4cH5osxMTDa9WDi+t3XD+/mXUAKt5hWrw3yvV1yv8Ar5nYzwqejUsujt1dLl5mzB2HEms0UmvFL64DF2GcIuTqlxpp19WBjji4WmbDba6Nq9PnZHeYWn3b4a+s+P1Zvwez8ScVKKVPhqlw/wAEvszF6JdbfDp+oHkSat1ouRw9XE2GcYuTSpd6/QYew4koZ0ll6tpc6A8pMHsLs7Eb5V1vTWvmiuexzTiqXrcOV6J1r4oDzFNrm/MZ31fmet9nYmmi1rmudfNHV2biPlH+Zd3zA8jO+r8xvJdX5nqrs/EcsiScsubiuF1xOrs7Ed0o6cfWX1zQHk531fmM76vzPSxNllGcYOrlXPTV1xLPs+fd3d/fYHk531fmM76vzPTw9inJtJL1W09VxXEkuz8TovNd/wAgPKzvq/MZ31fmev8AZmJ/x7/WWnH5MhHYMRwU6WVq7ckuvyA8yM5WtXx6nN4+r8z1vs7E5qK4/vLlfyK8DZJYibilp1aQHnbx9X5nN5L2n5nsPsvFWrUa15p/ocXZeL7Mf5o/MDyd5L2n5nN5L2n5nqPYZqcYercrrXTTqxi7BOEXKSjS46rS6+YHl55dX5jO+r8z1cHs7ExIqUVGndXJL9fed+zcS1Go23S9ZcQPJzvq/MZ31fmemtim5uKytqnxVancTYJxi5NRpK7TT6fNAeXnfV+Z3eS6vzPVfZ2Ikm1FXVesudfM5PYJxi5NRpK+Kvly94Hl7yXV+ZzO+r8z08LYZzipRSpulbSt6fMYmwzjVpes0lqnq1oB5md9X5jO+r8z1/s3E6R/mXd8yGHsGJJNpLRtO2lquIHlNnD2F2diNXUa65kVR2aTm4KrTp6pLjXPvYHmA9p9l4qVtJKr4pkfs3Eukk+POuF9fADxwerDYpyckkvVdPVHcXYZwVtKrrRp82v6AeSD2MTs7EjxUf5l3/JkcbYZwTcsunRp/oB5IPXwtgnOKcaprnp1/wDyd+zsS6qObpmXf8mB45fCTWHJrRq2vI9KPZmI+Ci64+sutcTDjr1Jfwv9AMP2jjfmP4D7RxvzH8DKANX2ljfmP4EJ7biy4zbKABaton7R30vE9p/ApAFq2ia4SL49q46qsRqvAxgYPQxu29pxIqM8aTS4Wl8jM9txH++/gUAmD2joBRZg4TnJRXFujssGm1zTr3ktlw8zpcW0kWPCoDsuzMRfu/FHJ7DOMW2mlz1X1zLIbNKS0Oz2SaTbTpAY8ne/rkaY9mSavTzIbstwsCUvwvh3gQxOzZRi5OqXeZ3DxNs9jmo2+Hj4FG7QFG7RJJ1WZ10vQt3aG7QFG7Q3aL92N2BzC2eUotJvLa0vTyLPsufd5ncPZ5SVx6rmWehT63XfwAy4+yPDdS+BWoVqmzVjbO4upcSvdoCqUbq23XC2R3SL92hu0BRu0aJ7JJt5pW+9tvg3/RnN2i97HLXu7+4Cp9my7n7/AOxm3aN/oM/pmfdgUbtEo4WZpX3It3Z2GDbSXFgdw9ik7inxdNXxo5idnSim21p0ZdDYpu64+PfQnsc1Ft8Fx18AMO7Q3aL92hu0BU46VbpcFegw8P1llbTvR8C3do7HCtpLmBL7OnK23etW3b/Q5Ps6UVmdV4/2L47FiPh1q7+vpCexTSd8lrqBha0q3S5XocjGrptXxrmXbtDdoCjdo0x2Oc1HW1Tq3wqiO7RdDZJNKuadagQfZc1q68/7GeeDTau6dG30Kd1/Xx+RTLBptPigM7gjm7Rfu0N2gKpRurbdcO4lg4TzLK2m+d0T3ZKGDbSXECMdib4tLxfj3dzO4uwSjFydUu8uWxS+n4/IYuxyirk/iBh3aG7Ro3aG7QGfdor2tvLK228r4u+Rr3aMu2qoy/hf6AeIAAAAAAAAAAAAA9pHTiOgW4Mqss3hDZ55ZKVXTTOylbb
6uwJrGfL9S7EhipNyTy9btFeBu6e8zXpWUsxsSDi0p4jfST0Ao3ncdjjNcG14MrLcDaHC6Sd1x7gOb5/TObzuLcTa3KLjS18frkZwJ7wbwgAJ7zuG87iAAsWM1w0943rO4WO4qkk9U9e4ue3yfGMfiBQ8Vvjqc3hLGxnNptJV08SoCe8G8IACe87ju+f0ys1LbpL92Px6V1Aq38ur8yO8ND26XSPx+ZkAnvDqxCs7GVNPo7Anvn9Mb58OXiT9Klrw1d/p8juJtkpRytRqq0QFW87hvO4gAJ7zuG8IHU6YE9++vxLMXa5SUU/3Y5VWmnf1OvbZdEtb+vIi9rllcajTVcAK953DeEABPedx1YzX+Ss0Ye1yilSWia58wK98/pnHily26Sd0vq/mUYk80m3zdgd3ncN4QAE953BYrIEsOeWSa5AS376vzDxm+P6lvpslGkkuOqu9f8jG2yU4uLSSfiBTvO4bzuIACe87jLtjuMv4X+heZ9q/DL+F/oB4wAAAAAAAAAAAAD2jpw6BLDrMs3C9fA5OrdXV6X05FmAouSzaRtW1q65nZxVuuF6eAFcKtXdXr4cy7EjhZW4zk3yTRZhQwmvWbT7lf9DuJhYKi8s23yTjVgYi3Z8l/eN13CkXYccOvWu+7/AEZ7rK8rldaX10/uZjbiQwaeVtvl9V4megKgW0KAqBbSFIBg5K9du7XlzL62f2pfH5EcKOG08zp2vLnyLFDB11f17gM+07u/u22q59Sk1Y8cNNbttrvKqAqFltHaApNP3PWX0v8FdGlQwdbb7vqgK5rAp05Xy+qMpucMGuMvr3GagKiUKzLNwvUnRKCVrNw5gWQWBzcuPfwv5Fc91TyuV0qv3f3NEIYH70pe5d/gRx4YKXqOTff4+HiBiBdQoCmzsatXwvXwLaOxStXw5gWQWBrcpcdOlaf3I4iwdcrlw0vqXRw8HW5Pjpx4eRzEw8GnllK6005+QGGxZdQoCk0YSwqWZyunfjen9SNF+HDCpZm1o7pc+XICD3F8Z118/hwM+JWZ5eF6G5Yez+1Ly/sZppW64cgKLFl1CgKSWHWZZuHMsolhqNrNouYHVuaesm/LrX9Dk91keVyz3onwqy5Qwa/FK/Dy5EZwwsryt5r0Xd5AYxZdQoCmyGP+CX8L/Q00Z9q/DL+F/oB4wAAAAAAAAAAAAD2wABLDrMk3SvV9wm6bp2r0fUngYblJRStt0jsoU2mtVoBHCyu80mtNGW4mFBRbji5muVVzK8q6GjF2fDSuOIn3VqBjtk8FRb9eTR2l0LobJKUVJR0feutAQxIYajccRt1wrw/uUWap7JJRzNKvFFNICuxb6lmVCl0ArtiyzKugpdAO4Si080mna8uZY4YSaW8k11/tRGGA5K0udFnoU/ZXmgM+NlTqEm1195CzRi4Dg6kqZXS6AV2L7yykKXQCvMzTHDwr1xWlpyfvKqXQvjsc22lFWuVoCO7wtfvH8/gZrZrexzX7q80UUugFdksOnJJulerJZV0OxhbSS1YF0MLCrXFad/C/Aji4eGotxxG5dPpHY7JJ8Iri1xXIYmySirlFJeK60BltjMWZV0GVdAK77zsXqrdK9SeVdDqhbSS1YFscPC1vFfH4af3OYmHhJPLitutFT1ZJbFOryquHFHJ7JKKtxVeKAy5n1GZ9SzKugyroBXm7zRhQw2lmxGtHfc+RXlRbDZJSqo8Va1XIDscPB1vFkvcZ8SlJqLtXozU9hmnTir8V9ciiUKdNAVX3i2WZV0FLoBXbJYdOSUnS5sllXQlDDzOktQJxw8LnitceXfp8CGNHDS9Wbk/wBCa2WT5LzXf8mJ7JKKzOKrxXWgM1vqLfUsyroMq6AV2yvH/BL+F/oaMq6FG1fgl/C/0A8YAAAAAAAAAAAAB7YAAnhyrW6LIY2WWZPXXj3leFKpKSV00zjTbt8wNa26l+GHkcxNszJpxgr6Ip2fEcHeVS7n4p/0J4mNmTW6gu9LUCvMiUc
drRSaXiynKyzBm4O8qfiBKWO3xk372RzIsxtoc1WSK0rTxT/oZ8r6AWZkMyK8rGV9AJ50MyIZX0GV9ALY41cJNeB30h+2/NnMLFcFWVPVPXuLvTH+XD66AUyxb4tvx1OZkd2jEeI08qVKtCvK+gEs6O5kV5X0GVgWZkS3745nfiynKzVHaqbe7jqBW9ob4yfmyGdGl7a/y4mPKwJ50dU11K8rJYdxknV0+AFm+ftPzZyWO3xk34tsnHaWrqC+WtnZ7Q5Rcci1rXnpXyApzoZ0QysZX0AnnR1TXUrys7FNNOuDAuW0y9uXmyLx29HJv3sujtbV/dx42QxNolJNZUrVae7694FWdHc6K8r6DK+gFmZEljtaKT82U5WaMLaMqXqJ0mte8CL2hvjN+bIud6tl8tsv/bivDwM2JcpN1Vu6A7mQzIhlYysCeZHViVqnTK8r6EsO4yTq6AnvnwzPzDxnVZnXS2WR2lr9xc/6/MY20uUXHJFX048QKc6GZEMrGVgTzop2l+pL+F/oTysr2hepL+F/oB44AAAAAAAAAAAAD2wcOgW7PiKMlJpOndPmdlNNt6K3dEMKVSTccyTunwfccnq26q3dLgu4DYtshX7OL7yOJtEGmlCK70Rw8aCSTwU6S1vi+85iY0XFpYKi3wafACvMupbg46hdxjK64man0LsPFqNPDT72u8C7E2xSi1kir5mfMizE2hOLSwoq+a4rh8jPlfQCzMhmRXlYpgWZkMyK8rFPoBpwsdRTTinqnqWvb1f7OFVVcjLhypNOCeqdvuLVtEaS3MdPiBzHx1N2ko9yK8yGNPM7UVFdEiun0AszIZl1K6fQZWBZmXU0elx9iPL9P7mPKzVHaIq/uYv68ALHtsfy4GXMi/0qP5EPr3GTKwLMy6ksPESaejopyvoSho02rp8ANUdrir9SL1v43RzE2qMk1kgu9ceXyOQ2mKVbmL1b1rrw4EMXHzJpQjG+iXdzruAhmQzIrysZWBZmR2M0mnxoqysZX0A2w22K/cjxs5La4tNZI6/DQreOvyo8b5dKrh7yMsZNNbuK70tfrQCOZDMivKxlYFmZF+HtUYpepF0nx7zJlZow8ZRSvCTpNa1rr4AXPbY/lw+r+ZnniJtvhb4EntCv9lHhw8+7v+BRPWTajVvguC7gJ5kMy6leV9BlYFmZEsPESaej7inKyWHpJNq65Aa47ZFKt3F+JHF2pSVKMV5EVtCUa3Ub11aT/ocxcdSi0sKMX1VXx8PqwK8yGZFeV9BlfQCzMijan6kv4X+hPK+hXjr1Jfwv9APHAAAAAAAAAAAAAe0dAAu2eajK2lJdH4E8bEUpNpKKfJcinCklJNrMugxWnJuKyp8F0A2ekYXPDXnRHExsNppYdPk8z0M2BKMZXKOZdCzExMNxqOG0+TsCNo0YO0wjGnhxk+tmGi+GJFJepb5t+OgF+JtOG4tLDjF9UZrLJ40HClhJS9ozUBbYsqoAW2LKqAGrCxYxWsU9UXy2zDa/YxXv/sY8GcUvWhm1Rf6RhfkgQxsWMncYqK6X3ldndpnGUrhDIq4FNAW2LKqFAW2aVtMF/tx+HTwMNGlYuHr918eGj+vcBb6VC/2ca6Uu/nXh5Gay54+F+V+hkoC2yUJpNN6roUUSjo02rV8ANkNpgrvDT+vA5PaIOLSw0m+emnwKd7HX7tavT4f38yU8aDg0sOpPg+gFdiyqhQFtnYySafHXgU0djo02rV8ANkdpgn+zi1d613d3d8Q9ohla3cbfB9CEcbDV/dXrfHw0OTxsNxaWFTriBCzllVCgLbNGHtEElcE6T6amKjRh4sElmw7aTvv14gaHteHT+6j8PkZpyTbeit8OhZ6Rh0/ufB3wM03bbqrfBcu4CdoWiqhQFtonhzSkm6a6GeiWG0pJtWugG2O04aX7NPy6+BDEx4NSSgk3VPoV76CWmGvfr17u9eR3FxoOLUcPK70fTuArsWVUKAtsp2r8Ev4X+hKiWVbrGtc
IP9GB4IAAAAAAAAAAAAD2wcOgXbPFN03Xf7ieNBRk0nmS5lOEm3STb6IlNuLakmmuQFmHhuT9VWacfDiotrDxIvvVJGbDeJFZoqST5oliY+LlqWfLztP4sCo0+jw/MXw+Zj3hPDTl+GLfgBfLBik2sRNrl14cPP4FBKWFNK3Fpd5VvAJghnGcCYIZxnA0YWEpLWSWq4lz2WFP76Pl/cyRTktIt60TeDP2GB3Fgo1Urtf1Kzk7i6kmn3kd4BMEN4N4BM1Q2WL44kVounzMW8LVhT9hgXzwIJv7xOui7vEzE9zP2X8CneATJQjbSbq+ZVvDsZW6S1YGuGzxfHES1rl148RibPBRbWKm+iKI4U3wgzk4SircWkBwEM4zgTOxVtK67yvOdUr0S1A2x2WLX7WPHhp8yt4EcraxF4e7xKo4U3ooPjXvEsKaVuDSAiCG8GcCZow8CLSbxErT6afEyZyyMJOqi3eqA0+iw/Nj8P6sz4iqTSdpPiN1P2JFcpU6apoCQIbwbwCZLDinJJul1Kt4djK3SVsDVuI1riK/8nMbBhFWp5u5V8ypYU/Zf1/gTw5xVuLS6gRBDeDOBMo2r8Mv4X+hZnKtpdwl/C/0A8YAAAAAAAAAAAAB7R0WLA7GTTtOmJSbdt2zl9wsCyGPOPCTR2W0zccrk3HoVWLA4ThNxdxdPuIoWBOeNKSqUm0QFiwAFi+4DgO2LAnDFlHg6JekT9p9CqxYHZTbq23WmpwWL7gAFiwCi3wNOHLFkri29a463xM1nVNrg68GwLMKWJKXqtt8frzJPYsT2enNcyhSrhoS30vafmwLPRMT2fihDZ5/iS4Pja0d1+pXvZdX5s4sRrg2vewL8SWLhunJpvXiVTxpSVOTaIuV8dfM5YACxYAJ0LFgW+kT9piW0zaacm0+NlViwBw7YsAWRx5pJKTVFdiwLfScT25eZXJtu3q2csWAOHbFgcJQk07XE5YsCz0iftP6/wAiePOSpybXQrsWAAFgWYeDKSk4q1FW+4px4PdSlWlNX30TUuPfx4leO/Ul4P8AQJN3144ACgAAAAAAAAAA+he0XiTm4p5r05K2dltEWq3UV4FO07Fi4ONusZ5HV3yarRp/ApxllrLiZrvlVUxu+pmNq2pJprChpyrR8OPl8Tvpi/Kh7keZnfUZ31A9DDx4x/2oy/ibKb1Mud9Tu8fUDVdu6S8CJn3j6jO+oGgGfO+ozvqBoBn3j6jePqBoBn3j6lmzxliTjBOnKSSb4avmBYDR2h2ZibPFSliQknLL6rbfPu7ijZdmlipvPGKTSTlerfLRMtln1OPKcpscBXGE3PJ+9eWu+6Z6PbHYmLskITniQmpyyrI3o6vW0iNMQL9l7K2jGwZY0F6kdL8Cns/ZcTaMRYeG1mab1dLQXybTHAWdodn4+zOKxUlmuqafCr/Us2fsvExMPPGcc2VyUL1aXMnGzlNnxOVnD+zOCzszYMXapuOE1cY5m5SUUlwv4nO0tgxdlxFDFatxUllaaabaT+DKuIAslsGMsHffuaeOvDuK9k2fExpNR5atvgl3i+fTjO3kAR2rCnhTcJvVdHaL8fYMTDw885JNVcXebX3d5eMvL4nL8fKqB3ZNlxMW3F0lVt8FfAhi4M44u6TzSzKKrm3VV5mdTtNz9pA29qdh4+y4axJzhKLaTyt6N+K1WnEz7HsOJjfg8NXWvQqz34qBxYM3iLCX482Wn1NHaXZmLsyi5yTUua6lk2ab+lAJ7NsWLiQc4tJa1bpulenUyub7yC8GfO+pzO+oXGkGbO+o3j6k1ro0gzKcuTZKp5stPNdVzvpQ06LwVbRh4mFLLiRlCXGmq06nMRTi6b1GnVcTw8TK7pPxKp7NjRlGLi7lwVo4sDFbkqfqVmtpVbpXY1OrT6RreSH8vH6ok9q0/BDyPPlKSbTtNaNc7XI0bpV+3jxS9zq3x5W/5TXZOrrdts5tWJmw3olUGtFx0595OGxynNQwcSOI2rdOqWn
G+HEy7Zhzw3KE9HXW001o0yL1eYAAgAAAAAAAAAAPUxe1ZYmIsTFm8SSVevbVdDj7Qg01kgrXKL8zzAB6a2/D/Lh5MoePG+PwZjAGzfx6/Bjfx6/BmMAbN/Hr8GN/Hr8GYwBs38evwY38evwZjAGzfx6/Bjfx6/BmMAbN/Hr8GTwdsUJxmmri01adWjAAPY2ztp40cs8qV36sWtfpleB2m8OLjCSim7bypu601a5f1PLBbbfrPHjOMyN+HtijJSUtVrqmW7V2rPGpYmJKSWqTuk+tHlgjT18HtmcMN4cMRqD4pWV7J2k8GefDlUqaunwfE8wC+zKPX23tZ42VTm5KN5W071q78itdoLJWZp1Wif4elnmATz4lkv17PZnbUtlnKeFlblHK1OLaq0+TXQ52p21La8RYmLlTUVFKEWlSbfO+rPHAV6su1pPCWE5+qlXDWulleBtqhK1JpPjpfwPOAvv04/j7G/F2tTk5OVt9xPE27MtZyk3xtP3anmgstnwvt2vSwNuULWdpPikuPTQg9sWbPmqVp2r0a4MwAzidZuvd2/t+ePhxw5yjlTzNRhluWur68e7jwOdn9sLBupyj0qClr11aPDAyLffr0vT6msRSeZO7rn1L+0O2pbRlU5aR5KLWvxPGBZcmRM/b1tn7VeHCUYzaT5Vfc9eRl38ev6mMBWzfx6/A5vo9fgZAFlxr30eo30evwMgJi9q2LHj1/U7DalFqUZNNVTVpquFGIDDtXobRt8sWWbFxJTlVXJuTrpb8WQxNrzu5SbdVrZiAw7N3prtSzvNHRPW1XQlHb2m2pu5VfHWuFnngYdm/B21Qblo20/xJvjz8TR9r63lw+FfhfWzyAMTXq/aslNThJQklXqp6rvu7M+0bXvHKUpXJrp3aIxApoAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/2Q==",
+ "text/html": [
+ "\n",
+ " \n",
+ " "
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "execution_count": 1,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('eHI4wMjSGuU', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "108151d8-743e-49fb-ac0c-c81680a998bc",
+ "metadata": {},
+ "source": [
+ "# Learning Objectives \n",
+ "- Import and Prepare Data \n",
+ " - Import necessary Python libraries. \n",
+ " - Load and separate the 2021 training dataset into features and labels. \n",
+ "- Create and Train a Decision Tree Model\n",
+ " - Initialize and train a Decision Tree model using the 2021 training data. \n",
+ "- Visualize and Interpret the Decision Tree\n",
+ " - Generate and interpret visual representations of the trained Decision Tree model. \n",
+ "- Make Predictions and Evaluate Accuracy\n",
+ " - Use the trained model to make predictions on 2021 testing data.\n",
+ " - Compare predicted values with actual values to assess model accuracy. \n",
+ "- Calculate and Compare RMSE \n",
+ " - Calculate the Root Mean Square Error (RMSE) for the 2021 model. \n",
+ " - Compare the RMSE of the 2021 model with the 2020 model to evaluate improvements. "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9d8356f9-d9e2-44e5-8463-f140b038611a",
+ "metadata": {},
+ "source": [
+ "# Prerequisites \n",
+ "***Modules*** \n",
+ "\n",
+ "- Learning Module ***Introduction to Machine Learning: Decision Trees***\n",
+ "\n",
+ "***Data Sources***\n",
+ "\n",
+ "- [COVID cases data (California Health and Human Services Agency)](https://data.chhs.ca.gov/dataset/covid-19-time-series-metrics-by-county-and-state/resource/046cdd2b-31e5-4d34-9ed3-b48cdbc4be7a)\n",
+ "- [COVID vaccination data (Los Angeles Times)](https://github.com/datadesk/california-coronavirus-data)\n",
+ "- [Unemployment data (California Employment Development Dept.)](https://data.edd.ca.gov/Labor-Force-and-Unemployment-Rates/Local-Area-Unemployment-StatisticsdecisionLAUS-/e6gw-gvii)\n",
+ "- [Election data (Harvard University)](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ)\n",
+ "\n",
+ "***Libraries/Packages***\n",
+ "\n",
+ "- Pandas\n",
+ "- NumPy\n",
+ "- Matplotlib\n",
+ "- Seaborn\n",
+ "- Scikit-learn"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "735d2e88-9015-4711-ab57-87fa6b2ea661",
+ "metadata": {},
+ "source": [
+ "# Get Started\n",
+ "Copy-paste the required lines of code from ***Introduction to Machine Learning: Decision Trees*** for each section below. "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b1e1a46d-6e6c-4386-a0ca-0fcb19009e26",
+ "metadata": {},
+ "source": [
+ "## **1) Repeat Step 1 (Importing Necessary Packages)**"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ce366147-fc2c-416d-8f28-325bacbb28e3",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "54218111-757c-4b75-80aa-2754ffd916df",
+ "metadata": {},
+ "source": [
+ "## **2) Repeat Step 2A (Loading 2021 Training Data)**\n",
+ "##### **NOTES: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it, including the links!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "5812834c-28ec-492e-bebf-0e5cc34d4270",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d98ca761-0f7c-4f5e-acf4-05331b993b22",
+ "metadata": {},
+ "source": [
+ "## **3) Repeat Step 3A (Separate Training Data into LABEL and FEATURES)**\n",
+ "SKIP:\n",
+ "- Steps 3B and 3C, since these steps were only there to show you what the labels look like once they are separated from the main training data.\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "966fee45-4f0d-4f98-8819-a9543a6ab0bc",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "26d25871-caa2-41b5-b800-6224932ec759",
+ "metadata": {},
+ "source": [
+ "## **4) Repeat steps 4A and 4B (Create your Decision Tree and Train it!)**\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "14883b4f-d547-4de2-b15e-1602f1c279ae",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d73e8261-7466-4845-b847-1088e2610828",
+ "metadata": {},
+ "source": [
+ "## **5) Repeat step 5 (Visualize your 2021 Decision Tree)**\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "b4eb80ad-bc43-4405-9dd4-c671933d6597",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f7c646d7-46c5-475c-a446-5decacb0f6df",
+ "metadata": {},
+ "source": [
+ "## **6) Repeat step 6A, 6B, 6C (Load Testing Data and make your Predictions)**\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "8dcb9a83-f172-4194-b2aa-0edcc534a191",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "fbefce65-467c-46d0-abde-d02662c1d071",
+ "metadata": {},
+ "source": [
+ "## **7) Repeat step 7A, 7B (Check the Accuracy of the Predictions of the new Model Created)**\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c3c6d248-22da-4ddf-b270-bdb5092863fd",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "5b90d2e6-520b-4dd7-810d-85ae7586b009",
+ "metadata": {},
+ "source": [
+ "## **8) Extra: (Calculate RMSE and create Aggregate error histograms)** \n",
+ "\n",
+ "Compare the performance of the model you just created in this practice session with that of the old model by calculating the RMSE for both and creating an aggregate error histogram that shows both models."
+ ]
+ },
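+ {
+ "cell_type": "markdown",
+ "id": "rmse-sketch-example",
+ "metadata": {},
+ "source": [
+ "A minimal sketch of the comparison described above, assuming made-up numbers rather than the course data. The arrays `actual`, `pred_2020` and `pred_2021` and the helper `rmse` are hypothetical names used only for illustration; swap in your own actual and predicted values.\n",
+ "\n",
+ "```python\n",
+ "import numpy as np\n",
+ "from matplotlib import pyplot as plt\n",
+ "\n",
+ "# Hypothetical values, NOT taken from the course datasets\n",
+ "actual = np.array([120.0, 340.0, 210.0, 95.0, 400.0])\n",
+ "pred_2020 = np.array([150.0, 300.0, 260.0, 60.0, 330.0])\n",
+ "pred_2021 = np.array([130.0, 330.0, 200.0, 100.0, 380.0])\n",
+ "\n",
+ "def rmse(y_true, y_pred):\n",
+ "    # Root mean square error: square root of the mean squared difference\n",
+ "    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))\n",
+ "\n",
+ "print('RMSE 2020 model:', rmse(actual, pred_2020))\n",
+ "print('RMSE 2021 model:', rmse(actual, pred_2021))\n",
+ "\n",
+ "# Overlay the error distributions of both models in one histogram\n",
+ "plt.hist(pred_2020 - actual, alpha=0.5, label='2020 model errors')\n",
+ "plt.hist(pred_2021 - actual, alpha=0.5, label='2021 model errors')\n",
+ "plt.xlabel('Prediction error (predicted - actual)')\n",
+ "plt.ylabel('Count')\n",
+ "plt.legend()\n",
+ "plt.show()\n",
+ "```\n",
+ "\n",
+ "A lower RMSE means the predictions are, on average, closer to the actual values."
+ ]
+ },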
+ {
+ "cell_type": "markdown",
+ "id": "2b5c7f98-e48e-4a7e-a1dc-954a528bbec0",
+ "metadata": {
+ "jp-MarkdownHeadingCollapsed": true,
+ "tags": []
+ },
+ "source": [
+ "# Conclusion \n",
+ "In this practice, you successfully: \n",
+ "\n",
+ "1. **Imported and Prepared Data:** Loaded the 2021 training dataset and separated it into features and labels. \n",
+ "2. **Created and Trained a Decision Tree Model:** Initialized and trained a decision tree model using the 2021 data.\n",
+ "3. **Visualized and Interpreted the Decision Tree:** Generated and interpreted visual representations of the trained model. \n",
+ "4. **Made Predictions and Evaluated Accuracy:** Predicted outcomes using the 2021 testing data and assessed model accuracy. \n",
+ "5. **Calculated and Compared RMSE:** Calculated the RMSE for the 2021 model and compared it with the 2020 model.\n",
+ "\n",
+ "By completing this module, you have reinforced your understanding of decision trees and gained practical experience in adapting machine learning models to new data. This practice not only enhances your technical skills but also prepares you for real-world applications where models need to be continuously updated and evaluated. Keep exploring and refining your models to achieve even better predictions and insights! "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "12602949-b5ef-4ba2-89ff-a5617507e366",
+ "metadata": {},
+ "source": [
+ "# Clean up\n",
+ "\n",
+ "To keep your workspace organized, remember to: \n",
+ "\n",
+ "1. Save your work.\n",
+ "2. Close any notebooks and active sessions to avoid extra charges."
+ ]
+ }
+ ],
+ "metadata": {
+ "environment": {
+ "kernel": "python3",
+ "name": "common-cpu.m108",
+ "type": "gcloud",
+ "uri": "gcr.io/deeplearning-platform-release/base-cpu:m108"
+ },
+ "kernelspec": {
+ "display_name": "conda_python3",
+ "language": "python",
+ "name": "conda_python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.14"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/AWS/4- Practice - Answer Key.ipynb b/AWS/4- Practice - Answer Key.ipynb
new file mode 100644
index 0000000..716b722
--- /dev/null
+++ b/AWS/4- Practice - Answer Key.ipynb
@@ -0,0 +1,498 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "328b4b3b-e1b3-4490-8f76-0241ed6c3b5c",
+ "metadata": {},
+ "source": [
+ "# **Practice Answer Key: Let's make a NEW Decision Tree for Summer 2021 and improve our predictions!**\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d096f58c-168f-45bf-86f6-c8430077e862",
+ "metadata": {},
+ "source": [
+ "# Overview \n",
+ "This module contains the answer key to ***Practice: Let's make a NEW Decision Tree for Summer 2021 and improve our predictions!***.\n",
+ "\n",
+ "In order to expedite the making of the NEW Decision Tree, we can skip a few steps, and only copy-paste the required lines of code.\n",
+ "\n",
+ "* You DON'T need to copy-paste the comments from the original code (the green text that is preceded by \"#\"). \n",
+ "* Instead, follow the instructions written as comments in the exercise below to create a NEW Decision Tree for the Summer 2021 data.\n",
+ "\n",
+ "### **Walkthrough Solution:**\n",
+ "If you feel stuck on this exercise, feel free to follow the video walkthrough below by **Florentine**.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "72283446-4a79-480f-8ec7-d72dcf6f7a83",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDBoYFhsaGBoeHRsfIiomIyIhIiolJigoMCkyMC0oLS01PVBCNThLOS0tRWFFS1NWW1xbMkJlbWRYbFBZW1cBERISGRYYLhsbLVc/NT1XV1dXV1dXV1dXV2NXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV11XXVdXV1deXVddV1dXV//AABEIAWgB4AMBIgACEQEDEQH/xAAbAAEAAwEBAQEAAAAAAAAAAAAAAgMEAQUGB//EAEAQAAICAAMFBAcFBwQCAwEAAAABAhEDEiEEEzFBUQVhcZEUIlKBodHwFSMyU7EzQnKSweHxBkNiohayY3PSJP/EABcBAQEBAQAAAAAAAAAAAAAAAAABAgP/xAAfEQEBAQADAQACAwAAAAAAAAAAARECEiExIkEDMmH/2gAMAwEAAhEDEQA/APz8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHtYfZMZZqy+rxuT+BN9iKr9Tg3+N8iMYt8FZLCwnJ1FfFLnXFgRxOx1Gry6utJPz8Dq7FtJqMdVf4tfAji+paejTpoi27rI76VqEWy7Eq7jGkm/x9PeRl2OlVpK+HrHHCaV7uVacuvA5Uvy5eTCox7Ni1N5V6it+t+nUv7I7A9LxJwhKMMqu5X1Kmp6fdy1dLR8UIRxM3qRxIt+zcScpbPCIfZDvESyvJNxfrU20+KT5HfsWfJRfhNfXMjuZSf4Jt+DbvyI7r/jLQsCPZjbrS7a/FwaaWr95P7Hl7K/mRBYd/usLD/wCLCJR7JbWijxaazrkc+ynTdR0bT9bXQ48L/iwsPnTCq47EnCU6VRaT1116dTX2b2J6S5qMoxyK3mdX4FO6fsv4HYxktUpL30WZvqX/AByHZbcM6y1dfi14tcPd8UTfYsk69T3TRDdf8X8BuX7LIK8fYd20pLirVO9CrcR6GlYT9mRxw7mBn3Eehds3Z+9bUa0TerrgT3Tq8rORhfCMgqv0JZM9LLmy8dbq+BLA7P3jko16sXJ3JR0XiyW7/wCMju7fsyAzbiPQbiPQ0ShXFNEdO8Ip3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvCqdxHoNxHoXad407wKdxHoNxHoXad407wincR6DcR6GhQ0unS5nVh2nJRlS4sDP6PGr/rqFgR6fE14ez5vw29L9yLcLs+c8RYUYt4jdKKriMN9x572eK/yWw2KLhKd1XLXXhz9/wADT6E87g01JXalpVau78GSxOzpRxN3JVKk+KejVp2u5gtx56wI9DRgbDD0jCw5VJSxIJ03TTkk6fvL9o2F4cssuJZsWz1tOCnx3mG/+yaCTlLNj6P/AMY2P8p/zz+Y/wDGNj/Kf88/me1QoK8X/wAY2P8AKf8APP5nH/pjY/yn/PP5nt0KCvDf+mdj/Kf88/mRf+mtj/K/7z+Z7ckeVt/auR5cOOdri+QXGd/6b2T8r/vL5kH/AKd2T8p/zy+Zs2HtFYukllkacQmpZjyf/H9k/K/7y+Z1f6f2Pnhf95fM0vFZxMD5pacGdhNxdp17j0NtwcOMZKMFFJLLNSbblzjV+PkZNk2jdybyqVqqfCuaM8OfebF58bxZpwT4slKU3LO8STl
VX3G3B2+McNQeEm1+9pfHjw4rTyLPtKNVuk9Kt8b9rhVm2WDfYlJb2VLgFjYi4YsjdLtGDb+6StNXfrJutVy0ohPbVJweWKUHdVblrrb4fAKywxsRO94746rnws7HaMVXWLLx5+7zfmzRhbThxgoOCxK/e4PV38ifpuDX7CN1T18Pl4gYli4i4YsiDzPXOze9rwadYC104/H66EfS8PJW6WfKkpXw9WuHmwMDg3+8ztP22ejHbcOvWwszXBOqWnBacNDj2zC/IT0p6/2/wBgSftsisP8A5M9B7XhZ29yqpJK6rjb99kpbZgv/AGF42EedkftsNP22bcPacJRSeDbpK759eHv+B3aNrw5p5cLLbt17/r3hWHK/afkMj9tnoT2vB1rBV6Vfhr46lO2Y8MSScMPJpw+vADM03xmzmV+0y7ExE0lSVfXz8ysIjlftsKDXCbJADnre2xT9tnQBCWHfGTZHcLq/ItAFW4XV+Q3C6vyLQBVuF1+A3C6/AtAFW4XX4DcLq/ItAVVuF1fkNwur8i0AVbhdX5DcLr8C0BFW4XX4DcLr8C0AVbhdfgNwuvwLQBVuF1+A3C6/A9TD2TBez53ifeZZPLnjxTdLLx4alMdli0msWFtK09KdXQGTK8uXM66UFBqLjmdPiq+uiNktjSdPFha5amaSpvnQDDbg04umuDJ4ONKE1OEmpq6kuOvMux9kUIRlni26tKUXVrub7zr2OOv30PMaYpljzeI8Rybm2220nd8bvxOz2mcp55SuVJXS4JUlS7tDmFhKWJGDmoxclFz5JXWbw5l/amywwcXJCTksqeri3F+y8rav5gsZ8bGeI7k9S3YXe0YLb/3If+yNsNjw91ema67+C1v3mLYdNpwv/th/7Irj/F/Jx5bx4z4+6oULOh2co4yRxkV5nbO17rCdOm9EeAsRR4mn/U2Pei/dMOkldcjNdeL1cFp0v3qtMuwu08Obyt5ZdHz8GU7IllhNxpxT17jxNs1jDE9pP9WSX05x7z4nUzHsGJeDDwo0ZjTi+dNGFtTjHLkjJXfrKzOmX4e04azXCLTdpXwXS+JVWen/APxYfH2e6ivB2lwzerGWartWhtO0Yc6yQjCujuyOHjxTi8sXSaa43x11+tALVtzX+3h/ylePj569SMa9lUTxtrwpJqOFGLeiea6M6d8NQNj7RbdvDhxvhr5lONtOdVkiudpa8SjMuqGZdV5gaI7VSSyQdKtVqWenafssO+uX+hjzLqhmXVAXYO0OGb1U1Limiz01/lw/lMuZdRa6gWvHedzSS7louFE57XcXHdwV9EZ8y6oZl1QF+FtTikssXXC1rxsjLGuedRS7lwKsy6oZl1QFzx7d5ESe0r2EZ8y6rzGddUBf6Qq/AvgceMqSyIpzLqvMZl1XmBo9IX5cR6Svy0Z8y6oZl1QEpyt3VHDmZdUMy6oDoOZl1QzLqB0HM66oZl1QHQRc11QzrqgJA5nXVDOuqAWdOZl1XmMy6oDoOZl1QzLqgOg5mXVDMuqAWdOZl1R2M4rjT94HVFvhFvwQyPo/I17LtMIQS30oO22kk1fC/IlHHwZSxN5jSpuOqSealVvTyAxZH7L8iNlqxIdV8P8A8le8jlrS743rXToBwWTwJYeeO8twv1lFq2uissntcZSk3CCTqorRRS4Jc/eBQC1Y8PYj/MRnixfBJe8AsaVVbroW7A/v8H/7If8AsirBxYxdyUZdzbX6DBl68WnqpKvMJJJ8feRxLJZjxZShq47TJJXo9W6b/sWJxuvS5fXvKj18x5va23ShBqDrv5maWJrXpEmqbtWtU+HvV/Aw7dg4dxUtqbi+LXJ8upL8ajy8STlGTk7t8wk4JRnajyly8C70LBcH/wD1puvVVVb79eHz7qM0ZSw5OLedV1zL4mZPGu2Xx6UsdQwko3K078Cc9k3mzUuMdV79TzXtKk9E11s9zYHSrk0vNGOXla3fry+zMSsOujZreJ3lfa2wywJ+qrhiXJV
8UY8PF5VWnWzWsYorRnqdi9m7Fi7JOePNLETlmbxMrw0l6jjH962eWWYeFFxbbprw/wAs2jDgSivxxb4cH5osxMTDa9WDi+t3XD+/mXUAKt5hWrw3yvV1yv8Ar5nYzwqejUsujt1dLl5mzB2HEms0UmvFL64DF2GcIuTqlxpp19WBjji4WmbDba6Nq9PnZHeYWn3b4a+s+P1Zvwez8ScVKKVPhqlw/wAEvszF6JdbfDp+oHkSat1ouRw9XE2GcYuTSpd6/QYew4koZ0ll6tpc6A8pMHsLs7Eb5V1vTWvmiuexzTiqXrcOV6J1r4oDzFNrm/MZ31fmet9nYmmi1rmudfNHV2biPlH+Zd3zA8jO+r8xvJdX5nqrs/EcsiScsubiuF1xOrs7Ed0o6cfWX1zQHk531fmM76vzPSxNllGcYOrlXPTV1xLPs+fd3d/fYHk531fmM76vzPTw9inJtJL1W09VxXEkuz8TovNd/wAgPKzvq/MZ31fmev8AZmJ/x7/WWnH5MhHYMRwU6WVq7ckuvyA8yM5WtXx6nN4+r8z1vs7E5qK4/vLlfyK8DZJYibilp1aQHnbx9X5nN5L2n5nsPsvFWrUa15p/ocXZeL7Mf5o/MDyd5L2n5nN5L2n5nqPYZqcYercrrXTTqxi7BOEXKSjS46rS6+YHl55dX5jO+r8z1cHs7ExIqUVGndXJL9fed+zcS1Go23S9ZcQPJzvq/MZ31fmemtim5uKytqnxVancTYJxi5NRpK7TT6fNAeXnfV+Z3eS6vzPVfZ2Ikm1FXVesudfM5PYJxi5NRpK+Kvly94Hl7yXV+ZzO+r8z08LYZzipRSpulbSt6fMYmwzjVpes0lqnq1oB5md9X5jO+r8z1/s3E6R/mXd8yGHsGJJNpLRtO2lquIHlNnD2F2diNXUa65kVR2aTm4KrTp6pLjXPvYHmA9p9l4qVtJKr4pkfs3Eukk+POuF9fADxwerDYpyckkvVdPVHcXYZwVtKrrRp82v6AeSD2MTs7EjxUf5l3/JkcbYZwTcsunRp/oB5IPXwtgnOKcaprnp1/wDyd+zsS6qObpmXf8mB45fCTWHJrRq2vI9KPZmI+Ci64+sutcTDjr1Jfwv9AMP2jjfmP4D7RxvzH8DKANX2ljfmP4EJ7biy4zbKABaton7R30vE9p/ApAFq2ia4SL49q46qsRqvAxgYPQxu29pxIqM8aTS4Wl8jM9txH++/gUAmD2joBRZg4TnJRXFujssGm1zTr3ktlw8zpcW0kWPCoDsuzMRfu/FHJ7DOMW2mlz1X1zLIbNKS0Oz2SaTbTpAY8ne/rkaY9mSavTzIbstwsCUvwvh3gQxOzZRi5OqXeZ3DxNs9jmo2+Hj4FG7QFG7RJJ1WZ10vQt3aG7QFG7Q3aL92N2BzC2eUotJvLa0vTyLPsufd5ncPZ5SVx6rmWehT63XfwAy4+yPDdS+BWoVqmzVjbO4upcSvdoCqUbq23XC2R3SL92hu0BRu0aJ7JJt5pW+9tvg3/RnN2i97HLXu7+4Cp9my7n7/AOxm3aN/oM/pmfdgUbtEo4WZpX3It3Z2GDbSXFgdw9ik7inxdNXxo5idnSim21p0ZdDYpu64+PfQnsc1Ft8Fx18AMO7Q3aL92hu0BU46VbpcFegw8P1llbTvR8C3do7HCtpLmBL7OnK23etW3b/Q5Ps6UVmdV4/2L47FiPh1q7+vpCexTSd8lrqBha0q3S5XocjGrptXxrmXbtDdoCjdo0x2Oc1HW1Tq3wqiO7RdDZJNKuadagQfZc1q68/7GeeDTau6dG30Kd1/Xx+RTLBptPigM7gjm7Rfu0N2gKpRurbdcO4lg4TzLK2m+d0T3ZKGDbSXECMdib4tLxfj3dzO4uwSjFydUu8uWxS+n4/IYuxyirk/iBh3aG7Ro3aG7QGfdor2tvLK228r4u+Rr3aMu2qoy/hf6AeIAAAAAAAAAAAAA9pHTiOgW4Mqss3hDZ55ZKVXTTOylbb
6uwJrGfL9S7EhipNyTy9btFeBu6e8zXpWUsxsSDi0p4jfST0Ao3ncdjjNcG14MrLcDaHC6Sd1x7gOb5/TObzuLcTa3KLjS18frkZwJ7wbwgAJ7zuG87iAAsWM1w0943rO4WO4qkk9U9e4ue3yfGMfiBQ8Vvjqc3hLGxnNptJV08SoCe8G8IACe87ju+f0ys1LbpL92Px6V1Aq38ur8yO8ND26XSPx+ZkAnvDqxCs7GVNPo7Anvn9Mb58OXiT9Klrw1d/p8juJtkpRytRqq0QFW87hvO4gAJ7zuG8IHU6YE9++vxLMXa5SUU/3Y5VWmnf1OvbZdEtb+vIi9rllcajTVcAK953DeEABPedx1YzX+Ss0Ye1yilSWia58wK98/pnHily26Sd0vq/mUYk80m3zdgd3ncN4QAE953BYrIEsOeWSa5AS376vzDxm+P6lvpslGkkuOqu9f8jG2yU4uLSSfiBTvO4bzuIACe87jLtjuMv4X+heZ9q/DL+F/oB4wAAAAAAAAAAAAD2jpw6BLDrMs3C9fA5OrdXV6X05FmAouSzaRtW1q65nZxVuuF6eAFcKtXdXr4cy7EjhZW4zk3yTRZhQwmvWbT7lf9DuJhYKi8s23yTjVgYi3Z8l/eN13CkXYccOvWu+7/AEZ7rK8rldaX10/uZjbiQwaeVtvl9V4megKgW0KAqBbSFIBg5K9du7XlzL62f2pfH5EcKOG08zp2vLnyLFDB11f17gM+07u/u22q59Sk1Y8cNNbttrvKqAqFltHaApNP3PWX0v8FdGlQwdbb7vqgK5rAp05Xy+qMpucMGuMvr3GagKiUKzLNwvUnRKCVrNw5gWQWBzcuPfwv5Fc91TyuV0qv3f3NEIYH70pe5d/gRx4YKXqOTff4+HiBiBdQoCmzsatXwvXwLaOxStXw5gWQWBrcpcdOlaf3I4iwdcrlw0vqXRw8HW5Pjpx4eRzEw8GnllK6005+QGGxZdQoCk0YSwqWZyunfjen9SNF+HDCpZm1o7pc+XICD3F8Z118/hwM+JWZ5eF6G5Yez+1Ly/sZppW64cgKLFl1CgKSWHWZZuHMsolhqNrNouYHVuaesm/LrX9Dk91keVyz3onwqy5Qwa/FK/Dy5EZwwsryt5r0Xd5AYxZdQoCmyGP+CX8L/Q00Z9q/DL+F/oB4wAAAAAAAAAAAAD2wABLDrMk3SvV9wm6bp2r0fUngYblJRStt0jsoU2mtVoBHCyu80mtNGW4mFBRbji5muVVzK8q6GjF2fDSuOIn3VqBjtk8FRb9eTR2l0LobJKUVJR0feutAQxIYajccRt1wrw/uUWap7JJRzNKvFFNICuxb6lmVCl0ArtiyzKugpdAO4Si080mna8uZY4YSaW8k11/tRGGA5K0udFnoU/ZXmgM+NlTqEm1195CzRi4Dg6kqZXS6AV2L7yykKXQCvMzTHDwr1xWlpyfvKqXQvjsc22lFWuVoCO7wtfvH8/gZrZrexzX7q80UUugFdksOnJJulerJZV0OxhbSS1YF0MLCrXFad/C/Aji4eGotxxG5dPpHY7JJ8Iri1xXIYmySirlFJeK60BltjMWZV0GVdAK77zsXqrdK9SeVdDqhbSS1YFscPC1vFfH4af3OYmHhJPLitutFT1ZJbFOryquHFHJ7JKKtxVeKAy5n1GZ9SzKugyroBXm7zRhQw2lmxGtHfc+RXlRbDZJSqo8Va1XIDscPB1vFkvcZ8SlJqLtXozU9hmnTir8V9ciiUKdNAVX3i2WZV0FLoBXbJYdOSUnS5sllXQlDDzOktQJxw8LnitceXfp8CGNHDS9Wbk/wBCa2WT5LzXf8mJ7JKKzOKrxXWgM1vqLfUsyroMq6AV2yvH/BL+F/oaMq6FG1fgl/C/0A8YAAAAAAAAAAAAB7YAAnhyrW6LIY2WWZPXXj3leFKpKSV00zjTbt8wNa26l+GHkcxNszJpxgr6Ip2fEcHeVS7n4p/0J4mNmTW6gu9LUCvMiUc
drRSaXiynKyzBm4O8qfiBKWO3xk372RzIsxtoc1WSK0rTxT/oZ8r6AWZkMyK8rGV9AJ50MyIZX0GV9ALY41cJNeB30h+2/NnMLFcFWVPVPXuLvTH+XD66AUyxb4tvx1OZkd2jEeI08qVKtCvK+gEs6O5kV5X0GVgWZkS3745nfiynKzVHaqbe7jqBW9ob4yfmyGdGl7a/y4mPKwJ50dU11K8rJYdxknV0+AFm+ftPzZyWO3xk34tsnHaWrqC+WtnZ7Q5Rcci1rXnpXyApzoZ0QysZX0AnnR1TXUrys7FNNOuDAuW0y9uXmyLx29HJv3sujtbV/dx42QxNolJNZUrVae7694FWdHc6K8r6DK+gFmZEljtaKT82U5WaMLaMqXqJ0mte8CL2hvjN+bIud6tl8tsv/bivDwM2JcpN1Vu6A7mQzIhlYysCeZHViVqnTK8r6EsO4yTq6AnvnwzPzDxnVZnXS2WR2lr9xc/6/MY20uUXHJFX048QKc6GZEMrGVgTzop2l+pL+F/oTysr2hepL+F/oB44AAAAAAAAAAAAD2wcOgW7PiKMlJpOndPmdlNNt6K3dEMKVSTccyTunwfccnq26q3dLgu4DYtshX7OL7yOJtEGmlCK70Rw8aCSTwU6S1vi+85iY0XFpYKi3wafACvMupbg46hdxjK64man0LsPFqNPDT72u8C7E2xSi1kir5mfMizE2hOLSwoq+a4rh8jPlfQCzMhmRXlYpgWZkMyK8rFPoBpwsdRTTinqnqWvb1f7OFVVcjLhypNOCeqdvuLVtEaS3MdPiBzHx1N2ko9yK8yGNPM7UVFdEiun0AszIZl1K6fQZWBZmXU0elx9iPL9P7mPKzVHaIq/uYv68ALHtsfy4GXMi/0qP5EPr3GTKwLMy6ksPESaejopyvoSho02rp8ANUdrir9SL1v43RzE2qMk1kgu9ceXyOQ2mKVbmL1b1rrw4EMXHzJpQjG+iXdzruAhmQzIrysZWBZmR2M0mnxoqysZX0A2w22K/cjxs5La4tNZI6/DQreOvyo8b5dKrh7yMsZNNbuK70tfrQCOZDMivKxlYFmZF+HtUYpepF0nx7zJlZow8ZRSvCTpNa1rr4AXPbY/lw+r+ZnniJtvhb4EntCv9lHhw8+7v+BRPWTajVvguC7gJ5kMy6leV9BlYFmZEsPESaej7inKyWHpJNq65Aa47ZFKt3F+JHF2pSVKMV5EVtCUa3Ub11aT/ocxcdSi0sKMX1VXx8PqwK8yGZFeV9BlfQCzMijan6kv4X+hPK+hXjr1Jfwv9APHAAAAAAAAAAAAAe0dAAu2eajK2lJdH4E8bEUpNpKKfJcinCklJNrMugxWnJuKyp8F0A2ekYXPDXnRHExsNppYdPk8z0M2BKMZXKOZdCzExMNxqOG0+TsCNo0YO0wjGnhxk+tmGi+GJFJepb5t+OgF+JtOG4tLDjF9UZrLJ40HClhJS9ozUBbYsqoAW2LKqAGrCxYxWsU9UXy2zDa/YxXv/sY8GcUvWhm1Rf6RhfkgQxsWMncYqK6X3ldndpnGUrhDIq4FNAW2LKqFAW2aVtMF/tx+HTwMNGlYuHr918eGj+vcBb6VC/2ca6Uu/nXh5Gay54+F+V+hkoC2yUJpNN6roUUSjo02rV8ANkNpgrvDT+vA5PaIOLSw0m+emnwKd7HX7tavT4f38yU8aDg0sOpPg+gFdiyqhQFtnYySafHXgU0djo02rV8ANkdpgn+zi1d613d3d8Q9ohla3cbfB9CEcbDV/dXrfHw0OTxsNxaWFTriBCzllVCgLbNGHtEElcE6T6amKjRh4sElmw7aTvv14gaHteHT+6j8PkZpyTbeit8OhZ6Rh0/ufB3wM03bbqrfBcu4CdoWiqhQFtonhzSkm6a6GeiWG0pJtWugG2O04aX7NPy6+BDEx4NSSgk3VPoV76CWmGvfr17u9eR3FxoOLUcPK70fTuArsWVUKAtsp2r8Ev4X+hKiWVbrGtc
IP9GB4IAAAAAAAAAAAAD2wcOgXbPFN03Xf7ieNBRk0nmS5lOEm3STb6IlNuLakmmuQFmHhuT9VWacfDiotrDxIvvVJGbDeJFZoqST5oliY+LlqWfLztP4sCo0+jw/MXw+Zj3hPDTl+GLfgBfLBik2sRNrl14cPP4FBKWFNK3Fpd5VvAJghnGcCYIZxnA0YWEpLWSWq4lz2WFP76Pl/cyRTktIt60TeDP2GB3Fgo1Urtf1Kzk7i6kmn3kd4BMEN4N4BM1Q2WL44kVounzMW8LVhT9hgXzwIJv7xOui7vEzE9zP2X8CneATJQjbSbq+ZVvDsZW6S1YGuGzxfHES1rl148RibPBRbWKm+iKI4U3wgzk4SircWkBwEM4zgTOxVtK67yvOdUr0S1A2x2WLX7WPHhp8yt4EcraxF4e7xKo4U3ooPjXvEsKaVuDSAiCG8GcCZow8CLSbxErT6afEyZyyMJOqi3eqA0+iw/Nj8P6sz4iqTSdpPiN1P2JFcpU6apoCQIbwbwCZLDinJJul1Kt4djK3SVsDVuI1riK/8nMbBhFWp5u5V8ypYU/Zf1/gTw5xVuLS6gRBDeDOBMo2r8Mv4X+hZnKtpdwl/C/0A8YAAAAAAAAAAAAB7R0WLA7GTTtOmJSbdt2zl9wsCyGPOPCTR2W0zccrk3HoVWLA4ThNxdxdPuIoWBOeNKSqUm0QFiwAFi+4DgO2LAnDFlHg6JekT9p9CqxYHZTbq23WmpwWL7gAFiwCi3wNOHLFkri29a463xM1nVNrg68GwLMKWJKXqtt8frzJPYsT2enNcyhSrhoS30vafmwLPRMT2fihDZ5/iS4Pja0d1+pXvZdX5s4sRrg2vewL8SWLhunJpvXiVTxpSVOTaIuV8dfM5YACxYAJ0LFgW+kT9piW0zaacm0+NlViwBw7YsAWRx5pJKTVFdiwLfScT25eZXJtu3q2csWAOHbFgcJQk07XE5YsCz0iftP6/wAiePOSpybXQrsWAAFgWYeDKSk4q1FW+4px4PdSlWlNX30TUuPfx4leO/Ul4P8AQJN3144ACgAAAAAAAAAA+he0XiTm4p5r05K2dltEWq3UV4FO07Fi4ONusZ5HV3yarRp/ApxllrLiZrvlVUxu+pmNq2pJprChpyrR8OPl8Tvpi/Kh7keZnfUZ31A9DDx4x/2oy/ibKb1Mud9Tu8fUDVdu6S8CJn3j6jO+oGgGfO+ozvqBoBn3j6jePqBoBn3j6lmzxliTjBOnKSSb4avmBYDR2h2ZibPFSliQknLL6rbfPu7ijZdmlipvPGKTSTlerfLRMtln1OPKcpscBXGE3PJ+9eWu+6Z6PbHYmLskITniQmpyyrI3o6vW0iNMQL9l7K2jGwZY0F6kdL8Cns/ZcTaMRYeG1mab1dLQXybTHAWdodn4+zOKxUlmuqafCr/Us2fsvExMPPGcc2VyUL1aXMnGzlNnxOVnD+zOCzszYMXapuOE1cY5m5SUUlwv4nO0tgxdlxFDFatxUllaaabaT+DKuIAslsGMsHffuaeOvDuK9k2fExpNR5atvgl3i+fTjO3kAR2rCnhTcJvVdHaL8fYMTDw885JNVcXebX3d5eMvL4nL8fKqB3ZNlxMW3F0lVt8FfAhi4M44u6TzSzKKrm3VV5mdTtNz9pA29qdh4+y4axJzhKLaTyt6N+K1WnEz7HsOJjfg8NXWvQqz34qBxYM3iLCX482Wn1NHaXZmLsyi5yTUua6lk2ab+lAJ7NsWLiQc4tJa1bpulenUyub7yC8GfO+pzO+oXGkGbO+o3j6k1ro0gzKcuTZKp5stPNdVzvpQ06LwVbRh4mFLLiRlCXGmq06nMRTi6b1GnVcTw8TK7pPxKp7NjRlGLi7lwVo4sDFbkqfqVmtpVbpXY1OrT6RreSH8vH6ok9q0/BDyPPlKSbTtNaNc7XI0bpV+3jxS9zq3x5W/5TXZOrrdts5tWJmw3olUGtFx0595OGxynNQwcSOI2rdOqWn
G+HEy7Zhzw3KE9HXW001o0yL1eYAAgAAAAAAAAAAPUxe1ZYmIsTFm8SSVevbVdDj7Qg01kgrXKL8zzAB6a2/D/Lh5MoePG+PwZjAGzfx6/Bjfx6/BmMAbN/Hr8GN/Hr8GYwBs38evwY38evwZjAGzfx6/Bjfx6/BmMAbN/Hr8GTwdsUJxmmri01adWjAAPY2ztp40cs8qV36sWtfpleB2m8OLjCSim7bypu601a5f1PLBbbfrPHjOMyN+HtijJSUtVrqmW7V2rPGpYmJKSWqTuk+tHlgjT18HtmcMN4cMRqD4pWV7J2k8GefDlUqaunwfE8wC+zKPX23tZ42VTm5KN5W071q78itdoLJWZp1Wif4elnmATz4lkv17PZnbUtlnKeFlblHK1OLaq0+TXQ52p21La8RYmLlTUVFKEWlSbfO+rPHAV6su1pPCWE5+qlXDWulleBtqhK1JpPjpfwPOAvv04/j7G/F2tTk5OVt9xPE27MtZyk3xtP3anmgstnwvt2vSwNuULWdpPikuPTQg9sWbPmqVp2r0a4MwAzidZuvd2/t+ePhxw5yjlTzNRhluWur68e7jwOdn9sLBupyj0qClr11aPDAyLffr0vT6msRSeZO7rn1L+0O2pbRlU5aR5KLWvxPGBZcmRM/b1tn7VeHCUYzaT5Vfc9eRl38ev6mMBWzfx6/A5vo9fgZAFlxr30eo30evwMgJi9q2LHj1/U7DalFqUZNNVTVpquFGIDDtXobRt8sWWbFxJTlVXJuTrpb8WQxNrzu5SbdVrZiAw7N3prtSzvNHRPW1XQlHb2m2pu5VfHWuFnngYdm/B21Qblo20/xJvjz8TR9r63lw+FfhfWzyAMTXq/aslNThJQklXqp6rvu7M+0bXvHKUpXJrp3aIxApoAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/2Q==",
+ "text/html": [
+ "\n",
+ " \n",
+ " "
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "execution_count": 1,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('eHI4wMjSGuU', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3de0a143-4cc7-4e8d-aa5b-c10052a01d1c",
+ "metadata": {},
+ "source": [
+ "# Learning Objectives \n",
+ "- Import and Prepare Data \n",
+ " - Import necessary Python libraries. \n",
+ " - Load and separate the 2021 training dataset into features and labels. \n",
+ "- Create and Train a Decision Tree Model\n",
+ " - Initialize and train a Decision Tree model using the 2021 training data. \n",
+ "- Visualize and Interpret the Decision Tree\n",
+ " - Generate and interpret visual representations of the trained Decision Tree model. \n",
+ "- Make Predictions and Evaluate Accuracy\n",
+ " - Use the trained model to make predictions on 2021 testing data.\n",
+ " - Compare predicted values with actual values to assess model accuracy. \n",
+ "- Calculate and Compare RMSE \n",
+ " - Calculate the Root Mean Square Error (RMSE) for the 2021 model. \n",
+ " - Compare the RMSE of the 2021 model with the 2020 model to evaluate improvements. "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "92d71251-8758-46aa-b8e3-d47df89220e8",
+ "metadata": {},
+ "source": [
+ "# Prerequisites \n",
+ "***Modules*** \n",
+ "\n",
+ "- Learning Module ***Introduction to Machine Learning: Decision Trees***\n",
+ "\n",
+ "***Data Sources***\n",
+ "\n",
+ "- [COVID cases data (California Health and Human Services Agency)](https://data.chhs.ca.gov/dataset/covid-19-time-series-metrics-by-county-and-state/resource/046cdd2b-31e5-4d34-9ed3-b48cdbc4be7a)\n",
+ "- [COVID vaccination data (Los Angeles Times)](https://github.com/datadesk/california-coronavirus-data)\n",
+ "- [Unemployment data (California Employment Development Dept.)](https://data.edd.ca.gov/Labor-Force-and-Unemployment-Rates/Local-Area-Unemployment-StatisticsdecisionLAUS-/e6gw-gvii)\n",
+ "- [Election data (Harvard University)](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ)\n",
+ "\n",
+ "***Libraries/Packages***\n",
+ "\n",
+ "- Pandas\n",
+ "- NumPy\n",
+ "- Matplotlib\n",
+ "- Seaborn\n",
+ "- Scikit-learn"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "564c5c91-57d0-442a-8ece-8719aebde237",
+ "metadata": {},
+ "source": [
+ "# Get Started\n",
+ "Copy-paste the required lines of code from ***Introduction to Machine Learning: Decision Trees*** for each section below. "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b8848232-eb1d-430b-b854-2077de1862fd",
+ "metadata": {},
+ "source": [
+ "## **1) Repeat Step 1 (Importing Necessary Packages)**"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "fa4b42e6-ac15-4f05-be80-43593f07b5d2",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Data Wrangling Imports\n",
+ "import pandas as pd\n",
+ "import numpy as np\n",
+ "\n",
+ "# Machine Learning Models Imports\n",
+ "from sklearn import tree\n",
+ "from sklearn.tree import DecisionTreeRegressor\n",
+ "\n",
+ "# Model Evaluation Imports and Visualization\n",
+ "from matplotlib import pyplot as plt\n",
+ "!pip install graphviz\n",
+ "import graphviz\n",
+ "\n",
+ "# Quantitative metrics of Model performance\n",
+ "from sklearn.metrics import mean_squared_error"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6d623983-10a3-4f42-bebd-4045e2c29e98",
+ "metadata": {},
+ "source": [
+ "## **2) Repeat Step 2A (Loading 2021 Training Data)**\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021 every time you see it, including in the links!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "de7771a0-9d83-4d94-8976-9595b83de3a2",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from Step 2A that will load our Summer 2021 training data\n",
+ "S2021_training= pd.read_csv(\"S2021_training.csv\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "497c33d3-6611-476b-99e1-1e9309105962",
+ "metadata": {},
+ "source": [
+ "## **3) Repeat Step 3A (Separate Training Data into LABEL and FEATURES)**\n",
+ "SKIP:\n",
+ "- Steps 3B and 3C, since these steps were only there to show you what the labels look like once we separated them from our main training data.\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e12b600c-c282-4d2f-99b4-d3a6f78ab4fe",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from Step 3A that separates the FEATURES & LABEL from the training data \n",
+ "S2021_training_labels = S2021_training[\"cases_per_100000\"]\n",
+ "S2021_training_features = S2021_training.drop(columns=[\"county\",\"cases_per_100000\"])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b44cea28-b4c6-4aa0-b82a-9f4c6edecf3d",
+ "metadata": {},
+ "source": [
+ "## **4) Repeat steps 4A and 4B (Create your Decision Tree and Train it!)**\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "b2ab294d-1f15-4adb-9e96-f3122fd146bb",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from Step 4A that will allow us to create our NEW Decision Tree\n",
+ "dtr_summer2021 = DecisionTreeRegressor(random_state = 1, max_depth= 3)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "85b2db25-ee63-4ab7-8452-fcb52a013160",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from step 4B that will train our NEW Decision Tree\n",
+ "dtr_summer2021 = dtr_summer2021.fit(S2021_training_features,S2021_training_labels)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "070c4b35-1119-4e9b-ae6d-bd00c1734f5b",
+ "metadata": {},
+ "source": [
+ "## **5) Repeat step 5 (Visualize your 2021 Decision Tree)**\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f9e07c94-5d4d-45b5-96d2-625fe9693c82",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from step 5 that will let you see the NEW 2021 Decision Tree\n",
+ "dtr_summer2021_dot = tree.export_graphviz(dtr_summer2021, out_file=None, \n",
+ " feature_names=S2021_training_features.columns, \n",
+ " filled=False, rounded=True, impurity=False)\n",
+ "\n",
+ "# Draw graph\n",
+ "dtr_graph = graphviz.Source(dtr_summer2021_dot, format=\"png\") \n",
+ "dtr_graph"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e23a2bbb-591c-47f6-ad24-c413ca634cba",
+ "metadata": {},
+ "source": [
+ "## **6) Repeat step 6A, 6B, 6C.1 (Load Testing Data and make your Predictions)**\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d4ce9399-0aa8-48d8-a540-3e8cbcdf7af6",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from step 6A to load and see your Summer 2021 testing data\n",
+ "S2021_testing_features = pd.read_csv(\"S2021_test_features.csv\")\n",
+ "S2021_testing_features"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "541994e6-e8a8-4c89-b8a7-7baae502bb8a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from step 6B to drop the county out of the testing data and make your predictions!\n",
+ "S2021_features_test_nocounty = S2021_testing_features.drop(columns=[\"county\"])\n",
+ "S2021_labels_pred = dtr_summer2021.predict(S2021_features_test_nocounty)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "66fceb09-7728-48ef-912f-4d6161d96f61",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from step 6C.1 to look at the labels that our new model has predicted\n",
+ "S2021_labels_preds_df = pd.DataFrame(S2021_labels_pred, columns=[\"Predicted\"])\n",
+ "S2021_labels_preds_df = pd.concat([S2021_testing_features[\"county\"].reset_index(drop=True),S2021_labels_preds_df.reset_index(drop=True)],axis=1)\n",
+ "S2021_labels_preds_df.round(3)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "188fd7b3-ad8c-4aae-a835-7dbb2be8bfe8",
+ "metadata": {},
+ "source": [
+ "## **7) Repeat step 7A, 7B (Check the Accuracy of the Predictions of the new Model Created)**\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "3e355d76-dabe-489a-8c81-0d839235d183",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from Step 7A to load our ACTUAL 2021 labels and drop the county since it's not part of the labels per se\n",
+ "S2021_testing_labels = pd.read_csv(\"S2021_test_labels.csv\")\n",
+ "S2021_testing_labels = S2021_testing_labels.drop(columns=[\"county\"])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ffd2bed9-c1bd-4c30-b869-cd8cf3c18870",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from Step 7B to make a bar graph and inspect the Accuracy of your new 2021 Decision Tree model\n",
+ "pred_vs_test_2021 = pd.concat([S2021_testing_labels.reset_index(drop=True),S2021_labels_preds_df.reset_index(drop=True)],axis=1)\n",
+ "pred_vs_test_2021 = pred_vs_test_2021.loc[:,[\"county\", \"cases_per_100000\",\"Predicted\"]]\n",
+ "pred_vs_test_plot = pred_vs_test_2021.plot.barh(color={\"Predicted\": \"hotpink\", \"cases_per_100000\": \"teal\"},x=\"county\",figsize=(15,15), yticks=np.arange(0,4000,500))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4f751ca0-3db5-4f55-8acd-74afe0be5d9b",
+ "metadata": {},
+ "source": [
+ "### **Walkthrough Solution:**\n",
+ "If you feel stuck on this exercise, feel free to follow the video walkthrough below by **Florentine**."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "951a16d6-d07a-4a83-8e6c-61f256ace1d7",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('eHI4wMjSGuU', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "45d7f1cf-2f56-4a9f-a3e2-5b1923c47066",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "## **8) Extra: (Calculate RMSE and create Aggregate errors histograms)** \n",
+ "\n",
+ "Compare the performance of the model you just created in this practice session with that of the old model by calculating the RMSE for both and plotting a histogram of the aggregate errors for both models."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "623e4e38-9baf-416b-92d1-ffa1312fb20a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Creating residual for our new 2021 model\n",
+ "pred_vs_test_2021['residual'] = pred_vs_test_2021['cases_per_100000'] - pred_vs_test_2021['Predicted']\n",
+ "\n",
+ "# Inspect the new model's results, now including the residual column\n",
+ "New_model = pred_vs_test_2021\n",
+ "New_model"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "745a6b98-f86f-4e3f-908e-b5565871c6c2",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Load the old (2020) model's predictions on the 2021 test data\n",
+ "Old_model = pd.read_csv(\"Model2020pred_vs_test_2021.csv\")\n",
+ "Old_model"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "da31b101-37f8-400b-ac4e-631d6b6af428",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Plot histogram of error aggregates for both the old and new model\n",
+ "plt.title('Cases per 100k Prediction Errors')\n",
+ "plt.hist(New_model['residual'], alpha=0.5, label='Model 2021')\n",
+ "plt.hist(Old_model['residual'], alpha=0.5, label='Model 2020')\n",
+ "plt.legend(loc='upper right')\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "82bf9a78-4062-4640-8561-76c623ede7bf",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# This calculates the RMSE for Model 2020 (OLD MODEL)\n",
+ "print(f\"RMSE for Model 2020: {np.sqrt(mean_squared_error(Old_model['cases_per_100000'], Old_model['Predicted']))}\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "abeda9a5-5fa0-4544-8586-14ec531448ad",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# This calculates the RMSE for Model 2021 (NEW MODEL)\n",
+ "print(f\"RMSE for Model 2021: {np.sqrt(mean_squared_error(New_model['cases_per_100000'], New_model['Predicted']))}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ce454920-68ea-4a48-85f9-7bb57f021b3f",
+ "metadata": {},
+ "source": [
+ "# Conclusion\n",
+ "In this practice, you successfully: \n",
+ "\n",
+ "1. **Imported and Prepared Data:** Loaded the 2021 training dataset and separated it into features and labels. \n",
+ "2. **Created and Trained a Decision Tree Model:** Initialized and trained a decision tree model using the 2021 data.\n",
+ "3. **Visualized and Interpreted the Decision Tree:** Generated and interpreted a visual representation of the trained model. \n",
+ "4. **Made Predictions and Evaluated Accuracy:** Predicted outcomes using the 2021 testing data and assessed model accuracy. \n",
+ "5. **Calculated and Compared RMSE:** Calculated the RMSE for the 2021 model and compared it with the 2020 model.\n",
+ "\n",
+ "By completing this module, you have reinforced your understanding of decision trees and gained practical experience in adapting machine learning models to new data. This practice not only enhances your technical skills but also prepares you for real-world applications where models need to be continuously updated and evaluated. Keep exploring and refining your models to achieve even better predictions and insights! "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c71ade49-289b-4211-b75c-1d98c0af452f",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "# Clean up\n",
+ "\n",
+ "To keep your workspace organized, remember to: \n",
+ "\n",
+ "1. Save your work.\n",
+ "2. Close any notebooks and active sessions to avoid extra charges."
+ ]
+ }
+ ],
+ "metadata": {
+ "environment": {
+ "kernel": "python3",
+ "name": "common-cpu.m108",
+ "type": "gcloud",
+ "uri": "gcr.io/deeplearning-platform-release/base-cpu:m108"
+ },
+ "kernelspec": {
+ "display_name": "conda_python3",
+ "language": "python",
+ "name": "conda_python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.14"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/AWS/README.md b/AWS/README.md
new file mode 100644
index 0000000..2c30ee4
--- /dev/null
+++ b/AWS/README.md
@@ -0,0 +1,100 @@
+## Introduction to Machine Learning for COVID Predictions for AWS
+---------------------------------
+
+## Contents
+
++ [Overview](#overview)
++ [Background](#background)
++ [Before Starting](#before-starting)
++ [Getting Started](#getting-started)
++ [Software Requirements](#software-requirements)
++ [Architecture Design](#architecture-design)
++ [Data](#data)
++ [Funding](#funding)
+
+## **Overview**
+
+This module teaches you how to create a simple Decision Tree using a structured dataset. In addition to the overview given in this README, you will find four Jupyter notebooks; the second one is optional.
+- **1- Intro to Machine Learning: Decision Trees**: This notebook provides a basic introduction to Machine Learning concepts, steps for creating and understanding a Decision Tree model, making predictions with it, and intuitively evaluating its performance.
+
+- **2- (Optional) Quant. Comparison of 2020 DT Model Performance for (2020 vs 2021) Data**: This notebook is optional, for students who would like to know a bit more about how to evaluate model performance quantitatively, and offers an introduction to why machine learning models require retraining from time to time.
+
+- **3- Practice**: This notebook provides a way to practice and test what you have learned from the first notebook. It includes basic instructions outlining every step discussed in the first notebook. Students are free to either copy and modify the code from the first notebook or they can choose to write it themselves.
+
+- **4- Practice - Answer Key**: This notebook provides the answers and explanation to the previous Practice exercise notebook. Check this notebook only after you have tried to complete the previous exercise yourself.
+
+This module will cost you about $1.00 to run, assuming you tear down all resources upon completion.
+
+
+## Background
+This module is geared towards beginners and does not require prior knowledge of a specific scientific discipline. The module is divided into four Jupyter notebooks, as outlined at the beginning of this document. In addition to the notebooks, there are videos that briefly explain basic machine learning concepts and what the code does in each step of the notebook. Below is an outline of the videos contained in each notebook, with their respective links. These videos are already embedded in the notebooks.
+
+### 1- Introduction To Machine Learning: Decision Trees (10 video clips)
+
+- [Introduction Video by Lorena Benitez](https://youtu.be/e3tGQykFC5M)
+- [Objectives of Exercise](https://youtu.be/_kAjJ8rJwfU)
+- [Step 1: Importing necessary packages into Google Colab](https://youtu.be/jPIQbpdTkbM)
+- [Step 2: Loading training data and making sure it looks correct](https://youtu.be/z9dcLYg65uk)
+- [Step 3: Separate the training dataset into features and labels](https://youtu.be/qh8C0QRECWU)
+- [Step 4: Create a decision tree object and train it](https://youtu.be/M6gY_JywOys)
+- [Step 5: Visualize our trained decision tree](https://youtu.be/cFk6vmfU48w)
+- [Step 6: Make predictions using testing data with our trained decision tree](https://youtu.be/LtD93dB5JzU)
+- [Step 7: Let's see how our decision tree model performed](https://youtu.be/0VK4sLz2wrc)
+- [Step 8: Let's try using our summer 2020 tree model to predict 2021 data](https://youtu.be/2r3ZpwM6xDQ)
+
+### 2- (Optional) Quant. Comparison of 2020 DT Model Performance for (2020 vs 2021) Data
+
+### 3- Practice Exercise (1 video clip)
+- [Walkthrough Solution](https://youtu.be/eHI4wMjSGuU)
+### 4- Practice Exercise - Answer Key (1 video clip)
+- [Walkthrough Solution](https://youtu.be/eHI4wMjSGuU)
+
+## Before Starting
+
+Included is a tutorial in the form of Jupyter notebooks. The main purpose of the tutorial is to help beginners without much coding experience familiarize themselves with fundamental concepts in machine learning using health data (a COVID dataset). It is also meant to be extended to other kinds of structured data. The tutorial walks step by step through the process of creating a Decision Tree and interpreting it. This module intends to provide an intuitive understanding of how machine learning model performance is evaluated. To get to this module from the AWS Cloud, you will need access to an AWS account; the module runs within Amazon SageMaker. For more technical information about AWS, please click [this link.](https://github.com/STRIDES/NIHCloudLabAWS)
+
+
+## **Getting Started**
+
+**1)** Follow the steps highlighted [here](https://github.com/NIGMS/NIGMS-Sandbox/blob/main/docs/HowToCreateAWSSagemakerNotebooks.md) to create a new notebook instance in Amazon SageMaker. Follow the steps, taking particular care to enable idle shutdown as highlighted. For this module, in [step 4](https://github.com/NIGMS/NIGMS-Sandbox/blob/main/docs/HowToCreateAWSSagemakerNotebooks.md) in the "Notebook instance type" tab, select ml.m5.xlarge from the dropdown box. Select the conda_python3 kernel in [step 8](https://github.com/NIGMS/NIGMS-Sandbox/blob/main/docs/HowToCreateAWSSagemakerNotebooks.md).
+
+**2)** You will need to download the tutorial files from GitHub. The easiest way to do this is to clone the repository from NIGMS into your Amazon SageMaker notebook. To clone this repository, use the Git symbol on the left menu and then insert the link `git clone https://github.com/NIGMS/Introduction-to-Data-Science-for-Biology.git` as illustrated in [step 7](https://github.com/NIGMS/NIGMS-Sandbox/blob/main/docs/HowToCreateAWSSagemakerNotebooks.md). Please make sure you only enter the link for the repository that you want to clone. There are other bioinformatics-related learning modules available in the [NIGMS Repository](https://github.com/NIGMS). This will download our tutorial files into a folder called `Introduction-to-Data-Science-for-Biology`.
+
+**IMPORTANT NOTE**
+
+After you are done with the module, make sure to close the tab that appeared when you clicked **OPEN JUPYTERLAB**, then check the box next to the name of the notebook you created in [step 3](https://github.com/NIGMS/NIGMS-Sandbox/blob/AWS%26GCP/docs/HowToCreateAWSSagemakerNotebooks.md#:~:text=Click%20Create%20notebook%20instance%3A). Then click on **STOP** at the top of the Workbench menu. Wait and make sure that the icon next to your notebook is grayed out.
+
+## **Software Requirements**
+
+Software requirements are satisfied by using a pre-made Amazon SageMaker notebook environment. The required packages are described in step 1 of the notebook **"Intro to Machine Learning Decision Trees"**.
+
+
+## **Architecture Design**
+
+Submodule 1 and Submodule 3 download CSV files stored in an Amazon S3 bucket to the SageMaker notebook and then output additional CSV files, which are used only if students choose to work on the optional Submodule 2. Below is a diagram that illustrates our workflow:
+
+![Architecture-diagram.PNG](images/workflow.jpg)
+
+## **Data**
+All data in this module was originally sourced from the following sites:
+
+- [COVID cases data (California Health and Human Services Agency)](https://data.chhs.ca.gov/dataset/covid-19-time-series-metrics-by-county-and-state/resource/046cdd2b-31e5-4d34-9ed3-b48cdbc4be7a)
+- [COVID vaccination data (Los Angeles Times)](https://github.com/datadesk/california-coronavirus-data)
+- [Unemployment data (California Employment Development Dept.)](https://data.edd.ca.gov/Labor-Force-and-Unemployment-Rates/Local-Area-Unemployment-StatisticsdecisionLAUS-/e6gw-gvii)
+- [Election data (Harvard University)](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ)
+
+We then selected only certain variables of interest, cleaned them, and created a composite dataset for the years 2020 and 2021 from the sources listed above. **We manipulated the variable named "Unemployment_rate" by using the 2020 rates in both the 2020 and 2021 Datasets**. We then separated these datasets into training, validation, and testing sets for each of these years to streamline the tutorials. Finally, we stored them in our group's [SFSU GitHub repository](https://github.com/MarcMachineLearning/Introduction-to-Machine-Learning/tree/main/Datasets).
+
+## **Funding**
+
+- SFSU/UCSF M.S. Bridges to the Doctorate Program: cloud-based learning modules supplement (T32GM142515)
+- Demystifying Machine Learning and Best Data Practices Workshop Series for Underrepresented STEM Undergraduate and MS Researchers bound for PhD Training Programs (T34-GM008574)
+- The creation of this training module was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number 3T32GM142515-01S1
+
+## **License for Data**
+
+Text and materials are licensed under a Creative Commons CC-BY-NC-SA license. The license allows you to copy, remix and redistribute any of our publicly available materials, under the condition that you attribute the work (details in the license) and do not make profits from it. More information is available [here](https://tilburgsciencehub.com/about/#license).
+
+![Creative commons license](https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png)
+
+This work is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-nc-sa/4.0/)
diff --git a/images/COVID-Decision-Tree.PNG b/AWS/images/COVID-Decision-Tree.PNG
similarity index 100%
rename from images/COVID-Decision-Tree.PNG
rename to AWS/images/COVID-Decision-Tree.PNG
diff --git a/images/Clone-a-Repository.png b/AWS/images/Clone-a-Repository.png
similarity index 100%
rename from images/Clone-a-Repository.png
rename to AWS/images/Clone-a-Repository.png
diff --git a/images/Features-for-Prediction.jpg b/AWS/images/Features-for-Prediction.jpg
similarity index 100%
rename from images/Features-for-Prediction.jpg
rename to AWS/images/Features-for-Prediction.jpg
diff --git a/images/GCP-New-notebook.png b/AWS/images/GCP-New-notebook.png
similarity index 100%
rename from images/GCP-New-notebook.png
rename to AWS/images/GCP-New-notebook.png
diff --git a/images/General-Decision-Tree.png b/AWS/images/General-Decision-Tree.png
similarity index 100%
rename from images/General-Decision-Tree.png
rename to AWS/images/General-Decision-Tree.png
diff --git a/images/Jupiterlab-terminal.png b/AWS/images/Jupiterlab-terminal.png
similarity index 100%
rename from images/Jupiterlab-terminal.png
rename to AWS/images/Jupiterlab-terminal.png
diff --git a/images/Label-and-Features.jpg b/AWS/images/Label-and-Features.jpg
similarity index 100%
rename from images/Label-and-Features.jpg
rename to AWS/images/Label-and-Features.jpg
diff --git a/images/Model-performance-comparison.jpg b/AWS/images/Model-performance-comparison.jpg
similarity index 100%
rename from images/Model-performance-comparison.jpg
rename to AWS/images/Model-performance-comparison.jpg
diff --git a/images/New-Notebook-config1.png b/AWS/images/New-Notebook-config1.png
similarity index 100%
rename from images/New-Notebook-config1.png
rename to AWS/images/New-Notebook-config1.png
diff --git a/images/New-Notebook-config2.png b/AWS/images/New-Notebook-config2.png
similarity index 100%
rename from images/New-Notebook-config2.png
rename to AWS/images/New-Notebook-config2.png
diff --git a/images/New-Notebook-config3.png b/AWS/images/New-Notebook-config3.png
similarity index 100%
rename from images/New-Notebook-config3.png
rename to AWS/images/New-Notebook-config3.png
diff --git a/images/Shutdown-machine.png b/AWS/images/Shutdown-machine.png
similarity index 100%
rename from images/Shutdown-machine.png
rename to AWS/images/Shutdown-machine.png
diff --git a/images/Summer-2020-model-performance-comparison.jpg b/AWS/images/Summer-2020-model-performance-comparison.jpg
similarity index 100%
rename from images/Summer-2020-model-performance-comparison.jpg
rename to AWS/images/Summer-2020-model-performance-comparison.jpg
diff --git a/images/Testing-Data.jpg b/AWS/images/Testing-Data.jpg
similarity index 100%
rename from images/Testing-Data.jpg
rename to AWS/images/Testing-Data.jpg
diff --git a/images/Training-Data.jpg b/AWS/images/Training-Data.jpg
similarity index 100%
rename from images/Training-Data.jpg
rename to AWS/images/Training-Data.jpg
diff --git a/AWS/images/workflow.jpg b/AWS/images/workflow.jpg
new file mode 100644
index 0000000..4cbff21
Binary files /dev/null and b/AWS/images/workflow.jpg differ
diff --git a/quiz_files/quiz1.json b/AWS/quiz_files/quiz1.json
similarity index 100%
rename from quiz_files/quiz1.json
rename to AWS/quiz_files/quiz1.json
diff --git a/quiz_files/quiz2.json b/AWS/quiz_files/quiz2.json
similarity index 100%
rename from quiz_files/quiz2.json
rename to AWS/quiz_files/quiz2.json
diff --git a/quiz_files/quiz3.json b/AWS/quiz_files/quiz3.json
similarity index 100%
rename from quiz_files/quiz3.json
rename to AWS/quiz_files/quiz3.json
diff --git a/Google Cloud/1- Intro to Machine Learning Decision Trees.ipynb b/Google Cloud/1- Intro to Machine Learning Decision Trees.ipynb
new file mode 100644
index 0000000..5461b28
--- /dev/null
+++ b/Google Cloud/1- Intro to Machine Learning Decision Trees.ipynb
@@ -0,0 +1,2163 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "85be35e8-c134-4ba4-ac3d-fe95bc106ff4",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "# **Introduction to Machine Learning: Decision Trees**\n",
+ "\n",
+ "# Overview\n",
+ "***Introduction Video***\n",
+ "\n",
+ "There are many machine learning techniques, but the aim of this notebook is to give you a basic understanding of one of the most fundamental: the Decision Tree. It is a good starting point because Decision Trees are the basis for more complex models such as Boosted Trees and Random Forests. Below is a general introduction video to machine learning by Lorena Benitez.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "bfb2d237-44af-43c8-8968-eaa340d7ecc3",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDBoYFhsaGRoeHRsfHx8gICAgICUlJR8lLicxMC0nLS01PVBCNThLOS0tRWFFS1NWW1xbMkFlbWRYbFBZW1cBERISGRYZLxsbMFc2NT1XV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXXVdXV1dXV11XV//AABEIAWgB4AMBIgACEQEDEQH/xAAbAAEAAgMBAQAAAAAAAAAAAAAAAQIDBAUGB//EAEoQAAIBAgMFAgsFBAkDBAMBAAABAgMRBBIhBTFBUZFh0QYTFBUiMlJxgaGxQlNyksEHFjPSIzRUYpOisuHwJEOCF3OD8WN04jX/xAAYAQEBAQEBAAAAAAAAAAAAAAAAAQIDBP/EACMRAQACAQUBAQADAQEAAAAAAAABAhEDEhMhMVFBImFxoTL/2gAMAwEAAhEDEQA/APn4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAOlDYdWTsnDq+4mpsOrF2cqfwb7jey3xndDmA6Pmar7UOr7iY7Eqt2zQ+LfcOO3w3Q5oOvPwcrxV702uxvuMPmar7UOr7hx2+G+v1zgdHzNV9qHV9xkWwKzV1Km/i+4bLfDfDlA6T2LV9qHV9xHmar7UOr7ibLfDdDnA6Pmar7UOr7h5mq+1Dq+4bLfDdDnA6Pmar7UOr7h5lq+1Dq+4bLfDdDnA6PmWr7UOr7jKvB6ta7lTS7W+4uy3w31+uSDqeYqtr5qfWXcV8yVfah1fcTZb4b6/XNB1I7ArPjDq+4S2FVX2qfV9xdlvhvr9csHUjsKq/tQ6y7iHsOr7UOr7hst8N9frmA6XmSr7UOr7ifMdX2odX3DZb4b6/XMB114OVrXzU7e+XcR+71b2qfWXcNlviclfrkg6fmOr7UOr7h5jq+1Dq+4my3xd9frmA6fmKr7UOr7iXsGr7VPrLuLst8TfX65YOp5iq+1Dq+4eYqvtQ6vuGy3xd9frlg6fmOr7UOr7jI/BytlzZqdvfLuJst8TfX65AOpHYNV/ah1fcR5jq+1Dq+4bLfF31+uYDprYdX2odX3EvYVX2qfWXcNlvhvr9csHTWwqrfrQ6vuI8yVfah1fcNlvhvr9c0HUjsGq/tU+su4S2DWW9w6vuGy3w31+uWDpeZKvtQ6vuJjsOq361PrLuGy3w3w5gOtU8Hq0dc1NrmnLuMXmWr7UOr7i7LfDfX65wOj5lq+1Dq+4eZqvtQ6vuJst8N0OcDqz2BWSTzU9e2XcUWxavtQ6vuLst8N9frmg7n7rVvvKP5p/ylf3ZrfeUfzT/AJRst8Tkr9cUHa/dqr95R/NL+Uj926v3lHrL+UcdvhyU+uMDsfu7V+8pdZfykfu9V+8pdZfyjjt8Tlp9cgHX/d6r95S6y/lH7u1fvKXWX8o47fDlp9cgHY/dyr95S6y/lIl4O1Um/GUnbtl/KOO3w5afXoV/RQ/vP5I1b3E5uTuyD2RDjEJCMGJm405tb1FtdDiedK3tfJGLakRPbUVmXp6OIlB9nIzOnCrrH0ZcjyXnSt7fyQW1K3t/JGeaF4pekqQcXZoU5tO6PPy23iGrOaf/AIx7ii2rW9v/ACoc1Tjl6mSU9d0vqa8k07M8951r+3/lXcTLbFd75/5Y9xJ1anHLvg8950re38kPOdb2/kicsLxy9EWhTcnZI8350re38kXjtnEJWU0v/GPcOWE45ep9Gn/el8kYZ1HJ6nmvOtb2/wDKh51re3/lQ5YOKXp4QbWiJtGPazzfnzEWtnX5Y9xj87V/b/yocsJxS9NKbZU8352r+3/lQ861/b/yocsYXjl6inF70UkrM84
ts4hfb/yx7ju0arlShJu8nGLb7bG4vFuoYtSa9yzAxZ2M7Lhlv4e7g0uBjk2pK5rQryjudhKvJ738jWemcdss1qyDE6rfEjOzPbTZpK8kZqy9HdxNFVHzLOtJ72WJ6YmO1yY7zDnZKmzMS0yveblaMlBLS1tTnKb5mSWJm+PyRqJSYZYtKLdjHZczH4x7iMxMqyZGQVUmS6jHSrRlYtaL7H8jDcXESMkqbW8mNRrR6rtKRqNcSHIdDNkUvV38i2Gp+k771wNeNRp3W8sqrve+pYlJ7jDaq1HFJ896MTpxmrx0fIwzqOW93Kp21RZlIjEEk07MJX0LTqOW8rGTTuiN5Z6tOeVJ6pciYwjTWaesuETFHEzW5/JGOUm3dv4lzDERM9NieOir6SbXC2pV46Fru63KzTvqYlgoWbdVcE2mtGT4ulDKrTk777SsuPI8/JZ0jSqzzqxSvq3yUZXXv0NZYtS9WLe+65F51JrRUmuSyt2fx+BjlVq2u8yfFJLdzatfeZ5rNcVfjawWHeJqQp07wcm7ylH1Ulduxs7S2YsM1ecqkLpN2jmV9ztpdGbwTSqVWp5lenPi0/WXPUzeEDhnqX9mKWr3rRfOxztq3znLrGhTbjDj1Zwg9VVtweSOr5L0iFVhpanX+NNafM16dRVKmSo27tOKvfK7dGi8qklNxkoyssvquTS7bWR05pco0afG/TpRlFSSkr+1ZPoSqdNJtu6W8ihGnKEZNRUlG7SSsmr2W7d9DJKCUPSSbSunHt7Sxq2mVnRp8coNkMg9U/GGLGfwqn4X9DzR6TGfwqn4X9DzZ5tX120/AAHF0AAAAAAAAAAAAAAAAAAAPUYX+BT/AAR+h5c9Rhf4NP8ABH6HXS9ctXxcAHd5wAAACUAAJQCwJAUBJNgKk2JsTYCthYsLBEWFi1iLBUWFibCwFbAsGgKkFrCwRUE2IsFQCQgM0VJJZd7k3msmlwb5vh1Ili21HT0U7Sb3z013bjPGhB+lHetG4y0VuCd+HaY1CTs4OHDMk9FrdW4XueLL2Jlh7u1rJqze+75Liv8AYipVyQUnFNtLsulez92u46mG8WnmrX8Ve+izLdfle2ljhbZxUZVP6NNRulFPfaxYjLdYzPbY2VVqSxFGUE7qaTt7Lumb3hph5vLNu0cy+n/2ZvAbDXqVakvsxSXvk3+iRteHithqfbN/6WZmvbc4zh5DDSpzyxcFF3XpJyu0t6S5v9TryqNyjGnBU8ylms8yl2rfbl3Hlo1ry6HbwNTxtJQd7Rlo4R1S5t7muaLPTnaHQwlaybcXFLN6EsrceG61nvZNao9Y5W1LTO9Errl3XNrC7Pp1YRhKtOMtHlipWtwatw4Fa2zbb56pW36u/O/N3ZaxmYYmenDAB7nmYcZ/Cn+F/Q82ekxn8Kf4X9DzZ59b2HbT8AAcXQAAGXC4adapClTWac2oxXNs+i4D9nWHVNePqTnUa1yNRiuxaanB/Zvh1PHyk1fJSk1720r9Gz6kB872t+zqcbywtXOvYqaP4SWj+R5WvsLGU5ZZ4asn2Qk11Wh9uAHxKnsHGS3YWv8AGnJfVGSXgzjkr+S1fhG59pAHwfEYSrSdqtOdN8pxcfqYT73WowqRcZxjKL3qSTR4/b/gDSqpzwn9FU35G/Ql7vZ+gHzQGXE4adGpKnUi4Ti7OL3oxAAAAPVYa3k9Lnkj9Dyp6vDpLD0ueSP0Oul646vkAAO7gAEgCUQSAJMlOhOSbjCUkt7UW0iiC4LEpAlARYmxNiQIsTY0qm0IrOrWlHRX4sUtoxajf1pOzSW4mYXbLdsLEgrJYWJFgqLCxNi1OnKclGKbk9ElxApYix0K+x8RTScqbtp6tpfQ0ZJptNWadmnwZImJMSpYWLE5PgUY2iLF3EiwFLCK1Rv4LAKazSbtwS4kY7AKEc0b24p8DjOvTds/XaNC+zfjphxNe6UYrc5rRNS36Zl8EU8qlplcY2i9Prx195WGInHSM5K++zauJYi
o99ST4es9xY04iF5YYpbRnWjKDm4UlZtRvHxm9Jb9VdbzmTquda992vvZ1ZzcvWd9La8uRh8TFbopfBEjTw1GvEfjY2LtGMJVI+lmdnZNq6sT4U4xunSvfTMrOTZrwgottJJve1xKzoxlvinx1SNbE5u8uROhFUqVWM03NzjKOl4NPTqjv4KblRioycLJZ16KVr2evbo7GtHDwW6EeiNudNeLU4JJrSVufBmeGZ/UnWj43aUqqtN7016XH39i32T5G3W2jUyScmnePLLp28+ZxaWIqJ6Slrv13kVK83dOTtyuWNKY7JvE9MNwiGSd4cmHGfwp/hf0POHosZ/Cn+F/Q86efW9dtPwABxdAAAet/ZvTqPHuUF6CpyVR8k2rfG6+p9SPPeA+zFh8BTbVp1f6SXPX1V0sehAA8p4ReG1HCSdKilWqrR6+hB8m+L7EeOreGu0Kk1asqavuhGKXzuB9cIPk9Dw3x9KbvUVRX3Tit1+yx7Xwc8MKGNapyXiq3st6S/C/0A9IQSAPN+GPg3HG0XUgrYimm4tfbXsP9D5K007NWa3o+/Hyfw/2WsPjnOKtCss6S4S3S+evxA8wAAB6rDfwKX4I/Q8qerofwKP4I/Q66frjreQAA7uASgSgCNrBYGpXllpxbtbM9EkvezVPQeDW0qVKM4TkotyzJvc9LWv8PmZtOIarGZerw9GNOChBWjFWSRr43ZVCtG0oJWu046NN72YPO9D76nvS9Zb3uRk84Q9uPU83b09PH43B1KFSUJRurvK3H1lfemjXdPik/dyPUbapwxVOEc+W0sycZW3Jp/U5VLwaztqNWo2t9qj7jtGo5TRzMr5Poam0m1DRuMk00lvep6P905e3V/xH3EfunL26v+I+4byKPLbP2Z41OpUzNNuyWjfNtkbQ2d4m1Snmtfc9XF8GmeoqeC0optzq2X/5H3GqtjRf/cq/nJHazOJc3Z2eULzd23ddi7Tb0W7qbGz9nN4irSTlKMfFNuUtVGWbNZ/BHVqbCjlmoSWZyTi39mPJmouxNc+ODcWMuJo+LqShdSyu11xMaOjCD0/g/sqVF+NqWvKNkuMb8zzB6Sl4Q08qzKSlbWyurmNTOOm6Yz29CcvbOyI16bcIxVXenuvzTZ5naEvK8XN+MqKCp03FRllte99DGtkRf/drfnOUVn11tavksUqE4TcZRaadmrFZQa4PobfmKP39Tj/3Hw38DPHwWlJJqpVae5+Mfcb3sbIcxxfJ9CuR8n0Ov+6c/bq/4n+w/dOft1f8T/Yu842LBT/o8uqaa6NjH1LUnHVuTt0er+Rr7R2HVw8b051XPgs+9cTDR2bUlSlUrVKqmlJpZ+C3Hl4c6m56ufGls/fGoC61jGTd24Qb97gm2yLHth4JUDJKlQIJIAkz4aolJxfqyVmYCr3moMZbGTK5R4pmKrvNip6cYz4+rI1JbzU+Fe1SbNmVTg98be4yU4Rd7S6khqZc/F/wp/hf0POnp8dQkqVR71llu9x5g82t676fgADi6BubIwTxGKo0V9ucU/dfX5XNM9n+zbZjniZYlr0KUXFPnOXcr9UB9KhFRSS0SSSXYeY8OtvPCYdU6TtWq3Sa3wjxfv4HqT5F4e4p1NpVU91NRgvda/1bA862dDYOBliMZRpRV7zi3+FO8n0OebeytpVMJXjWpWzxvvV009GmBueFeAeGx9aDjaMpOcOTjJ6fqvgcmEnFpptNNNNb0+Z6rZLe2tpp4v1VTby0/RSUdy912bXhv4K0sNCFbCwajqqkb3stLS11426Aer8DtueW4VOb/pqfoVO3lL4953z5b+zfFOGPlT4VKclbtjqv1PqQA8Z+03D5sJRqW1hVtfslF/qkezPLftF//wA1/wDu0/1A+UgAAesoL/p6P4I/Q8meso/1ej+CP0OmnPbhreQgAHocUkoglACUQSgq79Rf+7R/1HdOE/U/+Wj/AKzvGP2Wp8hv4NLxa05mxHTdp7jXwl/FaWvra+6
5r6v+sOS7FpT6r9ThaO3eLYrDfq4hxSbctWkrPiyynJ8X1MOROdOOlopz03brL6voY6mRSfi3LPfVQ1V+2+hMLluSvZ3bej4nLnOMVeTUV2tJfM6NNyyenbNZ7jze1KUZ4jDxmrxy1dOhunTF+8NjC4lQrYuanBZqVJQeeOrWa9te0wYbEZKsJqpFarN6cdVfW+o810Pul8yfNdD7qPzN7Zc91W1tutCcouFSi4pP1ZxvfjfU5el/Wj+ePebPmuh91H5jzXQ+6j8yxuhJmstZ29qP5o9405x/NHvNnzXQ+6j8wtlUPul8y5sfxYsJJKtNuUUnTppPNHer3W86VN3cbaq61W7easdj0ONKPzI2PFKDS3KvUSXJKW4ncdSs4nuHVnibNrI+PAt5ZK2ikuCdnZfC5r1Kksz14vl3j+k/5b+Y82J+f9ejdX7/AMZZY6ok25aLe8su8wx2vUfqqpJe1GlJp+55tSs6bq1Mn2YqLadvSk9y37lZv4o3PJZPfZe9J/qWIn9SbR+dscsSqitmvJb04uMku1PUwYn+HP8ADL6GfaOFtBOGk43cXz5p9jNapUUqMpLdKm2vjE718crevLUsUs0adm3lp68EvFRNg5+HpzliEqe9wo6au68XDRJHpKOwqs4xlpaUXL3W+y1zZqs9dpevfTlSKsz4nDzptRnFxeVOz7TCzbmggkgIGdYSUlmbUVzbMBWUm97NQYn8b0XThCUc2ZvkjRlvL09xaVCT3Jmp7grGJYWrGSEFbUyLF+1FMyQqwa9S3uEYkmZc/GtqnUV/sy+h5s9Vj403SqtSd8ktH7jyp5tb16NLwABwdXS2DsWrjq6pU1aK1nPhCPN9vJH2LZmz6eFoxo0laMV8W+LfaeI/ZxtijTU8LO0Kk55oSf29Est+enzPoIEnx3w2puO08Rdb3GS9zij7EeC/aVsdyUMXBXyrJUtwV/Rl87dAPE7IrU6eKozqwz04zi5R5q53PDvYsMLiI1KMVGjWjdJboyW9L5M8uer8Hdu4zE4ijhpyp1oNpWrU4zyxS1a43sgOd4JvFLGQ8k/iWaldXiove5dh9X2jg/HYarTdnOdGUL24tP8AUx4nDvDQc8JhqcpNpzirU3KK5aWb7GbWKbdCb1TdOXvXogfMP2e0W9px/uQqN9LfqfVzxv7OdkOlQliZq0q1sv4Fx+LPZADyH7S6jWBglulWjf4RZ68pVpRnFxnFSi9GmrpgfAweh8NNhRwWKXi9KVVOUF7LW+P06nngB62l/VqH4F9DyR66n/VqH4I/Q3T1w1vIUAB6HFJKIJRRgqa1YRdRU4u95N6L3mCDbjUfjrOGW0W9al3bT3bzfUU76cDe2bgYVnJSussM2lt97cjFo/XWs/jg1JyUINVszlduKesHF6X+o84Vvvp/mZ7OHg5Rkms01aS19HhryNpbEh7b/JS/lOWXXDxuzNpV3XpRdaplc1dZnZntKcm2rSbV7NX/AEIWxorVVJJ/gp/ymo8NUlOpTpzmpq2WU4w8W21felc5XpumJiXSurFIxMOtGmlJy4tJe5IucXzPtL7+h+WXcPM+0vv6H5X3Gume3YnufuZ5zH/1rD/hq/obfmfaX39D8r7jQxWFr0sVh44icJycarTgmla0ew3WYYtE+s2Pm40Ksk7NQk0+TsYqE4UYU4+lKU1msryk9Fd68CNr0VKhOTveEZSWul0r6rc93EtaeaFWMc96ajJJpNbmmr6HZwjxkqYyMbK023HNlUW2o82uAeMh6OXNPMsyyK/o8yFSqKo6qhfNCMXHMrxabfTX5E4fCyo5Msc/9GoOzSs02768NWMriDDYi9KEpXcpZrKK1dm+HusMY4ukqsbqUZRV7tW9JJprvIjhJqNNuLbipqUYzyv0pXunfsLYmGTDTSjlu1dN5nrJJ3epMrEQ3zn7JXoy/wD2Kv8AqN2lSUEoq9lzbb6s0Nm4DFVYTdCpThFV6yamndvN7mZtOFrGXZlRnd6/KPc
T4ie7N8o9xo+Z9o/f0Okv5R5n2j9/Q6S/lOOIdmxh6Mo16sVKzahLctVaz4c18zb8VU9v6dxynsPaDlGTrULxvZrMtHvXqmtPD7Si2n8Gqbkn7nFDEfVz/TtYp5IJyd7XbfYcuEWsLZ7/ABT/ANJqYqli4Um8TNLNpSpqKc6k96TXLmbNDZuOr0lKNfDuMk07a24NXStdbjcTEMWrMzloeDuCpvE06jz51TpuLjuv4uO/stp8TobR2nXVeSjJxUZWS4ac+ZoxU8JXVNy1gqcJuHFKMbpX9x2Hj8HVcalSKUnJx14JbpSXI1j9YmWv4QU81KnW8X6UlHNK+7TSNvj8jz7OltfHqvJWWXLdes2pfA5rOlfHO3qCCSDTIbEZUYpXUpMwIyRwNWeqjp2tFhcZZo4xL1acV79RiK0p01K9rOzsWp7KqcXFfEzw2daMoua1tuW43lnZ3049jYp7katzLRimmImGrMOO9Sp+GX0PNnocY/6Kf4WeePLreu+l4AA4urZ2b4vyil45tU88c7W9K+p9zpTjKKlFpxaTTWqaPgtJxUouSbjdXS0bXFH2zYGLoVsLTlhk40kssYtWy20sB0SlWlGcZQmlKMk009zXIuQB8y8I/AarRk6mETq0nd5Ptw7O1fMt+zzCeLxGIxFZOEaNNp5lbK29Xr2J9T6YUnTjJNSimnvTSafvA5Pg5tWniaCkqsZTlKcnByWaF5NqNuxWOxJJpp6p6Mx0sPCHqQjH8MUvoZQIjFJJJWS0S5Eg834Z+EKwWHyU5WxFTSFt8FxkB6QHlNheHGGr04rETVGslaWbSEnzT4fE3toeF2BoQcvHxqPhGk1JvpoviB579qNWOXDQ+3ecvhoj58dHbu16mOxEq1TThGPCEVuRzgB66mv+mofgj9EeRPX0/wCq4f8ABH6G6euGv+MYAPQ4pJRUlFF4vnuZenVlDc5RfOL4dxiLRk1/urhXd8HcalWkqlSTTi7Z5aXuj03j6Xtw/NE+fqXC/RENHO2nmXSt8PoPj6Xtw/NEtCVN+q4O3Jp2Pnlj1Pg7s6UIVHUi4uTja/FJf7mJ08frUXzLpYjBueIo1lWnFU1K9NP0al1x9xtyqRSu5JLm2kYvJI82am1MBnw9SMbuVk0udmmc3XLd8pp/eQ/MjzfhDNSx2FcWmvF1dzT5HGnTcZOMlZp2a5M6GztieVQzqrKlKEpRWVLW6idNm3tz37umxKlmTTV0000+KJw2EjSjljmt/ek5W7NeBl/dWf8Aba//AD4k/urP+21/+fEvJCcc/QD91p/26v8A8+JVeDErteXVrrerq6+Y5DjlcxVsPGbTld24XdviuJf91Z/22v8A8+I/dWf9tr/8+I3nHK5fwWqxjRrZpJf9TW3tLijF+60/7bX/AOfE5m0dmrCuNPO53vNyktfSet+hM75wuNnb2XlEPbh+ZF1Nc11PBUMNKpJRhFye+y5cz2tHBxUIp30il8jNq7Wq33MezsK6KqKVeVXPUlNZ36if2V2GzKvBb5xXvkkU8ljzZ5/wkwMlJVIxbgoWk+Vnp9SRGZatOIy7OPhRxFGdOVSKzJpSUleLfFHlPB7aUsFiZYas14uUrXTvGMuEk+T7jSW5mvi6OZXW9fNG503ONR0tsVYzxVWcXeLkrNcbJL9DUU+1r3bjXw0m4JtO25PnYyHWvjlb1MmVYDKyggkgqB2sPN+Lj7kcU6uHfoR9yDdWwpPiRxMdybh0cAyUb62VzHJ6mzSforXcbj1xtPTRxn8Of4WefPRY93hU/Czzp5tb130vAAHF1D694COPmujlto539+dnyE2cHtGvQd6NWdO+/LJq/wAAPuwPjlLww2hDdiZP8Si/qj3Pgh4SVsXKdHE03GpFZoyUHFSXFPkwPVAAAAAPHeFHht5LVnh6FNSqxtecn6MW1fRcd6PnGNxlTEVJVasnOct7Z6Hw52NVo4meJlZ061SWVp6rTc/+cDy4AAAAAAPXU/6rQ/BH6I8
ievp/1XD/AII/RG6evPr/AJ/rGAD0OQSQSUSTHeVRIF4xfBF5P6sop80n1IcgrNSozn6kZStvsmzJ5JW9ip0Zt7Ox1ONLxc6k6es75Y3U80bK/G6OEti0v7TF/wDwzMTM/G4iPrp+R1vu6nRmzgNk1asrSzU4re3fojh+ZqX9oj/gzPS+CVKnRjUpRqKcpSz+q46WStrv3fMzNp+NRWPqMdsyjTTUa6zrVxm1r3Gfwf2lSownCpLLeWZOzaeiVtPca20dk1fGVKmjhdyzOSWm85Vy4zDPkvbee8N96uku4ee8N96uku48TcvSWaSjdK7Su3ZL3k4oa5Ze7oY2nUjmhJNbr6mvRwlCGIqYiN/GVFFSd3ay7PgauzJ4elQhCdalmV72mrats2vLML97T/OjlMdusW6Za+0KNO2eeW97aMw+e8N96uku45+2/EVqcfF1qWaMr6zWqt/9Hmrm60zDFr4l7bz1hvvVz3S7jzu3sbCvWTpu8Yxy3ta7u3oc5O9rb1u7ewe6Lvy4HSKYnLE3mYwRTe5O9+G8tUvGV2nrrb9C+HxkqLvCzb9a6umZXtis27tJPlGOnu0LOWIactIyl9mO+XBe81442m1fOvi7G/i8dOrRqUpyi4zVr5UpRs76WOKtlQS1nJvssl8yd/GumxDG05ZrP1bN6cLpL5yMzNKjgMqqJyvmUUtN1pxl+huNljP6k4/Bvd2EXBDNMlyAAIAAQOpQ9SPuRyzqUX6Efcg3RcXCZDH46OCZ6MU077zAZaKlq0jcONvGti/4c/ws4B38X/Dn+F/Q4B5tb130/AAHF1AAB1vBfZnleOpUmvRTzz/DHV/ovifaErHiv2abMyUKmJktajyR/DHf1f0PbAAAAAAHM8IdlrGYSpRfrNXg+U1qv+dp8VnBxbi1ZptNPg0ffT5X+0HZHiMX46K/o6/pe6a9Zfr8WB5QAAAAAPX0/wCq4f8AAvojyB6+kv8ApaH4I/Q3T159f8/1jAB6HMAAyJFyCSibggAWBAAsTGTTum01uadmioA2KuLqTVp1JyXJybRjuUFyC9xcrcXKi1wVuLjKrArcXGUWYzPdd2K3FxlUi5BFxkS2RcgDKFyABlS5ABAIJIAAAuUDoUn6Efcjnm3SrwypZldLjoHSjZiybmOM09zT+KLah0cSb1ZtUZLKtd281DZpUo5U3rc6V9ee/jUx7vGpb2X9Dz56DHRtCov7rPPnl1vXo0vAAHF1b+Bq4SK/p6NWb5wqKK/0s6NPF7J+1hMR/jJ9x58AfRsF4fYKhShSp4evGEEoxXoOy9+Y2F+0fCfdV+kP5j5iAPqC/aLg/u635Y95P/qHgvZrflXefLgB9S/9Q8F7Nb8q7x/6h4L2a35V3ny0AfUX+0TB+xW/LHvNDanhls3F01Tr4evOKeZL0VZ87qR89AHqp4/Yj3YPEJ/j/wD7NSri9lfZwmI/xkv0ZwABu46rhpfwKVSH46il+iNIAAeupf1ah+CP0PInrqX9Wofgj9DdPXn1/wA/1QAHpcwAAAAESCABJJAAm4uQAqwIuLgSLkXFwibi5AYVNxc2cfh4U5pU6iqRcVK64cGm9z1T3GsSJyTGC4IBRIuQLgTci5ACJIAAEAAAAAAAUMEt5nKvxb4tGogzhhJUnwbMni4PdPqiY0Vdekma2yboQ8PzaRkhUjFWcrmviPWb56lEieSYzHa+KyyhNK7k00r8zkU9l1JOyy9TrSozyt5Xom7nP8dJbpNfE46uM/yddPOOmCWzKidvR6lobKqy3Zepk8dL2n1LRrzW6Ul8Wcc0dMWYZbKqp2eXqXpbFrT3Zfi33GTymfty6sjyifty6suaf2n8/wClamxKsd7gv/L/AGL4fwfr1E3HJZcW33EPETe+cn8WXhjKsVZVJpclJjNFiLfq78F8Tzp/mfcR+7OI50/zPuI8vrfe1PzMjy+r97P8zJ/Bf5LPwZxHOn+Z9xH7tYjnT/M+4jy6t97P8zHl1b72f5m
P4naf3bxHOH5n3Efu5X5w/M+4eW1fvZ/mY8tq/ez/ADMZqdp/drEc6f5n3FP3fr84fmfcW8tq/ez/ADMmWIrL1p1FdXV3JXXNdgzQ7V/d2vv9D83+xXzDW5w/M+4v5ZV+8n+Zjyyp95P8zE7fxJ3/AIQ8HMQ92T8z7j0EMDNUacHa8YpPXsOB5dW+9n+ZnptnzcqFNtttxV297N0w53rM/wDpr+b6nZ1Hm6p2dTpxLNnVzw5Xm6p2dR5uqdnU6qYuZ3Sy5Xm6p2dQ9nVFy6nWMVSd/cSb4hi99sOYsFPs6k+Qz7OpvkszF7S48tnO8hn2dR5DPs6nRJLFplOWzm+RT7OpPkM+zqb5LJF5Xls0FgKnZ1IeBn2dToKdiJNstbzLXN05/kcuzqPJJdnU3LkXEXnPbPLZq+RT7OpHkc+zqb1OfBkyZvL01mJjLnrDS7Oo8ml2GzCe/wB5MJekixOWqxmMtKdNreVsZajzSfvKqSb3By3Sqo+5e8Om99tCa1nZJ9ScsoxVr/A1hcsLYzFpVeaT+pX0Xzj80XBEylalcyLNZU3dO+mhhExhYnLNwvwIi7uwr6WjyXzK0tLsTBHaXJEeMRjbLRoyauloMNLKa5mSML7pR6ms6bXAqTBhvPDS5J+5mvUw87+qzFGbW5tGSOLqL7T+JrEJiWNxa4MqbSx0uKi/ei7rwlFtwWnI1jpcz+wr46kkvRzNcyHjrerFL4Gok724mRUJe73kzMpiP1NfFzlFpvSzNnwaoUp0sZ42ySjQtLIpuN6nBdu41KlOKi/Su7Pcc+FacVJRnKKlbMk2lK2qvzPNr+xl30sY6e02t4OUJ1MRV9ONo15WhZQTpwhl0tpvZTF+CuFjGqoeOzxjWytzTTlGMWtLf3rHlHtGu1JOvVtJtyXjJeldWd9ddCJbQrO969V3ve9SWt7J8eNl0ODs9TW8F8PBpyVRJUq8pxjUv6VNwslJxW9SfA1treD9ClSxLpKpKpRk280sqhDSzXo2mtXfW5wJ7Srz9avVejWtSW5pJrf2LoJ7RrSg4SrVJQk7uLnJpvtuwPSYjZ1KrTws60nGMcJhI6PL67leXqu9rbuLe81MNsCnKNHO5JyqU4yyyveE4TkpaxWV+itLy36nFjj6yUUq1RKMXGNpyVo8lZ7iFjaqUUqtRKPqrPK0fcr6AdF4ChLC+NiqkZOlUqq800lGsoZWsqvdPf2GxShSlgqKlGajGjia8lGcVnlCplV3lfP4I4Sryy5VOWWzjbM7Wbu1blfUKtNKylK1nG13azd2vc3qB3/MdHOo3qWjPJN5o/0l8PKrmhpprG3HRmptHZ9GNGE6WeLlOinnkpJKpSVTglu3dpzfK6norxk/QTUfTl6Kas0uWmhWVaUlllKTWmjba0Vl0WnuA9hU8FsJ4+NFTqKVpqTzJ2tFSVR3VktbW7UYY+DWHlTlZVoTy4pxlOSyw8VUyrMrcV2nm57RrySjKvVcVHLZ1JWy8t+42MTtutVw8KEpyyxz5nnk3VzO/p87Ad7E+DmCp1Jpzq2pUq06kVJOUvF5XmTcbK6b07UZ57CoVVGrWq1XDxOHUc0taakpNbo6paJL36nkqm0K80lKtUklFw1nJ+i98d+52XQU9o14O8a1SLyqGk5L0Vujv3ID09DwWw04Ukp1FOUcNOU8ycWqma6iraerp70YYbDwksPUrPxtHWpCKnK+Rwje7WXW/wANDznltWyXjalllss8rLL6vHhw5GSe1MRLPevVedWnepL0lyeoGmj1mzH/ANPS/Cjyh6bZs/6Cn+FHTT9Yv46EZkuRhuSpHWbPPLNGROYweNsys6jZM9ONtSIZKlS+nAi5juEzhMTLz2mbTmWQkpcXN4iKsLXJZjuTczXAlsZiobNLMDkTmMdwK9EQs2QUcib2LDeEMiUr8dSkplW7lhqOl1O109Cs6qWiIlPSzKuNtVqdHWLzEYUk7K3FkR01KrUrmu7IsEQnJcRlKO5uwqyssqM
am0a/Wmd1Yv1o/FEKlGXqS15MxuSe/QmMLekteXvL7IidOS4FaK1u9y1JVSS4mWVWOVKS1ersX9GvKV3fmXm7RS+JaNKG9S05MwVJXZJlqBK7M2JlZKK4EYaOrk9yMNWeaTZfw/Uqs1xLePvvSZhBGsM94PmiHRT9WS+JhBehldCS4X9wmnGKVt+pSnJ3STZmq4l3to1u3Go8Sc5YVO079pbEXzdm9GG5mfpU+2P0JHhMY7YjJh6Tb9FL4mOnBydopt9h1sFs7K803/4r9TP+rOcdNCjhJVJNJcdXwR2MLg6dNbk3xbRm0SslZdgiyNR1Cs4xv6seiCpRa9WPRFagpTswv4rOlFfZj0RVwj7MeiMlZmOL1JMLCFST+yuiLeIjfVLoiak+hSMupcQZlapTjbSMeiMHi17KS9yMs6umhSWq9wxCRMqNR4RT+CIUV7K6Iq6sfaRV4iHP5EXLadNON1GPRGNQXJdEUhjIpPRh4hNXSfUuGdzPFQV/Rj0RjcI70lb3ItCtFpXW8xSxGV6ITBvj6zwoxavaPRFZSUdPkVjXutF8OJidS71JbGHK2r+QzeOfDQlTvvMBaJxnLjMzLYiTIx0pb29xLqXLiMOcwuSmY7oGZ6MMt9Q2Y02S2WPEmFrk5jGmWZYjowtcgxkuaJVcJzEPtKOQbuu1CFwlzK5yjZUsRiVws2VbsQ5cirXbryNRDUQvmtvKSk1qmIu+j68ilTTT/jNY6agnUuSnlXb9CF6Or3/QxN3ZYbTJlbhkIfqrJXdhOeum5biZaK3Hj3FIq7sjUjPTlxlqkUklJ3T1fBlastMq3L5ikuL3Is/CPqkk09SBOV3cy4aH2nuRlper6EFHi9WaperPM2zGakgAIMqkAg1+DLS0u+RjuXnokupjZqfCBK+46OCwEt8/RTW7izdw+GhSXorXm95lcjGWsIpUowVoq31MsdxhciYN8CLMJqPUrBvgUm3fUtTlvH6fhUbvqKe8irIpHeP0/GxU1XaazZabsyK0ko5hJHQnfQw1aqhvevJbzUqY1/Z07eJR+mr/AGuPaEmzYqYhyjmhpb1lx95qqbve+pFObi7rfy5mTERSaa0vvjyL/bnmfJJxTWaPxXL/AGMZMJuLui8oprNH4rl/sT1PGdRVkrcDApWehZOWW99DFctmaw3I3cbxMGfgzNSkkk29xiUXUm7fF8hPiQ2KMdLr4spN5p6EJNaJ2S3donJRWX7T3vl2CY6Zx2lJrdquzcXunoYYyy7t5khJWu1rzRzJhlc7WSIk97+BiirvRicmtH8ys4WzFlIx5l2k6czntMMsZu5aczDHfwLT/wCaliJwmGRy0QnNmOT9FCUlzRqI6DMWcjFdcxmViRBhOYtmtqYsze4lPg2IhrC09NVuKPt0LRnw6GJxfErUQlz5Ebt5VzS3BRb1e4NRCXLNot/1L5smj1/QxSit8S8ZKej0lwfM3C4Y6i471zKXL6wdn05hxvrHpyLjPgo2W9X3/QN5ff8AQx3HipZMJ2ZbSK11b4FVHN6u/kXGJF401J+j0ZWs7ejyJm8qst/EoqnBq5Z7EQi5OxlxE0koLct5kUFCOnrPhxRpt8yYwsTlJVk3KsS0kEAgm5amtb8iiLzdlbqajwVcrsgAfivQyZFyrZVsw6LXLwkrGK5ek9bFhJ8RUldlbkT3srczKwtmKp6kXIlUyW5iEmcGJxGXheXBGoqji3Kbu3wIxlb0tOpqXuaxhyzmFm9TJRi27rRLe3uMtLCpLNVdlwXFlcTfS3qcLbi4TOeoZJSVm4b+L7jWinJ2WrZfDwlJ6aJb3wRsTqKzVPfxfP3BnOOoYqmHaV007b0uBijJp3TEJuLunqZsqqax0lxjz9xP8O49Wi1KLS0k+HP3GOMbavobEYKK1Rgz53aW/g1+pZhIRFOcrIyVZqKyR+L5kzkqayxd297/AEMdGF7yl6q+b5GV/tmoSyLM9W9y/UxvLJ8mI1XKTbML3lnxIjtkcWi
3jLpJmONRriWzxe9dDDS8d+jLSnJMrTguDFXMjWOmJjtPjOaX0JbVrrTXmYvGF5SVo6cLmTayUlfjxJmlz+RSM0op9rKTqFxEQmMs7tl3/Ix3XN9CiqLLYjMiLtXcl29SfGckjFnXIt412LBtX1TuveJpJ7+1FFeS04Fsqy6vVcuRcGFJT10GVvf8yPGJbl8SkptmW2S6ju1ZSU295S4uUWjKxaS4oxl4aavcBlhJTWWW/gzFKLiyam70d3zLUp5vRl8HyNDCy/q6vfy5FqsfF9r58DCk2+bLPR6at9pk0gv730DagtNZfQwtmWl3O/rdeJnoUdHPfbcjA6UrXtoVp1ZRd07Gv9THwnNt3e8nxl/WXxNjNCrv9GXPgzXrUJQeu7mXuDpDhxWqKBNovmT39R1KsYLShbtIirkwq0FZXfwKNkyl0Ks1OMEBMUVMkdESIWXbbIJcbbyN5na3lDZME3qY2zJTlwJBPhNX96+ZjhFsyTsnds1cViHKDdN7vWtvLjLO7CcRi409I6y+SOdOo5O7d2Yzdo4OyzVXljy4ssQza31hpUZVHZdeCNnNCj6vpT58vcYq+M0ywWWPYayTb5sMYmfV6lVyd27m1ho2jeppB8Hx9xWNONJZp6y4R7zXrV3N3fTkXz1Peo8bOKk2lk0p8l+pqxlYtRrOPauKMk6Sks0PiiT21EY6LqfYyVHJq9/A19xOYyrfVRzVpvXg1+pSVPxSu9W9z4FqMko597S3GssRJNvffenuNzhyiJ/FXIzOanGMb5bcODK2hPc8kuT3P4lKlNx3qxnxrqUyi470VuTCo1u3cnuLZovenF9mq6EwqgLODuktXa9kVJgXgZJ6R95ipvUvXnrbkX8T9WpaJtq5NWor2tu0JitIrmzXlK92Pw/WxOUcsNObKzlForXfqrlFGK4kiGVOPJjPHkYgRWXxq4JE58yZhL0nqBkw0vStz0Kp5Za+5mO9n7jNiVdqS4osIxTVm0VuXl6UU1vWjK5ebsJqQi5KT/3GZLcr+8rKTe8eKvmS3av5FXK+8hK7sXbUdN7+hfRNPTV6ItN2Xo7mYW2zLTWVPN0LE/BMKqtllu+hWU0tI9TFIiKbdkSVwXM9OmorNP4IlRjT1esuRr1Kjk7sHrJLESbvw5Cynu0fIwi4XCWmt5no4pxVpelHkyiqKWkupSpTcfdzKe+tmeHjNXpv3xNWUWnZiM3F3TszajXjU0qLX2i+p3DUUrEud0Za+FcdVquZrjMw1GJSQwErj8VaCEmG+BfD0sz13LeXzpHbvmXajDdpnnVtystfQ6PvLPb1a97Q/K+88/LDtsl6V07rNuEZpRujzXn6vf7HR95jqbZrS9ldiWhrlqzNLOtj8ReVlovqYMM5KSyq5yZY6b326GWG1qsVZZV8By1SdO2HenOlSd1G8+XBGjWrym7tnKeOm+XQeWz7OgnWqkaMw6tGlKbskbMqkaKtHWfF8jkLa9VRypQS7E+8w+XS5R+feOasJOlaffHRlNt3buxc5vl0+UejHls+zoTlq3xy6RaE2ndM5fl0+zoPLp9nQnLU45dvNGpv0lz4MxVKbi9Tk+XT7OhlW1atrei12ovLVnis7EXa1ijcZb9HzOR5yqdnQr5fPs6CdaspGlZ1p02vcWp15R03rk9UcmO06i3W6ES2jN8I9CctV4pdq0JRcknFr4owHN85VMuX0bXvuK+Xz7OgnVqkaVnVTLqs+Ovv1OQtoT5R6MjzhPlHoOWF4pdyk4t3y2tyZWyk9JdUcdbSqJW9HoRHaNRO/o9By1Ths79XR71pF8eJqnLntKpJ3eXoV84T7OgnVqRpWdvEJ5no9yMVuxnL851ea+ZPnSrz+pZ1ayRpWdOz5EqL5M5fnSrzXzI851Oa+ZOShxWdbxcuRKjZ70vicfzhP+70I84T/u9By1OKzt1VFPf0ReMlKGVcNxw3tKo/Z6CG06kd2XoOWpxWdS5By/OM+UejLedKnDKvchy1Xis
6ni3x095Po7tfech7QqdnQjy+fZ0Ly1OKzqveZvF5ld6NfM462pU5R6Eec6l76dBy1OKzreMS9XqY3K5y3tCf93oFtCaf2ehOWpxS69Ok5di5l5VYwVodTkT2rVlvy27EY/Lp9nQvLU4rOm3cg5vl0+zoPLp9nQnLVeOXSIOd5dPs6Dy2fZ0HLVeOXRMlOs1pvXI5Xls+zoPLZ9nQvLU45deVJNXh0Me73nOhtCpF3VugqbRnJ3aj0LzVTjs6lHEyi+a5Gw6cKvq+jLkcHy2fZ0Cx01y6Dmqk6U/jqVKUouzRG40/PFVqzyv3owPHTfLoXmosadv104QcnZGevNRWSPxZyKe0qkb2y69hR46fZ0JzVOOzWAB5HoAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAH/9k=",
+ "text/html": [
+ "\n",
+ " \n",
+ " "
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "execution_count": 7,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('e3tGQykFC5M', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "79cfdd4f-a214-4c13-aba8-c40067c1faf5",
+ "metadata": {},
+ "source": [
+ "# Learning Objectives"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f18e5709-a21d-4fc0-913b-fa1ad90beec1",
+ "metadata": {},
+ "source": [
+ "By the end of this module, you will be able to: \n",
+ "- Understand decision trees\n",
+ " - Define what a decision tree is and explain its role as a supervised machine learning model. \n",
+ "- Prepare and explore data \n",
+ " - Load, inspect, and preprocess datasets using Python libraries. \n",
+ " - Separate data into features and labels, understanding the significance of each. \n",
+ "- Train and visualize models\n",
+ " - Create and train a decision tree model using the scikit-learn library. \n",
+ " - Visualize the structure of a trained decision tree and interpret its decision-making process. \n",
+ "- Make predictions and evaluate performance \n",
+ " - Use the trained decision tree model to make predictions on new data. \n",
+ " - Compare predicted values to actual values using visual tools and calculate metrics such as Root Mean Square Error (RMSE) to assess model accuracy. \n",
+ "- Apply to real-world data\n",
+ " - Implement decision trees on real-world datasets, such as COVID-19 data from California. \n",
+ " - Understand the practical implications and limitations of using decision trees for predictive modeling. "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "17ef06ff-a9e9-4d29-8bd5-dc7c3fbf994f",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDBoYFhwaGRoeHRofIjAlIiIiIzAlJScyLikyMC0tLS81PFBCNzhLPSstRWFFS1NWW11bNUFlbWRYbFBZW1cBERISGBYZLRoaL1c2LTlXV1hXV1dXXldXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV1dXXVdXV1ddV1dXV1dXXf/AABEIAWgB4AMBIgACEQEDEQH/xAAbAAEAAgMBAQAAAAAAAAAAAAAAAgMBBAYHBf/EAEoQAAIBAgIDCQsLAgYCAwEAAAABAgMRBCESMVEFE0FSYXGRktEUFSIyNFNzgaGx0gYHFhcjM0JyssHwYuE1Q1SCk6Ik8WOD4kT/xAAZAQEBAQEBAQAAAAAAAAAAAAAAAQIDBAX/xAAgEQEBAQEAAQQDAQAAAAAAAAAAEQECEgMhMUETImFR/9oADAMBAAIRAxEAPwDz8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAF/cr2x6R3K9sekFUAv7le2PSO5Xtj0gqgF/cr2x6R3K9sekFUAv7le2PSO5Xtj0gqgF/cr2x6R3K9sekFUAvWEk3ZNNvUi3vXW4kurLsEK0wbneutxJdWXYO9dbiS6suwFaYNzvXW4kurLsHeutxJdWXYCtMG53rrcSXVl2DvXW4kurLsBWmDc711uJLqy7B3rrcSXVl2CJWmDc711uJLqy7B3rrcSXVl2CLWmDc711uJLqy7B3rrcSXVl2CFaYNzvXW4kurLsHeutxJdWXYIVpg3O9dbiS6suwd663El1ZdgStMG53rrcSXVl2DvXW4kurLsBWmDc711uJLqy7B3rrcSXVl2ArTBud663El1Zdg711uJLqy7AtaYNzvXW83Lqy7B3rrebl1ZdgStMG53rrebl1Zdg711vNy6suwFaYNzvXW83Lqy7B3rrebl1ZdgK0wbneut5uXVl2DvXW83Lqy7AVpg3O9dbzcurLsHeut5uXVl2ArTBud663m5dWXYO9dbzcurLsC1pg3u8+I81U6k+wd58R5qp/xz7AlaIN3vTX81PqS7BLcqurXpzV3ZXhJXexZawVpA3u9GI81U6k+wd6MR5qp1JdgK0Qb3ejEeaqdSfYQq7m1YK84uCeScoyivagtagL+5Xtj0juV7Y9IKoBf3K9sekdyvbHpBVAL+5Xtj0juV7Y9IKoAAAAAAAAAAAAAb24fluG9PT/AFo9llJ5Wz9djxvcJf8AnYX09P8AWj2icIXUW83mkybZ7Ch10rXvmr7UO6Y7X0M2XTiuGxiroQV5NpdPuLgo39Wvna1/bYx3THa+hmyoRsmnk9RCM4NXUsr2vcqKXiEs87bbMd0KyeduZ/zhLlKD1SXShpQ4y6UBXOqkk883YzOrZNu+TsTU4cZbNaMaUOMulE0Vd0KyedmY7pjtfQy5Tg7Wks9WaMqUON7UMFLxCT4TCxKfBLoL1KHGXSgpQ43tRRVCsnt12I91R/q6GXtwvr9phzgnZvPnAhKqlFyzsunoLYtaLk72RHfIcZdJKNeCyuAjVg1dXtexlVYNXzy5GFiobVt1me6I7V0ky/aoutBXvdMzGrB6r58jRlYmO1dJjuqG1dIEd/ha+jLoM79DZIk8THatdtYeIiuFdIEsrXs9VyDacVJXz25Eo14vVYhOtB5NrpAqhWUnJK94uzya4Lk7jShlnr1co0oXtfPnKhcXJqKIzcYq7bQEW3tM3JNRWtmLxtpXy2gRu9ole2RZTUZK6d0QhOnKegm9KzdrNanZ2ds8wMRvbPWYbd9eRRHdGi628py0r2vbwW1rVzdnTVnr1Mambm/DXhjE4uSzSLI4pPh4Lmr
ufH7BHyp1Lpu9nOcoq7skk7N9B4+vU6zMXya3yk3enCMKNHyirK6fFjeyfO/2PiY/cHEaO/SrVJz1tuTbT2ouowlit13ZLwF4GlqsoK3tZ2U1V0N7tDSaavZ6N7dNj0+lx+t08nJfJH5S13Wjhq7lUi9UnnKHO+Fc53E3ot8p5jupTq4DHJw0d88ZKK8F3y0bbGelRrxrUoSX4oqXSrnP1M8fdrNqdKfi8qb9xyfzlTvhKa2Vl+iR0lOp4TvwQa9Zyfziu1Cmr3vOL/6zRjju7kNefgA9KAAAAAAAAAAAAAAAAAAA39wfLsL6en+tHtjpxclJx8JameJ7geX4X09P9aPZMRTq6UnG7WVkpaPOud7TPXXjlXMrakr60RqU1K10/U2n7DXSaoNVm1e6dndpN5K/M9ZG1Pe4y05aPi34cr8nOXNvum/LbayKqeFhGOjGKUb3stV3mUKnT3pNzm4Phu88mu0zvdJqU1OVm87P+3KUXdyx2B4WD1xNejSoytGFSba5WzEt4bbc/CbzsunLOwF6wcLW0VYRwVNaopMppYejo6UZSai87N/+yVCFJyWhK8ksr55cOz2gWPBQvfR1K3IFgoXvo5lMI0ajSU3pWtldZW1P1Et5pp23yd/zf25ALu5IcVDuWPFRToUmlFVJeNfXm75EqUaalGSnJu3K76wLe542tYPDp61fn/nII1YaWt3l0bC8DXWFjxUZ7mjsLwBQ8NHiodzR2F4A1+5Y8VBYWPFRsACjuaPFXQO5Y7EXgCjuaOxGFhI2too2ABrrCx2GVhY8VF4ArVO2ojOipK0ldcpnExm6clTajO3gt6k+hlGHjiFTSnKDqXd3wWv4OpLg5EBdKimrPUHRWjo2ytaxFqtwOn0PtJ1tPQloJOpbK7sr8vIBmnTUVZLIhDDxU9NJ6Vmr3b1u7yvtIx37RjfQUs9JZtcli2lp2eno34NG9tXKBpR3JpKtvtnfScrX8FN62b09rWpP+ew1cOsRvM98cd9z0dHVqy9t/VY+TuPTxCqT0980NB6WnfxrcHtNe+44Xn0+s5zPlfh8UtB21a17z4irOp4KWVrdGb9Z86jurVVCFqWWgs9L+lP3MhTx9aENKFOF8nF6XJrfWt6j5/h3uSNdXdjGHx8MHjKVaSavffOZpWkl/NR2+/Rtvu+pw1pHmeJwmJm3viUmlZPSVkln0Zo+tD5QTw2GjSnRTqaNotu6tmrvoPf6Xtk1rM3Mae62Mjit0lJ1N7p6cY75e2ik82nwZ8J3zqQhvdGm1mlaz1R29B5TON4tvXrNnc7detQknCbairJPNW2ciOPrc738NY9YhFZtauw4n5wb71T5Kn7SNjc/dluN03aebT4HwnxvldinUpwvq0r+x9pyzrN6zMTyzXKgA9TQAAAAAAAAAAAAAAAAAAPobgeX4X09P9aPbdJXtfPYeJbgeX4X09P9aPa97SblbN8ICcrX1vK+ohTqqTtaS4c42RYlm82VzU7O1r3yu8rcHBrAzOrorVJq9skTg7q+a5GUNVv6Olh79n4nSwNjRFjXkq22F9mexW/cnRjU/Hb/AGtgW2CiHHlY0eVgNEi6Udi6ES0eVjR5WBHeY7F0IbzHYuhEtHlY0eVgYVKOxdCJW5TGjysaPKwM25RblMaPKxo8rAzblFuUxo8rGjysDNuUW5TGjysaPKwM25RblMaPKxo8rAzblFuUxo8rGjysDNuUW5TGjysaPKwM25RblMaPKxo8rApliEuCfqjcnGpdXtLoz6A5eFa+Vtd0QdTJvO99Wks+UC2ckk23ZLNsq7oyT0Z5q9rZkJVpKcUleLWctJZa+Do6TMq9lJ55OyzzfLbYBbCrpX1q21WFV5SWfit3tkUyrtOmrNqd9J3yjle75ydaXgTs9UXw8gHhyxU7JXWXIuwjKtJu7fsRWAJabG+PaRAFjrzatfIiqj2kQBs0sfVgrRm0vURr4ypUVpycks+AoBPHLYkAAVQAAAAAAAAAAAA
AAAAAAfQ3A8vwvp6f60e172k5Su89a4DxTcDy/C+np/rR7Uoy0m3LwdlgMuaTd2lq1lc1StaUlrvnLhJ1FtjpL1fuVuEXrpfpAreHo28a3A3pZ35egk6NGTtdXWVtLZ/PYTcU73pa3d6u0aKvfes9uXaBX3PRs/CTX5tXN7RvFF8PBxv5sZPQj5n2R4P/AGNCPmtX5QKo0KMbrT15Zyvs7DLoUUr6WW3S/m1Fm9xvfec9vgmN6j5n3dvIgKU8Pm9NZq2szBYe91PNW/E3bUk7dBZvEPMLoiSjTitVFL1R5/2QFFGnQnL7OV3bUpP+cJesFC3D0inCMHeNHRfJZfuWb4+JLpXaBCeDhLXpdNuBL9kYeChnrz2Nlm+PiS6V2jfHxJdK7QIPBwe3VbW/5wiWCptttO72OxPfHxJdK7Rvj4kuldoE4QUVZEirfHxJdK7Rvj4kuldoFoKt8fEl0rtG+PiPpXaBaCrfXxH0rtG+viPpXaBaCrfXxH0rtG+viPpXaBaCrfHxJdK7Rvj4kvZ2gWgq318SXs7Rvr4kvZ2gWWFlsK99fEl7O0b6+JL2doFlhZbCvfXxJeztG+viS9naBZZFeIX2c/yv3DfHxH7O0hWm3CS0H4r2bOcDwgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAH0NwPL8L6en+tHtKlLSaa8Hgf8AGeLbgeX4X09P9aPbnqAor4nRhpaLeeja1m3ey17XZes1++as2oS/p/qyWj06SN6Uktbtd29b1FKxtLzkemz6AK1jfBlLRVk0l4Wu8dJXustaJQxkW6at499WpbL89mSWMpWupxe22dr5Z2JQxNOXizi9SyafMBqR3TvbwFqv43Ins5c9hKW6STtor1SvwJ5ZG061PjR6eWxjumnx49KA1obpJtJwyds07paTSSex5/y5ivui6bd6astLVJ/ht/Tl4xtd0U+NHpDxNPjx6QKIY1ybSgna2qT4basuU3SnuimvxRyy1oz3RC19ONucC0FPdNPjrp2ax3TT46y15gXAqWIg9U49Jnf4cZZ8v82gWAp7qp8ePSHiaa1zj0gXAp7qp8ePSZeIhn4Sy15gWmCrumnx49Jl4iHHj0gWAq7pp2vpxtzme6IcePSBYCt4iC1zivWO6IXtpK97dOoCwFfdEOPHpMLFU2r6cekC4GE7q6MgAAAAAAhW8SXM/cTIVvFlzP3AeBgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA+huB5fhfT0/wBaPbnqPEdwPL8L6en+tHtz1AQrXtlDTd9WS9eZrb3rXc8NX9O21tRulb07ZaN9L2X94GtaSvbDxz15xMqMksqEE2881wPX7yVqy/FDk5yVqtneUVssuXh9VwKpRdvJ4PZmtt9n8uS0W1nQjzXXYTg6nDKm808r6uEjavwSp29ewDFpWadCNuDNWeez2kfC/wBPHpiWVFVtlKCd/U1bhy13uVR7pd05UU9iTb19gGzCjHW4RUms8l6zKowz8GOfJsNeEcRfw509FcVO+rhvwE2q3BKn0MC1UIL8EehDeIcSOfIip79d2lTtwXuPtuF09XLr4ALlRjxY9CMKjBaoxy5EVPfbeNT9ttfYPtrK0qfDfXyW/cC3eIXvoRvzIbxDiR6EVQVbhlDVwbf5YxHfdK2nTyTulr1Zeq4F28Q4kehGd6jn4Kz15aym1a2unfPgfqGlUTznD3bLfzlAtVGHEj0IKhBK2jG1raiiE6t7adN+97cuknarneUErNLneoCzeIWtoRtzIbxDiR6EV/a2ylB58urtI2r210+hgXOjB64xfqRneo69GN1qyRX9pfNwtb2291yM99t40E7+q1gLVRgtUYr1IbzDix6EVPfeNT1Z6zM1V/DKF7cKy1cFgL0jJhas9ZkAAAAAAEK3iy5n7iZCt4kuZ+4DwMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAH0NwPL8L6en+tHtz1HiO4Hl+F9PT/Wj256gMlOKS0f
Ci5q+pe8uKpVJWuoNu9rXSutoGtTVO6tSmtFK2T4NXCVwhSurUp+tPK/rNqdapwU7/wC63CI1pu/2bWzPXmBRvdJxzpTyy1Zrh26szCVJtN0pq/g5p8Pg7dhsurU8K1PUrq8tbtqKZYuslfueT5pICuCpa1Rnmtj4ctvIJwpvXSqZbFr4Npb3TW8w9fHVydOtVabdLRd8lpK+rNgUyp07NbzPK+rhvLnM6NN6UnSnw3y13eeVy6FabdnTaW26Y3+pb7p6+MgNZxpeaneztk87Ln5DOjTsvsqng5cN/fnrNmVWd3anln+JZ7DCrT80+stgFChTjkqU7a9T7eUxamn91O715PbfaXuvU4KT9cltMyr1OCk3lfxl0AUb3Sye9Tz1ZN29uWsxWpUpq8qU2lz8F0nZPM2VVndfZvVm9JZZfxGN+qeatzyWy/vyA0+5aFrbzO2lqs9dtevkJyhSejejPJZZPKyNl15+ab/3LaFVqW+7/wCwFUdz6Mk5KDi5J3zafhKzMPcmhxPa+H1l8qs8rU75cZZchiFebavSaW26e3+wGZYOm7XjqVlm8kZ7khe+jnq1vZYiq1Sye9O+zSWRjuipe28vrIA8DSf4fayXckLt6Ou983w6zG/VL/dPn0kJVqi1Ur/7kBlYWHF2rpv2slSoRhfRVr6yDrVPNf8AYOtU81/2QF5kqozlK+lDR9dy0AAAAAAEK3iS5n7iZCt4kuZ+4DwMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAH0NwPL8L6en+tHtz1HiO4Hl+F9PT/AFo9ueoDJTiWtHObguMidSejbJu7SyV9Zq93xazp1OZwAjeNr90Ste2seDr7olsefCSWMg07Up5Z5wtraWXSSjiYNP7OSUeBwtwPVt1AQlo6V9/kr6lfLX/EWrDSs06s3fhyy6Bv8MvAeq/i8/RqHda4s+gDCwjV7Vaivy395l4WWVqs1678N/3MxxcXbwZZu3i83aJYtL8M7fl5Wv2ALDvK9Sbty6+cx3K/OT6TPdSvbRn1f5sMd1rizvst++oB3K7332fSZ7nd775LXqEcUnfwZZX4NmzpEcWm7aM1/t2gYeHlZ/ayz9nMO5pedlazVuflM91q19GfVtt7B3WrX0Z8nggI4Zq/2k3zvVlYj3LLz0/YSWKVr6Mlle1uVL9wsWuLO1r3sBl0Hn9pLN9HMYeGfnJ9JmOKTV1GWu2oxDFp28GavtiAWGeik6km0734dVhLDyf+bNZLZwGFjE/wz2+KI4yL1Rn1QMxw8k098lk9T9zI9xvztRrgVyTxSsnozz/pMRxibS0Z5uy8H+WAzLDyf+ZJZW/uYeGlwVJmY4pP8Ml6hLFJa4ytyK/BcBGhK6bqS5uAx3K8/tZ3ate/ttqJSxNvwyte10ufsMTxSWuMujm7QLzJCFTSbVnltRMAAAAAAEK3iS5n7iZCt4kuZ+4DwMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAH0NwPL8L6en+tHtz1HiO4Hl+F9PT/Wj256gIVlK3gyUXy8xSlVaf2kM1k0tXKVbtY+jhqO+V4ylBSStFXd3q4T4K+V+5/mqud/wLhVn+IDpLVM/tI5rZqdrZevMR3yz8OF3aztl0HNfS7c/zVXh/AuHX+LkMv5YYC997rareKtXWKOjW+PVUhda8h9pb7yGvZyHOr5Y4Bf5dbWn4q4NX4iK+Vu561Uq3V//AEB0l6jirVIatdrp/wAVhLTu/tYrPJZcuXu6DnH8r9z2rOnWtnlorh1/iH0u3P8AN1uqviA6TRqvVOPR0jQrcePQc/T+W2CirRhWS/IviJfTvB8Wt1F8QHQShU0m1NWvqaI6NbjR6Og+D9OsHxa3UXxD6dYPi1uoviIOgjGrwyjrXBwXz/YxCFVPOcWubM+B9OsHxa3UXxD6dYPi1uoviA+7oV+PDqmdCtbx43vrtwZ+3UfB+nWD4tbqL4h9OsHxa3UXxAfecK1/HjbP8PQZcat3aUb
cGWZ8D6dYPi1uoviH06wfFrdRfEB99Rq5+HG/BkFCrn4cb3y8HgPgfTrB8Wt1F8Q+nWD4tbqr4gPvONbK046lfLpMuFXK01y3XLn7D4H06wfFrdVfEPp1g+LW6i+ID7uhW48ej3k1Crl4a155HP8A06wfFrdRfEPp3g+LW6i+ID7yp1rfeLq8j1+uxbRUkvDab2pWOc+neD4tbqL4h9O8Hxa3UXxAdOD5W43ygo41zVFTTgk3pJLXe1s3sZ9UAAAAAAEK3iS5n7iZCt4kuZ+4DwMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAH0NwPL8L6en+tHtz1HiO4Hl+F9PT/Wj256gOc+Xv8Ah79JD3nnB6R8vPIH6SHvPNy4gACoAAAAABfgcLv1TQ0lHwZScmr2UYuTy5kUG1ubiI0qulO+i4Ti9FXa04Sjeza2gSnubNqDpPfo1G1Fxi07xtpJxerWnzEIbnV5SlBUajlDxlou6vqN/C7rU6MVRjGTpOM1KUoxcnKejnoO6stBK187sxX3RpVae9VHU0YzjOEoQhF2jDR0dFO0VsedijTludV1whOaUIzbUWraSvwmXuZVc3GnCdS0YydoNW0oppZ8/rN1bq0ZJKpGcoKnCLpuMXFyhT0dJSveLvwrg4DNbdSjVSjNVIqM6dROKi23CnGDTu1bxbp8uog+bHAVnTdRUpuCveWi7ZOz6LO5ZgcFGspt1lBwhKck4OXgxtd3XPqN2W7UZVaVSUJLR39yStb7ZzatzaSvzHz8DiFTVZNN75QnTVuBytZvkyAnLcutdaNKcoyzhLQa0la6dtlsyENzq8pSiqNRyh4y0XdchvU91oKpJuL0ZYaFFtxjNpwUc9GWTV46uUuju3FrRk5eDUjOE94pSeUFFLRbtG2irNMK+ECU5Xk29bbb9ZEIAAAAAAAA7D5ufvcV+Wn75ndnCfNz97ivy0/fM7smtAAIAAAEK3iS5n7iZCt4kuZ+4DwMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAH0NwPL8L6en+tHtz1HiO4Hl+F9PT/AFo9ueoDnPl55A/SR955wej/AC9/w9+kj7zzguJoACoAAAAAAAAA3dyKcZYiCkk1aWT1ZRbPr7zDiQ6qDn16mc7Nc2DraG5e+RvCEG29Wilqtw2twko7jt5b3C/CrRLE/Ln+OQB1tTcpxi5OlCyV9Uf5wFNbCRg0nCGaT8VcKvsEPy45gHSww9N3ThHxZfhXBFs5pEa47zr4AAGwAAAAAAAHYfNz97ify0/fM7s4T5ufvcV+Wn75ndk1cAARQAACFbxJcz9xMhW8SXM/cB4GAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD6G4Hl+F9PT/Wj256jxHcDy/C+np/rR7c9QHOfL3/AA9+kh7zzg9H+Xv+Hv0kfeecGsTQABAAAACWS4OAsEQZutntCa2e0QX4HE71VjO17Xy501+5v99ocWXsNfHYqlOnCMIWktf86DSus8vaTna59cZ18vrLdmOye3X/AHJR3Yi3+Nev+58e62e0RklbI0n4uX1nuzHhU+ldoe7EHrjN9HafJussvaYutgPxcvrrdiCvaEr2a4OFNfufHLsPUhGac4aUeFXtfLaQlJXbtr1chPuO3PpZzx5ZqAJXWWXtMJrZ7REYBl2t0mCAAAAAA7D5ufvcV+Wn75ndnCfNz97ify0/fM7smrgACKAAAQreJLmfuJkK3iS5n7gPAwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAfQ3A8vwvp6f60e3PUeI7geX4X09P9aPbnqA535eeQP0kfeebno/y88gfpI+884NYmgACPvUMHS3mi3CLcqak29rLaWApydlCC1vNZZK79xqUd06apU4NSvCCi7K97essjurTi7pzi1yWfvK8vXnW69xbNJ0Y5uydlbXYPcfJfYroWy/7mr39XnKv8d9pl7uJW+0q5r+cPIU/b+rXuXBad6cE4JNprbq95V3HS8
3HoIPdqGa0qmdr5a7ar5kO+lL+voXaE3z/rU3YoRpzhoJJOF3bbdr9jQNzdPFxrSg4p2jDRz52/3NMy9PFmUAAaAAAPtbn4WnLDxlKCcnKV2+S1j4p9LB7pRp0o05Rlk27q3Dbl5C4x6lns+jSwFOUoxVOF5NJZbWXy3EstJUoOPGyS1X4c9R87vtBWejUXCskv3Jd/Ftq7df9yuGef3X0XuHa63qLs8tW1r9iC3IjpaLpwT0XJXs7pc3DkzS79q171OnP38pHv3H/wCTVbWtWzWF/f8Aq/uSl5uPQaO6+HhCNJxiouWle3Jo295b32p8Wfs7TV3Rx0a0YRjFrQvr4b27CLx5+Xu0QAR6HYfNz97ify0/fM7s4T5ufvcT+Wn75ndk1cAARQAACFbxJcz9xMhW8SXM/cB4GAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD6G4Hl+F9PT/AFo9ueo8R3A8vwvp6f60e3PUBzny88gfpIe884PR/l7/AIe/SQ95zPyc3JoV6E51YuUlU0V4TWWinwc5rE1zxmKv7zsO8mCu1oO62zmllrs75hbi4F8C/wCSXaByeHqRhNSs3Zl+6OLjWqOSjo8isjpo7i4J6op3yyqPtJQ3AwctVN5O3jS4NfCT2tSONy5RdbHq4NpfulRjTxFWnHxYTlFcydka8Vdo1RZQqqE4y0VK2tSzTyI1JRlJu1ru9lqS2I7Wp8ncJFtb28v65dpr96MDZtwaS2ymnqvqvsJ9108t8fFyDtZ69Zg7F7kYFcC/5Jc20zHcXBS8WOk7Xspybt0luMRxoO0huDhJJSVN2auvClw+s5XdOhGliKtOPiwm4rhyTINUA7vE7k4Sm7dzxedkv7t8gRwhNNK2VzsFhcE7WoxzV9Tyvkr+szLCYLzMcsvFlwZirHN47dCNWMFoW0VbZ/P7GimssufPWdksDg3FSVCLTko3s1r5+DM+f8ptz6NKhTlTpqEnU0XbZotk59vZJHO6SyyyVzCa2e0wdJ8mNz6NahUlVpqclU0U3wLRT/c1R8KWITp6GhFeFfS/FzcxVk75WOxWCwVm3QistLU7tWvdWJvA4Jf5UVfLxZfzgfQyZHTvrepfpxIO1jgcE9VOGrS1NZbSnHbm4Z4OrVp0oxap6UXwrNcoYjPzc/e4n8tP3zO7OE+bn73Fflp++Z3ZNUABAAAAhW8SXM/cTIVvElzP3AeBgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA+huB5fhfT0/1o9ueo8R3A8vwvp6f60e3PUBzny9/w9+kh7zmPk9uxRw9CVOq2pOppK0W1bRS4OY6j5eeQP0kfeeblxNddLdvAttuUs/6Z+u2y/DtMd+cDxpbfFn/ADZ0LYckCldct28DdO7utT0J8+zlLKfyhwcb2nPN3fgTeb50caBCtndGvGriKtSN9Gc5SV8nZu6NdPNMwAjsp/KnDSbdqmf9K7Sjv7g+JU6vJbbsOUAWur7/AGD4lTq/3M9/8Hn4NTNWfg610nJgQrsF8pcMlZKpl/T/AHOY3RxCq4irUimozm5K+uzZrAAdPU+VkZX0sPe+taat7jmAEdK/lRB68Nr/AKl8I+lMP9P/ANl8JzQItdL9KYf6fl8dfCaO7O7axVOEFT0NGeldyvwWtq5T5AKUPtbhbtQw1KcJwm9KeknG3FStm1sPigI6j6R4XzE+rDh18JlfKTDXvvE7/lh2nLAi11S+U2HTuqVRPV4se0pxvyiozw9SlCnUTnHRV1FJZrYzmwCuw+bn73Fflp++Z3Zwnzc/e4r8tP3zO7GgACKAAAQreJLmfuJkK3iS5n7gPAwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAb+4Hl+F9PT/Wj296jxDcDy7C+np/rR7e9QHOfLzyB+kh7zzg9H+XvkD9JD3nnBcTQAFQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAdh83P3uK/LT98zuzhPm5+9xX5afvmd2TWgAEAAAC
FbxJcz9xMhW8SXM/cB4GAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADf3A8uwvp6f60e3vUeD4PESo1adWKTlTmppPVeLur8mR1P1jY3zWH6s/jA9Hx9OlKFq8IzhdZSjpK/BkaXe/AZf+PQz1fZLsOF+sbG+aw/Vn8Y+sbGeaw3Vn8YHddw7n5/YUMtf2S7OUytz8A9VChrt90uHVwchwn1i4zzWG6k/jH1jYzzWG6k/jA7lYLc/wD09Dl+yWXsM978Da6w1Fr0S4fVyHCfWLjPNYbqT+Mz9Y2M81hurP4wO6WB3Pa0t4oW270uTk5UYlgcAteHof8AEuw4b6xcZ5rDdSfxh/OLjH/lYbqT+MDu3udgFf8A8ehlr+yXYYWBwD1Yei//AKl2HC/WLjPNYbqT+MfWLjPNYbqT+MDuo4HAO/8A49HJXf2Sv7jHcW5/mKH/ABLsOG+sbGeaw3Un8Y+sXGeaw3Un8YHcVMJudHKVHDrK/wB0uTk5UYlhtzla9Ghnq+yWedthw6+cTGeZw3Un8Y+sTGa95w1/yT+MDuZYPAKeg8NR0s/8pcHq5CEaG5zv9hRSWu9JbUtnKcV9Y2M81hupP4zD+cTGeZw3Un8YHbbzubl9jQz1fZLsLnudgFrw9DXb7pcHqOD+sTGXvvOGv+SfxmfrGxnmsN1J/GB3TwO59r7xQte33S2X2GO4dz7N9z0LLX9ksvYcN9YuM81hupP4x9YuM81hupP4wO67gwFr9z0LXt90uD1CW5+AWvD0ddvulweo4X6xcZ5rDdSfxj6xsZ5rDdSfxgd08DgF/wDz0P8AiXYTp7mYGbajh6Dazf2Uew4L6xcZ5rDdSfxj6xsZ5rDdSfxgeg95MJ/paH/HHsHeTCf6Wh/xx7Dz76x8b5rD9Wfxj6x8b5rD9Wfxgej4XAUaLbpUqdNy16EVG9tV7c5snl/1j43zWH6s/jH1j43zWH6s/jA9QB5f9Y+N81h+rP4x9Y+N81h+rP4wPUAeX/WPjfNYfqz+MfWPjfNYfqz+MD1AhW8SXM/ceZfWPjfNYfqz+MxL5xcY01vWHzVvFn8YHIgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD//Z",
+ "text/html": [
+ "\n",
+ " \n",
+ " "
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "execution_count": 3,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('_kAjJ8rJwfU', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "faaa7b6b-4339-4f7f-ac74-ce3c67306fe9",
+ "metadata": {},
+ "source": [
+ "We are going to be working with **COVID data** from the 58 counties of California during Summer 2020 (July, August, and September). \n",
+ "\n",
+ "**Remember the complete dataset with 58 counties from the previous video of this workshop?** \n",
+ "\n",
+ "Let's now imagine that we did not know the **cases per 100,000 people** for the last 18 counties of the dataset. \n",
+ "\n",
+ "![Features-for-Prediction.jpg](images/Features-for-Prediction.jpg)\n",
+ "\n",
+ "**The objective** of this exercise will be to make predictions for these missing values in the column **cases per 100,000 people** based solely on the data that we do have available.\n",
+ "\n",
+ "The information that we still have available for these 18 counties is:\n",
+ "\n",
+ "* Population\n",
+ "* Vaccination Percentage (partially and fully vaccinated)\n",
+ "* Unemployment Rates\n",
+ "* Partisan Voting Percentage (Democrat, Green, Republican, Libertarian, and Other)\n",
+ "\n",
+ "To do this, we will create a **DECISION TREE**."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "347373f9-cd3b-409e-87f6-2237f7ab6a5c",
+ "metadata": {},
+ "source": [
+ "## WHAT IS A DECISION TREE?\n",
+ "\n",
+ "A **Decision Tree** is a supervised machine learning model that makes predictions by learning simple decision rules inferred from the information available in the dataset. \n",
+ " \n",
+ "- A Decision Tree is called a **supervised** model because we know exactly what we want to figure out. For example, for our Decision Tree, we will specify that we want to figure out the missing values of the column **cases per 100,000 people**, and our model will try to find these values by making predictions for them using the information we do have available.\n",
+ "\n",
+ "- In contrast, in an **unsupervised** model, we do not know exactly what we want to predict. Instead, an unsupervised model finds hidden relationships between different types of information and can group them based on similarities. For example, a streaming service such as Netflix may group similar shows or viewers together, which is one way it can surprise you with a new show you like.\n",
+ "\n",
+ "A **Decision Tree** can be pictured as a tree-like flowchart, where we start with a particular criterion and, based on whether it is True (Y for Yes) or False (N for No), we choose only one of the branches. This process is then repeated at every decision until we reach the bottom of the tree, where we end up with a specific prediction. \n",
+ "\n",
+ "![General-Decision-Tree.png](images/General-Decision-Tree.png)\n",
+ "\n",
+ "We will see how a Decision Tree can help us predict the missing **cases per 100,000 people** in more detail later on in this tutorial.\n",
+ "\n",
+ "You can find more information about different ways to classify machine learning models here: [Machine Learning Models](https://www.geeksforgeeks.org/introduction-machine-learning/?ref=lbp)\n",
+ "\n",
+ "You can find more information about Decision Trees here: [Scikit-learn](https://scikit-learn.org/stable/modules/tree.html)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "id": "d96c650c-4a96-486d-99db-16dba7ed9317",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ ""
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "application/javascript": [
+ "var questionsMpQrXvjLDoZX=[\n",
+ " {\n",
+ " \"question\": \"Which of the following statements is the best description of the project you are working on here?\",\n",
+ " \"type\": \"multiple_choice\",\n",
+ " \"answers\": [\n",
+ " {\n",
+ " \"answer\": \"We are creating a model to better understand what determines the number of COVID cases in a county.\",\n",
+ " \"correct\": false,\n",
+ " \"feedback\": \"While we may gain insights into what is correlated with COVID cases in a county, we will not know for certain what determines it. Please try again.\"\n",
+ " },\n",
+ " {\n",
+ " \"answer\": \"We are creating a model to be able to predict the number of COVID cases in a county.\",\n",
+ " \"correct\": true,\n",
+ " \"feedback\": \"Correct. Given data such as population, vaccination rates, and other information, our Decision Tree will predict the number of COVID cases for that county.\"\n",
+ " },\n",
+ " {\n",
+ " \"answer\": \"We are learning about the biology of COVID transmission.\",\n",
+ " \"correct\": false,\n",
+ " \"feedback\": \"There is no knowledge of disease transmission in this project. Please try again.\"\n",
+ " },\n",
+ " {\n",
+ " \"answer\": \"We are trying to determine whether SF county had more or fewer COVID cases than LA county.\",\n",
+ " \"correct\": false,\n",
+ " \"feedback\": \"We can verify this from the datasets themselves, but this is not our aim in this project. Please try again.\"\n",
+ " }\n",
+ " ]\n",
+ " }\n",
+ "];\n",
+ " // Make a random ID\n",
+ "function makeid(length) {\n",
+ " var result = [];\n",
+ " var characters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';\n",
+ " var charactersLength = characters.length;\n",
+ " for (var i = 0; i < length; i++) {\n",
+ " result.push(characters.charAt(Math.floor(Math.random() * charactersLength)));\n",
+ " }\n",
+ " return result.join('');\n",
+ "}\n",
+ "\n",
+ "// Choose a random subset of an array. Can also be used to shuffle the array\n",
+ "function getRandomSubarray(arr, size) {\n",
+ " var shuffled = arr.slice(0), i = arr.length, temp, index;\n",
+ " while (i--) {\n",
+ " index = Math.floor((i + 1) * Math.random());\n",
+ " temp = shuffled[index];\n",
+ " shuffled[index] = shuffled[i];\n",
+ " shuffled[i] = temp;\n",
+ " }\n",
+ " return shuffled.slice(0, size);\n",
+ "}\n",
+ "\n",
+ "function printResponses(responsesContainer) {\n",
+ " var responses=JSON.parse(responsesContainer.dataset.responses);\n",
+ " var stringResponses='IMPORTANT! To preserve this answer sequence for submission, when you have finalized your answers: (1) Copy the text in this cell below \"Answer String\". (2) Double click on the cell directly below the Answer String, labeled \"Replace Me\". (3) Select the whole \"Replace Me\" text. (4) Paste in your answer string and press Shift-Enter. (5) Save the notebook using the save icon or the File->Save Notebook menu item. Answer String: ';\n",
+ " console.log(responses);\n",
+ " responses.forEach((response, index) => {\n",
+ " if (response) {\n",
+ " console.log(index + ': ' + response);\n",
+ " stringResponses+= index + ': ' + response +\" \";\n",
+ " }\n",
+ " });\n",
+ " responsesContainer.innerHTML=stringResponses;\n",
+ "}\n",
+ "function check_mc() {\n",
+ " var id = this.id.split('-')[0];\n",
+ " //var response = this.id.split('-')[1];\n",
+ " //console.log(response);\n",
+ " //console.log(\"In check_mc(), id=\"+id);\n",
+ " //console.log(event.srcElement.id) \n",
+ " //console.log(event.srcElement.dataset.correct) \n",
+ " //console.log(event.srcElement.dataset.feedback)\n",
+ "\n",
+ " var label = event.srcElement;\n",
+ " //console.log(label, label.nodeName);\n",
+ " var depth = 0;\n",
+ " while ((label.nodeName != \"LABEL\") && (depth < 20)) {\n",
+ " label = label.parentElement;\n",
+ " console.log(depth, label);\n",
+ " depth++;\n",
+ " }\n",
+ "\n",
+ "\n",
+ "\n",
+ " var answers = label.parentElement.children;\n",
+ "\n",
+ " //console.log(answers);\n",
+ "\n",
+ "\n",
+ " // Split behavior based on multiple choice vs many choice:\n",
+ " var fb = document.getElementById(\"fb\" + id);\n",
+ "\n",
+ "\n",
+ "\n",
+ "\n",
+ " if (fb.dataset.numcorrect == 1) {\n",
+ " // What follows is for the saved responses stuff\n",
+ " var outerContainer = fb.parentElement.parentElement;\n",
+ " var responsesContainer = document.getElementById(\"responses\" + outerContainer.id);\n",
+ " if (responsesContainer) {\n",
+ " //console.log(responsesContainer);\n",
+ " var response = label.firstChild.innerText;\n",
+ " if (label.querySelector(\".QuizCode\")){\n",
+ " response+= label.querySelector(\".QuizCode\").firstChild.innerText;\n",
+ " }\n",
+ " console.log(response);\n",
+ " //console.log(document.getElementById(\"quizWrap\"+id));\n",
+ " var qnum = document.getElementById(\"quizWrap\"+id).dataset.qnum;\n",
+ " console.log(\"Question \" + qnum);\n",
+ " //console.log(id, \", got numcorrect=\",fb.dataset.numcorrect);\n",
+ " var responses=JSON.parse(responsesContainer.dataset.responses);\n",
+ " console.log(responses);\n",
+ " responses[qnum]= response;\n",
+ " responsesContainer.setAttribute('data-responses', JSON.stringify(responses));\n",
+ " printResponses(responsesContainer);\n",
+ " }\n",
+ " // End code to preserve responses\n",
+ " \n",
+ " for (var i = 0; i < answers.length; i++) {\n",
+ " var child = answers[i];\n",
+ " //console.log(child);\n",
+ " child.className = \"MCButton\";\n",
+ " }\n",
+ "\n",
+ "\n",
+ "\n",
+ " if (label.dataset.correct == \"true\") {\n",
+ " // console.log(\"Correct action\");\n",
+ " if (\"feedback\" in label.dataset) {\n",
+ " fb.textContent = jaxify(label.dataset.feedback);\n",
+ " } else {\n",
+ " fb.textContent = \"Correct!\";\n",
+ " }\n",
+ " label.classList.add(\"correctButton\");\n",
+ "\n",
+ " fb.className = \"Feedback\";\n",
+ " fb.classList.add(\"correct\");\n",
+ "\n",
+ " } else {\n",
+ " if (\"feedback\" in label.dataset) {\n",
+ " fb.textContent = jaxify(label.dataset.feedback);\n",
+ " } else {\n",
+ " fb.textContent = \"Incorrect -- try again.\";\n",
+ " }\n",
+ " //console.log(\"Error action\");\n",
+ " label.classList.add(\"incorrectButton\");\n",
+ " fb.className = \"Feedback\";\n",
+ " fb.classList.add(\"incorrect\");\n",
+ " }\n",
+ " }\n",
+ " else {\n",
+ " var reset = false;\n",
+ " var feedback;\n",
+ " if (label.dataset.correct == \"true\") {\n",
+ " if (\"feedback\" in label.dataset) {\n",
+ " feedback = jaxify(label.dataset.feedback);\n",
+ " } else {\n",
+ " feedback = \"Correct!\";\n",
+ " }\n",
+ " if (label.dataset.answered <= 0) {\n",
+ " if (fb.dataset.answeredcorrect < 0) {\n",
+ " fb.dataset.answeredcorrect = 1;\n",
+ " reset = true;\n",
+ " } else {\n",
+ " fb.dataset.answeredcorrect++;\n",
+ " }\n",
+ " if (reset) {\n",
+ " for (var i = 0; i < answers.length; i++) {\n",
+ " var child = answers[i];\n",
+ " child.className = \"MCButton\";\n",
+ " child.dataset.answered = 0;\n",
+ " }\n",
+ " }\n",
+ " label.classList.add(\"correctButton\");\n",
+ " label.dataset.answered = 1;\n",
+ " fb.className = \"Feedback\";\n",
+ " fb.classList.add(\"correct\");\n",
+ "\n",
+ " }\n",
+ " } else {\n",
+ " if (\"feedback\" in label.dataset) {\n",
+ " feedback = jaxify(label.dataset.feedback);\n",
+ " } else {\n",
+ " feedback = \"Incorrect -- try again.\";\n",
+ " }\n",
+ " if (fb.dataset.answeredcorrect > 0) {\n",
+ " fb.dataset.answeredcorrect = -1;\n",
+ " reset = true;\n",
+ " } else {\n",
+ " fb.dataset.answeredcorrect--;\n",
+ " }\n",
+ "\n",
+ " if (reset) {\n",
+ " for (var i = 0; i < answers.length; i++) {\n",
+ " var child = answers[i];\n",
+ " child.className = \"MCButton\";\n",
+ " child.dataset.answered = 0;\n",
+ " }\n",
+ " }\n",
+ " label.classList.add(\"incorrectButton\");\n",
+ " fb.className = \"Feedback\";\n",
+ " fb.classList.add(\"incorrect\");\n",
+ " }\n",
+ " // What follows is for the saved responses stuff\n",
+ " var outerContainer = fb.parentElement.parentElement;\n",
+ " var responsesContainer = document.getElementById(\"responses\" + outerContainer.id);\n",
+ " if (responsesContainer) {\n",
+ " //console.log(responsesContainer);\n",
+ " var response = label.firstChild.innerText;\n",
+ " if (label.querySelector(\".QuizCode\")){\n",
+ " response+= label.querySelector(\".QuizCode\").firstChild.innerText;\n",
+ " }\n",
+ " console.log(response);\n",
+ " //console.log(document.getElementById(\"quizWrap\"+id));\n",
+ " var qnum = document.getElementById(\"quizWrap\"+id).dataset.qnum;\n",
+ " console.log(\"Question \" + qnum);\n",
+ " //console.log(id, \", got numcorrect=\",fb.dataset.numcorrect);\n",
+ " var responses=JSON.parse(responsesContainer.dataset.responses);\n",
+ " if (label.dataset.correct == \"true\") {\n",
+ " if (typeof(responses[qnum]) == \"object\"){\n",
+ " if (!responses[qnum].includes(response))\n",
+ " responses[qnum].push(response);\n",
+ " } else{\n",
+ " responses[qnum]= [ response ];\n",
+ " }\n",
+ " } else {\n",
+ " responses[qnum]= response;\n",
+ " }\n",
+ " console.log(responses);\n",
+ " responsesContainer.setAttribute('data-responses', JSON.stringify(responses));\n",
+ " printResponses(responsesContainer);\n",
+ " }\n",
+ " // End save responses stuff\n",
+ "\n",
+ "\n",
+ "\n",
+ " var numcorrect = fb.dataset.numcorrect;\n",
+ " var answeredcorrect = fb.dataset.answeredcorrect;\n",
+ " if (answeredcorrect >= 0) {\n",
+ " fb.textContent = feedback + \" [\" + answeredcorrect + \"/\" + numcorrect + \"]\";\n",
+ " } else {\n",
+ " fb.textContent = feedback + \" [\" + 0 + \"/\" + numcorrect + \"]\";\n",
+ " }\n",
+ "\n",
+ "\n",
+ " }\n",
+ "\n",
+ " if (typeof MathJax != 'undefined') {\n",
+ " var version = MathJax.version;\n",
+ " console.log('MathJax version', version);\n",
+ " if (version[0] == \"2\") {\n",
+ " MathJax.Hub.Queue([\"Typeset\", MathJax.Hub]);\n",
+ " } else if (version[0] == \"3\") {\n",
+ " MathJax.typeset([fb]);\n",
+ " }\n",
+ " } else {\n",
+ " console.log('MathJax not detected');\n",
+ " }\n",
+ "\n",
+ "}\n",
+ "\n",
+ "function make_mc(qa, shuffle_answers, outerqDiv, qDiv, aDiv, id) {\n",
+ " var shuffled;\n",
+ " if (shuffle_answers == \"True\") {\n",
+ " //console.log(shuffle_answers+\" read as true\");\n",
+ " shuffled = getRandomSubarray(qa.answers, qa.answers.length);\n",
+ " } else {\n",
+ " //console.log(shuffle_answers+\" read as false\");\n",
+ " shuffled = qa.answers;\n",
+ " }\n",
+ "\n",
+ "\n",
+ " var num_correct = 0;\n",
+ "\n",
+ "\n",
+ "\n",
+ " shuffled.forEach((item, index, ans_array) => {\n",
+ " //console.log(answer);\n",
+ "\n",
+ " // Make input element\n",
+ " var inp = document.createElement(\"input\");\n",
+ " inp.type = \"radio\";\n",
+ " inp.id = \"quizo\" + id + index;\n",
+ " inp.style = \"display:none;\";\n",
+ " aDiv.append(inp);\n",
+ "\n",
+ " //Make label for input element\n",
+ " var lab = document.createElement(\"label\");\n",
+ " lab.className = \"MCButton\";\n",
+ " lab.id = id + '-' + index;\n",
+ " lab.onclick = check_mc;\n",
+ " var aSpan = document.createElement('span');\n",
+ " aSpan.className = \"\";\n",
+ " //qDiv.id=\"quizQn\"+id+index;\n",
+ " if (\"answer\" in item) {\n",
+ " aSpan.innerHTML = jaxify(item.answer);\n",
+ " //aSpan.innerHTML=item.answer;\n",
+ " }\n",
+ " lab.append(aSpan);\n",
+ "\n",
+ " // Create div for code inside question\n",
+ " var codeSpan;\n",
+ " if (\"code\" in item) {\n",
+ " codeSpan = document.createElement('span');\n",
+ " codeSpan.id = \"code\" + id + index;\n",
+ " codeSpan.className = \"QuizCode\";\n",
+ " var codePre = document.createElement('pre');\n",
+ " codeSpan.append(codePre);\n",
+ " var codeCode = document.createElement('code');\n",
+ " codePre.append(codeCode);\n",
+ " codeCode.innerHTML = item.code;\n",
+ " lab.append(codeSpan);\n",
+ " //console.log(codeSpan);\n",
+ " }\n",
+ "\n",
+ " //lab.textContent=item.answer;\n",
+ "\n",
+ " // Set the data attributes for the answer\n",
+ " lab.setAttribute('data-correct', item.correct);\n",
+ " if (item.correct) {\n",
+ " num_correct++;\n",
+ " }\n",
+ " if (\"feedback\" in item) {\n",
+ " lab.setAttribute('data-feedback', item.feedback);\n",
+ " }\n",
+ " lab.setAttribute('data-answered', 0);\n",
+ "\n",
+ " aDiv.append(lab);\n",
+ "\n",
+ " });\n",
+ "\n",
+ " if (num_correct > 1) {\n",
+ " outerqDiv.className = \"ManyChoiceQn\";\n",
+ " } else {\n",
+ " outerqDiv.className = \"MultipleChoiceQn\";\n",
+ " }\n",
+ "\n",
+ " return num_correct;\n",
+ "\n",
+ "}\n",
+ "function check_numeric(ths, event) {\n",
+ "\n",
+ " if (event.keyCode === 13) {\n",
+ " ths.blur();\n",
+ "\n",
+ " var id = ths.id.split('-')[0];\n",
+ "\n",
+ " var submission = ths.value;\n",
+ " if (submission.indexOf('/') != -1) {\n",
+ " var sub_parts = submission.split('/');\n",
+ " //console.log(sub_parts);\n",
+ " submission = sub_parts[0] / sub_parts[1];\n",
+ " }\n",
+ " //console.log(\"Reader entered\", submission);\n",
+ "\n",
+ " if (\"precision\" in ths.dataset) {\n",
+ " var precision = ths.dataset.precision;\n",
+ " // console.log(\"1:\", submission)\n",
+ " submission = Math.round((1 * submission + Number.EPSILON) * 10 ** precision) / 10 ** precision;\n",
+ " // console.log(\"Rounded to \", submission, \" precision=\", precision );\n",
+ " }\n",
+ "\n",
+ "\n",
+ " //console.log(\"In check_numeric(), id=\"+id);\n",
+ " //console.log(event.srcElement.id) \n",
+ " //console.log(event.srcElement.dataset.feedback)\n",
+ "\n",
+ " var fb = document.getElementById(\"fb\" + id);\n",
+ " fb.style.display = \"none\";\n",
+ " fb.textContent = \"Incorrect -- try again.\";\n",
+ "\n",
+ " var answers = JSON.parse(ths.dataset.answers);\n",
+ " //console.log(answers);\n",
+ "\n",
+ " var defaultFB = \"\";\n",
+ " var correct;\n",
+ " var done = false;\n",
+ " answers.every(answer => {\n",
+ " //console.log(answer.type);\n",
+ "\n",
+ " correct = false;\n",
+ " // if (answer.type==\"value\"){\n",
+ " if ('value' in answer) {\n",
+ " if (submission == answer.value) {\n",
+ " fb.textContent = jaxify(answer.feedback);\n",
+ " correct = answer.correct;\n",
+ " //console.log(answer.correct);\n",
+ " done = true;\n",
+ " }\n",
+ " // } else if (answer.type==\"range\") {\n",
+ " } else if ('range' in answer) {\n",
+ " //console.log(answer.range);\n",
+ " if ((submission >= answer.range[0]) && (submission < answer.range[1])) {\n",
+ " fb.textContent = jaxify(answer.feedback);\n",
+ " correct = answer.correct;\n",
+ " //console.log(answer.correct);\n",
+ " done = true;\n",
+ " }\n",
+ " } else if (answer.type == \"default\") {\n",
+ " defaultFB = answer.feedback;\n",
+ " }\n",
+ " if (done) {\n",
+ " return false; // Break out of loop if this has been marked correct\n",
+ " } else {\n",
+ " return true; // Keep looking for case that includes this as a correct answer\n",
+ " }\n",
+ " });\n",
+ "\n",
+ " if ((!done) && (defaultFB != \"\")) {\n",
+ " fb.innerHTML = jaxify(defaultFB);\n",
+ " //console.log(\"Default feedback\", defaultFB);\n",
+ " }\n",
+ "\n",
+ " fb.style.display = \"block\";\n",
+ " if (correct) {\n",
+ " ths.className = \"Input-text\";\n",
+ " ths.classList.add(\"correctButton\");\n",
+ " fb.className = \"Feedback\";\n",
+ " fb.classList.add(\"correct\");\n",
+ " } else {\n",
+ " ths.className = \"Input-text\";\n",
+ " ths.classList.add(\"incorrectButton\");\n",
+ " fb.className = \"Feedback\";\n",
+ " fb.classList.add(\"incorrect\");\n",
+ " }\n",
+ "\n",
+ " // What follows is for the saved responses stuff\n",
+ " var outerContainer = fb.parentElement.parentElement;\n",
+ " var responsesContainer = document.getElementById(\"responses\" + outerContainer.id);\n",
+ " if (responsesContainer) {\n",
+ " console.log(submission);\n",
+ " var qnum = document.getElementById(\"quizWrap\"+id).dataset.qnum;\n",
+ " //console.log(\"Question \" + qnum);\n",
+ " //console.log(id, \", got numcorrect=\",fb.dataset.numcorrect);\n",
+ " var responses=JSON.parse(responsesContainer.dataset.responses);\n",
+ " console.log(responses);\n",
+ " if (submission == ths.value){\n",
+ " responses[qnum]= submission;\n",
+ " } else {\n",
+ " responses[qnum]= ths.value + \"(\" + submission +\")\";\n",
+ " }\n",
+ " responsesContainer.setAttribute('data-responses', JSON.stringify(responses));\n",
+ " printResponses(responsesContainer);\n",
+ " }\n",
+ " // End code to preserve responses\n",
+ "\n",
+ " if (typeof MathJax != 'undefined') {\n",
+ " var version = MathJax.version;\n",
+ " console.log('MathJax version', version);\n",
+ " if (version[0] == \"2\") {\n",
+ " MathJax.Hub.Queue([\"Typeset\", MathJax.Hub]);\n",
+ " } else if (version[0] == \"3\") {\n",
+ " MathJax.typeset([fb]);\n",
+ " }\n",
+ " } else {\n",
+ " console.log('MathJax not detected');\n",
+ " }\n",
+ " return false;\n",
+ " }\n",
+ "\n",
+ "}\n",
+ "\n",
+ "function isValid(el, charC) {\n",
+ " //console.log(\"Input char: \", charC);\n",
+ " if (charC == 46) {\n",
+ " if (el.value.indexOf('.') === -1) {\n",
+ " return true;\n",
+ " } else if (el.value.indexOf('/') != -1) {\n",
+ " var parts = el.value.split('/');\n",
+ " return parts[1].indexOf('.') === -1;\n",
+ " }\n",
+ " else {\n",
+ " return false;\n",
+ " }\n",
+ " } else if (charC == 47) {\n",
+ " if (el.value.indexOf('/') === -1) {\n",
+ " if ((el.value != \"\") && (el.value != \".\")) {\n",
+ " return true;\n",
+ " } else {\n",
+ " return false;\n",
+ " }\n",
+ " } else {\n",
+ " return false;\n",
+ " }\n",
+ " } else if (charC == 45) {\n",
+ " var edex = el.value.indexOf('e');\n",
+ " if (edex == -1) {\n",
+ " edex = el.value.indexOf('E');\n",
+ " }\n",
+ "\n",
+ " if (el.value == \"\") {\n",
+ " return true;\n",
+ " } else if (edex == (el.value.length - 1)) { // If just after e or E\n",
+ " return true;\n",
+ " } else {\n",
+ " return false;\n",
+ " }\n",
+ " } else if (charC == 101) { // \"e\"\n",
+ " if ((el.value.indexOf('e') === -1) && (el.value.indexOf('E') === -1) && (el.value.indexOf('/') == -1)) {\n",
+ " // Prev symbol must be digit or decimal point:\n",
+ " if (el.value.slice(-1).search(/\\d/) >= 0) {\n",
+ " return true;\n",
+ " } else if (el.value.slice(-1).search(/\\./) >= 0) {\n",
+ " return true;\n",
+ " } else {\n",
+ " return false;\n",
+ " }\n",
+ " } else {\n",
+ " return false;\n",
+ " }\n",
+ " } else {\n",
+ " if (charC > 31 && (charC < 48 || charC > 57))\n",
+ " return false;\n",
+ " }\n",
+ " return true;\n",
+ "}\n",
+ "\n",
+ "function numeric_keypress(evnt) {\n",
+ " var charC = (evnt.which) ? evnt.which : evnt.keyCode;\n",
+ "\n",
+ " if (charC == 13) {\n",
+ " check_numeric(this, evnt);\n",
+ " } else {\n",
+ " return isValid(this, charC);\n",
+ " }\n",
+ "}\n",
+ "\n",
+ "\n",
+ "\n",
+ "\n",
+ "\n",
+ "function make_numeric(qa, outerqDiv, qDiv, aDiv, id) {\n",
+ "\n",
+ "\n",
+ "\n",
+ " //console.log(answer);\n",
+ "\n",
+ "\n",
+ " outerqDiv.className = \"NumericQn\";\n",
+ " aDiv.style.display = 'block';\n",
+ "\n",
+ " var lab = document.createElement(\"label\");\n",
+ " lab.className = \"InpLabel\";\n",
+ " lab.textContent = \"Type numeric answer here:\";\n",
+ " aDiv.append(lab);\n",
+ "\n",
+ " var inp = document.createElement(\"input\");\n",
+ " inp.type = \"text\";\n",
+ " //inp.id=\"input-\"+id;\n",
+ " inp.id = id + \"-0\";\n",
+ " inp.className = \"Input-text\";\n",
+ " inp.setAttribute('data-answers', JSON.stringify(qa.answers));\n",
+ " if (\"precision\" in qa) {\n",
+ " inp.setAttribute('data-precision', qa.precision);\n",
+ " }\n",
+ " aDiv.append(inp);\n",
+ " //console.log(inp);\n",
+ "\n",
+ " //inp.addEventListener(\"keypress\", check_numeric);\n",
+ " //inp.addEventListener(\"keypress\", numeric_keypress);\n",
+ " /*\n",
+ " inp.addEventListener(\"keypress\", function(event) {\n",
+ " return numeric_keypress(this, event);\n",
+ " }\n",
+ " );\n",
+ " */\n",
+ " //inp.onkeypress=\"return numeric_keypress(this, event)\";\n",
+ " inp.onkeypress = numeric_keypress;\n",
+ " inp.onpaste = event => false;\n",
+ "\n",
+ " inp.addEventListener(\"focus\", function (event) {\n",
+ " this.value = \"\";\n",
+ " return false;\n",
+ " }\n",
+ " );\n",
+ "\n",
+ "\n",
+ "}\n",
+ "function jaxify(string) {\n",
+ " var mystring = string;\n",
+ "\n",
+ " var count = 0;\n",
+ " var loc = mystring.search(/([^\\\\]|^)(\\$)/);\n",
+ "\n",
+ " var count2 = 0;\n",
+ " var loc2 = mystring.search(/([^\\\\]|^)(\\$\\$)/);\n",
+ "\n",
+ " //console.log(loc);\n",
+ "\n",
+ " while ((loc >= 0) || (loc2 >= 0)) {\n",
+ "\n",
+ " /* Have to replace all the double $$ first with current implementation */\n",
+ " if (loc2 >= 0) {\n",
+ " if (count2 % 2 == 0) {\n",
+ " mystring = mystring.replace(/([^\\\\]|^)(\\$\\$)/, \"$1\\\\[\");\n",
+ " } else {\n",
+ " mystring = mystring.replace(/([^\\\\]|^)(\\$\\$)/, \"$1\\\\]\");\n",
+ " }\n",
+ " count2++;\n",
+ " } else {\n",
+ " if (count % 2 == 0) {\n",
+ " mystring = mystring.replace(/([^\\\\]|^)(\\$)/, \"$1\\\\(\");\n",
+ " } else {\n",
+ " mystring = mystring.replace(/([^\\\\]|^)(\\$)/, \"$1\\\\)\");\n",
+ " }\n",
+ " count++;\n",
+ " }\n",
+ " loc = mystring.search(/([^\\\\]|^)(\\$)/);\n",
+ " loc2 = mystring.search(/([^\\\\]|^)(\\$\\$)/);\n",
+ " //console.log(mystring,\", loc:\",loc,\", loc2:\",loc2);\n",
+ " }\n",
+ "\n",
+ " //console.log(mystring);\n",
+ " return mystring;\n",
+ "}\n",
+ "\n",
+ "\n",
+ "function show_questions(json, mydiv) {\n",
+ " console.log('show_questions');\n",
+ " //var mydiv=document.getElementById(myid);\n",
+ " var shuffle_questions = mydiv.dataset.shufflequestions;\n",
+ " var num_questions = mydiv.dataset.numquestions;\n",
+ " var shuffle_answers = mydiv.dataset.shuffleanswers;\n",
+ "\n",
+ " if (num_questions > json.length) {\n",
+ " num_questions = json.length;\n",
+ " }\n",
+ "\n",
+ " var questions;\n",
+ " if ((num_questions < json.length) || (shuffle_questions == \"True\")) {\n",
+ " //console.log(num_questions+\",\"+json.length);\n",
+ " questions = getRandomSubarray(json, num_questions);\n",
+ " } else {\n",
+ " questions = json;\n",
+ " }\n",
+ "\n",
+ " //console.log(\"SQ: \"+shuffle_questions+\", NQ: \" + num_questions + \", SA: \", shuffle_answers);\n",
+ "\n",
+ " // Iterate over questions\n",
+ " questions.forEach((qa, index, array) => {\n",
+ " //console.log(qa.question); \n",
+ "\n",
+ " var id = makeid(8);\n",
+ " //console.log(id);\n",
+ "\n",
+ "\n",
+ " // Create Div to contain question and answers\n",
+ " var iDiv = document.createElement('div');\n",
+ " //iDiv.id = 'quizWrap' + id + index;\n",
+ " iDiv.id = 'quizWrap' + id;\n",
+ " iDiv.className = 'Quiz';\n",
+ " iDiv.setAttribute('data-qnum', index);\n",
+ " mydiv.appendChild(iDiv);\n",
+ " // iDiv.innerHTML=qa.question;\n",
+ "\n",
+ " var outerqDiv = document.createElement('div');\n",
+ " outerqDiv.id = \"OuterquizQn\" + id + index;\n",
+ "\n",
+ " iDiv.append(outerqDiv);\n",
+ "\n",
+ " // Create div to contain question part\n",
+ " var qDiv = document.createElement('div');\n",
+ " qDiv.id = \"quizQn\" + id + index;\n",
+ " //qDiv.textContent=qa.question;\n",
+ " qDiv.innerHTML = jaxify(qa.question);\n",
+ "\n",
+ " outerqDiv.append(qDiv);\n",
+ "\n",
+ " // Create div for code inside question\n",
+ " var codeDiv;\n",
+ " if (\"code\" in qa) {\n",
+ " codeDiv = document.createElement('div');\n",
+ " codeDiv.id = \"code\" + id + index;\n",
+ " codeDiv.className = \"QuizCode\";\n",
+ " var codePre = document.createElement('pre');\n",
+ " codeDiv.append(codePre);\n",
+ " var codeCode = document.createElement('code');\n",
+ " codePre.append(codeCode);\n",
+ " codeCode.innerHTML = qa.code;\n",
+ " outerqDiv.append(codeDiv);\n",
+ " //console.log(codeDiv);\n",
+ " }\n",
+ "\n",
+ "\n",
+ " // Create div to contain answer part\n",
+ " var aDiv = document.createElement('div');\n",
+ " aDiv.id = \"quizAns\" + id + index;\n",
+ " aDiv.className = 'Answer';\n",
+ " iDiv.append(aDiv);\n",
+ "\n",
+ " //console.log(qa.type);\n",
+ "\n",
+ " var num_correct;\n",
+ " if (qa.type == \"multiple_choice\") {\n",
+ " num_correct = make_mc(qa, shuffle_answers, outerqDiv, qDiv, aDiv, id);\n",
+ " } else if (qa.type == \"many_choice\") {\n",
+ " num_correct = make_mc(qa, shuffle_answers, outerqDiv, qDiv, aDiv, id);\n",
+ " } else if (qa.type == \"numeric\") {\n",
+ " //console.log(\"numeric\");\n",
+ " make_numeric(qa, outerqDiv, qDiv, aDiv, id);\n",
+ " }\n",
+ "\n",
+ "\n",
+ " //Make div for feedback\n",
+ " var fb = document.createElement(\"div\");\n",
+ " fb.id = \"fb\" + id;\n",
+ " //fb.style=\"font-size: 20px;text-align:center;\";\n",
+ " fb.className = \"Feedback\";\n",
+ " fb.setAttribute(\"data-answeredcorrect\", 0);\n",
+ " fb.setAttribute(\"data-numcorrect\", num_correct);\n",
+ " iDiv.append(fb);\n",
+ "\n",
+ "\n",
+ " });\n",
+ " var preserveResponses = mydiv.dataset.preserveresponses;\n",
+ " console.log(preserveResponses);\n",
+ " console.log(preserveResponses == \"true\");\n",
+ " if (preserveResponses == \"true\") {\n",
+ " console.log(preserveResponses);\n",
+ " // Create Div to contain record of answers\n",
+ " var iDiv = document.createElement('div');\n",
+ " iDiv.id = 'responses' + mydiv.id;\n",
+ " iDiv.className = 'JCResponses';\n",
+ " // Create a place to store responses as an empty array\n",
+ " iDiv.setAttribute('data-responses', '[]');\n",
+ "\n",
+ " // Dummy Text\n",
+ " iDiv.innerHTML=\"Select your answers and then follow the directions that will appear here.\"\n",
+ " //iDiv.className = 'Quiz';\n",
+ " mydiv.appendChild(iDiv);\n",
+ " }\n",
+ "//console.log(\"At end of show_questions\");\n",
+ " if (typeof MathJax != 'undefined') {\n",
+ " console.log(\"MathJax version\", MathJax.version);\n",
+ " var version = MathJax.version;\n",
+ " setTimeout(function(){\n",
+ " var version = MathJax.version;\n",
+ " console.log('After sleep, MathJax version', version);\n",
+ " if (version[0] == \"2\") {\n",
+ " MathJax.Hub.Queue([\"Typeset\", MathJax.Hub]);\n",
+ " } else if (version[0] == \"3\") {\n",
+ " MathJax.typeset([mydiv]);\n",
+ " }\n",
+ " }, 500);\n",
+ "if (typeof version == 'undefined') {\n",
+ " } else\n",
+ " {\n",
+ " if (version[0] == \"2\") {\n",
+ " MathJax.Hub.Queue([\"Typeset\", MathJax.Hub]);\n",
+ " } else if (version[0] == \"3\") {\n",
+ " MathJax.typeset([mydiv]);\n",
+ " } else {\n",
+ " console.log(\"MathJax not found\");\n",
+ " }\n",
+ " }\n",
+ " }\n",
+ " return false;\n",
+ "}\n",
+ "\n",
+ " {\n",
+ " show_questions(questionsMpQrXvjLDoZX, MpQrXvjLDoZX);\n",
+ " }\n",
+ " "
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "!pip install jupyterquiz==2.0.7 --quiet\n",
+ "from jupyterquiz import display_quiz\n",
+ "\n",
+ "display_quiz('quiz_files/quiz1.json')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "170813e7-81c7-4224-9539-bcc8c36f9961",
+ "metadata": {},
+ "source": [
+ "# Prerequisites\n",
+ "\n",
+ "***Data Sources***\n",
+ "\n",
+ "- [COVID cases data (California Health and Human Services Agency)](https://data.chhs.ca.gov/dataset/covid-19-time-series-metrics-by-county-and-state/resource/046cdd2b-31e5-4d34-9ed3-b48cdbc4be7a)\n",
+ "- [COVID vaccination data (Los Angeles Times)](https://github.com/datadesk/california-coronavirus-data)\n",
+ "- [Unemployment data (California Employment Development Dept.)](https://data.edd.ca.gov/Labor-Force-and-Unemployment-Rates/Local-Area-Unemployment-StatisticsdecisionLAUS-/e6gw-gvii)\n",
+ "- [Election data (Harvard University)](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ)\n",
+ "\n",
+ "***Libraries/Packages***\n",
+ "\n",
+ "- Pandas\n",
+ "- NumPy\n",
+ "- Matplotlib\n",
+ "- Seaborn\n",
+ "- Scikit-learn"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ad938220-3e5a-455f-a725-a8329672a0d8",
+ "metadata": {},
+ "source": [
+ "*** \\* If the libraries/packages have not been installed:***\n",
+ "\n",
+ "Use `!pip install PACKAGE` or `!conda install -c anaconda -y PACKAGE`,\n",
+ "replacing `PACKAGE` with the name of the library or package you are trying to install."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "8e171f43-88c8-432e-8e57-f9f918f1beea",
+ "metadata": {},
+ "source": [
+ "# Get Started\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "70bbc537-63d0-44f9-8705-0f0fb3dc494d",
+ "metadata": {},
+ "source": [
+ "### Step 1) Importing necessary packages into the notebook\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "7e8257e4-ef07-4b9b-bd9b-d4afd3f91548",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDBoYFhwaGRoeHRsfIjAmIyIiIygnMCcyLjIyMy0tLy81PVBCNzhLOS0tRWFFS1NWW1xbNUFlbWRYbFBZW1cBERISGRYYLRoaMFc3NTZXV11XV1dXV1dXV1dXV1djV1dXV1dXV1dXV1dXWF9bV1dXV1dXV1dXV1dXV1dXY1dXV//AABEIAWgB4AMBIgACEQEDEQH/xAAbAAEAAgMBAQAAAAAAAAAAAAAAAQIDBgcEBf/EAEoQAAEDAQQDCQoOAgICAwEAAAABAhEDBBIhMUFRYQUHEyJTcZGS0RQWMmOBk6Gx4fAGFRcjMzVSVGJyc7LB0kJDgvEkoqPC8jT/xAAYAQEBAQEBAAAAAAAAAAAAAAAAAQIEA//EACERAQEBAQACAgIDAQAAAAAAAAARAQISIQMxQVEiMtEE/9oADAMBAAIRAxEAPwDn4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAJjanpEbU9IEAmNqekRtT0gQCY2p6RG1PSBAJjanpEbU9IEAmNqekRtT0gQD6u4/wfr21HrRuQyEW86M5jRsPpd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7d4rr+wd4du8V1/YBrANn7w7d4rr+wd4du8V1/YBrANn7wrd4rr+wd4Vu8V1/YBrANn7wrd4rr+wd4Vu8V1/YBrANn7w7d4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANn7w7b4rr+wd4dt8V1/YBrANob8Aras40Uj8a9hb5P7b9qj117ANVBtD/AIBW1sStHH8a6PIfNt+4LrOiLVr2dFX/AAR7ld0I3Anllg+SD6G5W5D7W5WU6lJHaGvddV3NhifX7w7bKJNHH8a9g3cwawDZ0+Ads+1R669h590/gfarLQdXqrSuMibrlVcVRE0bRcHwATd2p6RG1PSUQCY2p6RG1PSBAJjanpEbU9IEAmNqekRtT0gQAAAAAAAAAAAAA37e08C0/mZ6nG5JM/5ac4jZtNS3rk+btX5mepxu1nermy5isXUsGdyjyotTU3pUlFfpRMs5PY1ZVeLEGKjUc5zkVqIibF1rhikaJw1mxgVXzgiRziX6mp5TNVrKlRjUYqo6ZciYNjXzlXV1RfAXPQEY5fjgmzEs29pjyFlrr9h3QVS0umODdzx0AGXoS9E6YyKtWpCyjdn8k91LyT+hCXWpUzpuzhMMyT2qk1NTelSZfhg3b7CXWtU/1vnVCFltCzhTd0IVGNOE/CS2/pgycOv2HRzDhlzuL/OgDHL9TS0uurle0aizqrtDJQhLQ5f9a7NoGWnN1Yi8SxakYokz6P8Asw90PT/BffyE90P+wpJ7VmR1SMkRechVqYwjVxwxMfdDo8BxHdLp8BwGaamODdntK/O/hMSWh/JqT3Q7DiO2gZZqfhLvV2F2M8Z1dp51ru0MXykrXdE3XfyBavMrGZipzdS9F6MYJ7od9h0jh3cm7oKiQXp1JnCI2Zl5AwiCzaiq5UVsJoWCEqrfu3cNYEKSOFXhLt3CJvC21nU2XmtvLCrEKswkomGsCqNhZDmykKebdi3VKNNjmI1FdmrkwTCY8p6rHXc+ztqOZDlbMe+sT1WM7zevD8vE6s5LRcTSiKRVtrmKt7S
uEbMy9dItk6LiJ6/YeHdF6cKjtDWqvOqrCeo4vk3c3Zq7rHuzuutKlVrNiWMutn7S5elTV/gzuAlsa6tWl7lVc/SvSZPhHVVaFxHXYeiqir4a49qL5DbPg1YlZY6D21HQrEW5CRLsZyn0nv8A83P8fLfyZ16aB8JdwVsbmuaiok/9Kb38GrW+tYab3uvvaicbWmpdqZLzGL4S7nrWslSo+q5ODavFwuqqZLlJ8He9qvbw6SisWFicUVNm2V6FN/PznjWs1udJ2KbX4ev+D4/w3rTufWTWjfRUb2n02fSNVMpn+D4fw2cncD0iFlJ8rmqcXHXW7m7+/wDCuagA71AAAAAAAAAAAAAAAAAAAAAHRN65Pm7V+Znqcbqtop/bbnGaGlb16fNWpPxM9Tja+4Ho1Gte1LrbqSycOnM8++us/rla5zN+3srVWsarnLCduCFVtDdaxCLPPkUe1GU2suLUREiM8ETT0EOVtxruCnRdupKJzeQ3jK7bTTXJyLp7PWhTuunhisLMLGGCx/ApI24q8EqaFbGenIU3NesLSVMM3NSOnnKJW10pu3sebPRhrJS007quRZRM8NeRjvtauFF0zCKjE0YFnQ1Gu4Ncc0RMudEzAhbbTicYmEWM/eR3dSiZWFSUWM8+wjhm4rwD8vspMGR6o1EclOZTGESUAhlpY5yNSZVJy0BbVTRWoqxeRFTDOZj1E0XIsxTcyMcrs9BjZUa5UatF6T9pJT/oCyWykqxe9GXoLMtFNzkajuMsqic2ZV7qaOVFppzwnvpKpXpp/rjmanMBfuqlE3jJTe1zbyLhzGBtWkiSjM5whNMT77DLQqNXitbdRNiIBkbCpKE3SUSCQK3RdLACt0XSwArdF0sAK3RdLEARCC6fO3O3FZZkejHuVHNamKNTwZ1Ik5nr7l8ZU6wGa6Lpjo0Lizfc7D/JZMdqsjqj2uSq5iN0IiY4osr0IB6LpMGCrZbzlW+9qr9lYLpS4itvOWZxVcUkC6pJ5m7o0VqcElRL8xG3VOUjc2w9z01ZfV8umV0YImCeSedVU+LS3BqpWbKpca+9enFUmYiM8DWZm/bx+XvrmeOVk3VtEWtW6mNd03k/g+Tbay1KsN1J06PWef4aWhtO38Z7mzRb4Lo0vn+Og174wooquV1R2UQ/HTOnanQcfXxbvW7+2uud19ndKsymiuVzVliJdWMbyL6ZPofBDdnhaK2WpLVpQrXJpauUpsNLqWmyuWXNqq6c5Rde3mPbuTupZ6FZKjUqIqa3JDkxzOj4ePDJq5zGyfDDdTgrOtCnedwixeXLb6j5fwW3WpMoLZ2tu1nPlzljFNnMfH3Y3cqWysi1Fa1jfBa3JNu1T5VZUvShv5s8+dwzHX6Vvp3lZpa3BdZr/wANKzXWSpC4y31tNa3J3XhLtV+UXVX1GTdjdGnVoOa16KuGHlTsOPc68s5n1p7sa6ADrbAAAAAAAAAAAAAAAAAAAAAHQ9676K1fmZ6nG8oqLiiymxTRt69JpWpPxM9TjeG00aiI3BAD3xODlhfeClKveWLr0wzVIQ1/4X7rV7JTpLRfdVz3IstauCJtQ1fvwt/LJ5tnYWDpT6kIq3XLC9PMKNS/PFckfawOa9+Fv5ZPNs7B332/lk82zsESunQIOY999v5ZPNs7B332/lk82zsEK6dAg5j332/lk82zsHffb+WTzbOwkK6dAg5j332/lk82zsHffb+WTzbOwQrp0CDmPffb+WTzbOwd+Fv5ZPNs7Cwrp0CDmPffb+WTzbOwd99v5ZPNs7BCunQIOY999v5ZPNs7B332/lk82zsEK6dAg5j332/lk82zsHffb+WTzbOwQrp0CDmPffb+WTzbOwd99v5ZPNs7BFdOgQcx777fyyebZ2Dvvt/LJ5tnYIOnQIOY999v5ZPNs7B332/lk82zsESuk1aio1VTCNLpjPEwttPFVznshFSVRVw1nPF+F1u01m+bZ2H1Pg9urbbY+oxbQjEYy9hRprO
OWgkG6tqS281ZRUwXEyaYhcs9HMfGdYrWjUVbauMYdzU9PMS6yWxHI3u52OnuenAV9SvVuNmHOVcEamaquj2kPrRgrXKuzE+PRpW11RzFtjmxOK0GQsKifyTwNr4PhO7luzhNnppO3mA+zTqo5VSHIqY4llWIwVZ1aOc+ZRqVqDEdXrOr33ta27Ta27M44ZofSXOJXLZo/wCwOY75X1gz9Fv7nGpG174//wDe3T8y39zjVAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAOh716TStULC3m4+RxvDEVERHLK44mjb2LrtG1rqcz1ON5p1L2MKkYY+TtA1X4d2WpVpUEpU31FR7puNV0YaYNN+KbV92r+af2HWala5HFc6VXwUmCr7ZGVN6yiLlrLUjlPxTavu1fzT+wfFNq+7V/NP7DrHdXGu3X5xMYadPkIZa5VEuPRV/DgnlFI5R8U2r7tX80/sHxTavu1fzT+w6t3ZlxHwqavWO6/wP6BSOU/FNq+7V/NP7B8U2r7tX80/sOsOtMZsdEIuCTnrIda0SOI/GNGtJFI5R8U2r7tX80/sHxTavu1fzT+w6s+2XVxY/oLOtSIqpddgmr31ikcn+KbV92r+af2D4ptX3av5p/YdW7s/A/oDrXGdN8RoTaqfwKRyn4ptX3av5p/YPim1fdq/mn9h1d1qhVS4/OJghLaiqnFek5SnlFI5T8U2r7tX80/sHxTavu1fzT+w6l8ZNmODq9Tm7R8YtlESnVWYyblKTjqFI5b8U2r7tX80/sHxTavu1fzT+w6o62qj0ZwT8ViYwziStPdCXInBVExzVMNPv0ikct+KbV92r+af2D4ptX3av5p/YdQ+NWxPB1YWP8ADXEadpmo2y8ifNvSVjFMsvRj6BSOU/FNq+7V/NP7B8U2r7tX80/sOrd2J9ip1Q61wsXH6MYFI5T8U2r7tX80/sHxTavu1fzT+w6qttwng34zGGOEdvoLd1JE3H5xEYikco+KbV92r+af2F6VgttNZp0bSxVwljajV6UOqLac+I9Y2ZhLWkTcfnGQpHMOC3S1W7prDgt0tVu6ax01bZhNx/QelFlBSOU8Fulqt3TWHBbo6rd/8x1YkUan8CKdoiv3SlfNt3hb/wCKYveQ2ng01FwRXLN8j6wb+in7nmpm2b5H1g39FP3PNTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA6HvXfRWr8zPU43s0PexYjqNrauSq1PQ43mlSRmCacdH8AVfVup4KriuWgq61IiTcf1SzqqtTwVdiuWgq61Qk3Hrsu4gQtqhfo3xE5EraoWLj8tCELalRfo3xE5EralRY4N+WhAC2nKGPx2bYxIW1x/hUjm9pK2ldFN/RtghbXGPB1MpyAllplUS49PIVS2p9ip1SzLTKolx6eQq22zjwdRP8AiAda1wim5VifYT3Z4up0J2hbUuHzb8kyTLCcR3X4t/QBL7WiLCsfz3cCUtH4H4pq58PR6SHWlU/1uVNEY6JxC2pYReDfqyyy7QIW1+LfPN/JK2rCbj1zyScpT+CHWuG3lY5MYx9ZjTdFJi6vPKaizRk7r8XU6AtriOI/FEXBNZjTdJq/4u9Hae0bkHlfbUbMseiJsTtLOtUKqXH4bMzOSQeZLVP+t+U5bSzbTP8Ag9MFXFIyM4A8SbpM+y6dWBLd0WLocmCrjGjynsBq4PGu6LJiHLnlGjyhN0Wqi4OwSdGJ6wLg8fxmzDB3oPRZ66VG3kw0GUgmwSACAAAAAA5XvkfWDf0k/c41Q2vfI+sG/pJ+5xqgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHRN636O1fmZ6nG9Gi71v0dq/Mz1ON4eqokokqiLhr2AVVr9Ctjai9oiprZ0L2nnttpqsSmrKcq5eM2FcqbMME59mk81LdeormNdZ3orlamnTF5USMknSB9GKmtvQvaIqa29C9p5KlrqotRLqKqI9WpC43WtVq7cXKhhqbrVG52d
2KqiYuxiPw6dHMB9GKmtvQvaIqa29C9p4V3Rqqi3bO5IVJvXslciYIjcVRFXZhmpVm6771Nr6Dmq9YRVVdcZRnpjVpA+hFTW3oXtEVNbehe08C7qVEWO5n6ccYyRU0beySV3UqSv/AI1TBc8ccJ1e/oA90VNbehe0RU1t6F7T57t13pKrZ3ojW3lVVVEi6rlzbnhHlJrbqvasJQc5UpteqJKql5VwwTZ7NIHviprb0L2iKmtvQvafPdupVvJ/477sO0OwVIhFw045Sm0pT3ac511LO+Uuyk4tvIvhJGGXp0ZAfTiprb0L2iKmtnQvaeJ26FSKapQct9FVyY8WNsaYwwxlCKe6FZzHTQVr2tRcnKk4ThCKsSuCZwuWEh7oqa2dC9oiprb0L2nz27qVYSbPUVcMYVJw1YxOSY88ZlLPuw+o7Cg5WXkbhKwqyrlVY0YJGvSB9OKmtvQvaIqa29C9p4XW+q1XJwSu491ERrkhJVEWcnSkLhCIUs+6dZ1RjVoORFSHKrXJjxcUwy8LPpA+jFTWzoXtEVNbOhe0ygDFFTWzoXtEVNbOhe0ygDFFTWzoXtEVNbOhe0ygDFFTWzoXtEVNbOhe0ygDFFTWzoXtEVNbOhe0ygDFFTWzoXtEVNbOhe0ygDFFTWzoXtEVNbehe0ygDle+Qn/ntnPgW/ucaobbvlfWDf0W+txqQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHRN636O1fmZ6nG9KaLvW/R2r8zPU43pQMda0NZF5YmfQVS1U1hL2KrCYKZXNRc0RecwVLRSa29gqI67gk45RzgT3ZTzvIWS1U1/yyx0nn+MbP8AbbpxjVE+tCzLXZ3Owc1VhdGjNfUBm7pZE3kjsDrTTRcXImnGdJTuiiqTLVTDRry9QW10daL5F7AJS2U8eNlnsJW100RFV6QuRVlopKt1I6OaE9Q7po5SnQvYBbuqmv8AlguxffSO66c+F6FKd1UdbYzmOZOwJaKLl0KqToywxAu62U0RXXsOZfQEtdOYnHmXErw9KUbhK5JAbaqToxSVmEVOkCe7aWHHTGPTkWW1U0iXJjihiWvRlJidHFXsLLXpJnGzDbEdIFltdNM3RhOM6cg21MV11Fx5l0leGpYLxdSLGrQGWmkvgwsJOCaE0oBZLXTWONnlgoS2U4m96FKJaaKzllK4ZImsnuilCrgqZLh5QLOtdNsorojYvo1k90slUvJKTPkzKJaKTtXlbs5h3VR1p0L2AWS101njYJpCWymuTp5kUqlopTCKkrGEa8glelE4Rlg3m7UAu61U0zciDulmPGyx6f8AtDGtpo629Bda1OVTDDPD31egA2101WEdjzKT3VTw4yY5Z4wVbaKU4KkxOXSV7qozdlNOjpQC/dVOFW8kTHlXINtdNUm9htRSFtFJM1RPIV7qo6015AXda6aIiq7BdOPMZKdVr0lqymRjdVpIqot3iwmWU4p6irbXRRMHIiLjMQmoD0gx0qzXzdWYwUyAct3yvrBv6LfW41I23fK+sG/ot/c41IAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADom9b9HavzM9TjelNF3rfo7V+Znqcb1pAkpUcqJKJOJcwVKlS7LaeN6IVUy1gY2vW7PAxsw7CjXq3BtnhNER5SO660YWdU1S5DJTtNVVhaCtTHG8i6F/6Al1Vyf6ZTDJU98Bwjowo4zC4psy99BZtd6pjSVFwwva8+gqtoqaKK+V3sAqtV2iipKvWGxR0rqwjIltepeRFpKiZTJLq9SMKSzj/lz+/lAqtR2HzMpH882oMeqJhRiMETDLTo2GRar5WGYJljmUSvU5JekCFrO0UV2ZFkqOieCx99gSvU5L/29hKVn8muU56YyAq2qqtngoVFiF5uYjhn8gvShd1Z+inPlj+CvdFTkV6fYBLqrpjglXoKpUcuVGMFXHmy/gs6vUT/AFThPhadQdXqJHzSqqpOfSmQFeFdoo+/QWbVVVRFpKi
TmOGqT9Hhz7J/glK1SY4JdON5AM11NSEcG2ZupK6YMK16kIvBKutJDa1RXIi0oTXOW0DPcTUnQLqakPPw9XksOfEnhqnJa/8ALVMaNOAGe4mpOgm6mpDzJaai/wCpen2Fn1amMU8lwxzAzXE1J0C4mpOgwNrvhZpLKJgk54htapMLS14z0AZ7qakFxNSdB51r1eS/9vYFtFSPoV63sA9FxMMEwywFxNSdBhbWeqoi04Rc1nI9AFWMRqQiIibCwAHLd8r6wZ+i39zjUjbd8r6wZ+i39zjUgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAOib1v0dq/Mz1ON6XM0Xet+jtX5mepxvWkCSlVHKnFWF1xJWtTc6LrlaqFODqwnziT+VNYFeCrL/ALETyFm06ul6L5NmAbSqJE1Jx1JlqDaVRF+klNUIAbTqxi9FXDR0kJSq6aieROb398IWjV5X/wBULJSqyk1NP2UAOp1YSHomvDn9hV1KsuVRE8hbg6kfSafsplhh6yjaVaMaqT+VPXAF3U6sJD0mMcPSHU6mMPTPDBMCEpVNNT0JrMdWzVnKsVrqL+Hm9oGRtKrGNRJ0LHPoJSnVx46aI4uWvnMlFrkaiPdedpWInyGQDzqyrhx0zWcOiCG06uE1E0SkdJ6QB57lXHjoucJEcxRadaPpGzGUHrIAwOp1cYqJnKYaMcPV0EOp1c0emWrTpPSAPLwVblE6Cy06sIl9EWcVjP3xPQAIbMJOZIAAgkAAABBIAAAAAABy3fK+sGfot/c41I23fK+sGfot/c41IAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADom9b9HavzM9TjetJou9b9HavzM9TjetIFKjXKqXXQiZpGZibSqpPziLhqTMm1UqjlatOpciZwmco99pSjSqtRyPq3p8FbqJGfT7ALcHW5ROqQ2lVhZqJOEYZa+cs1lSFRXoupYjpIRlWfDbHN/IBtOrhNRNvF6SblXHjps4uWrnJVtTjKjkyW6midElLlbDjtyx4s69vMBKUqvKJMzloxwCUqsLNRJ2ImvsLPbUnivRE1KkkOZUhsPSUTHDMBwdXH5xNnF0z2B1OroeiYahdqXUS+29OKxhHMFbUw4zc1lY0aAHB1ZTjpGnDPX77S1x8Lx8ZwwyMasqx9I3q+0m5Vu+G1XTOWERl0gWRlSPDSZ1YImohtOrpekxq9JaHTi5IjRnOsqjKiNXjorlyVUwTHVzAQtKroqaNKJrX38hK06krFTTpRNfYSjXwsuTRGEc5RKdblGzzbNCc4EvpVVSEqIixise+wm5VhEvpMrjGaaA9lWOK9qLCYqk46SrKdbTUavM2P5AlaVWVipgq6kwTUWuVJXjpGMYdAVlRWxeRHa0T+CEbVh0ubMcVY07UAcHVvIt9I0pHv7+iODq8onRgTwdX7adHP7CXsqSsORE0JH8gUbSqys1EWdnN/Eksp1kzqIv/Hm9pe7U4vGTRew6YK3Kv2m56tGj+AIWlVmeE8kYGamionGWVlcTExlRHYvRW6o/kqjK2l7dGhUx0geoHl4OtyjctDdJdjak4uaqTkiAZwefg6l1eOl5csME8haiyoirfcipohI99IGYAAct3yvrBn6Lf3ONSNt3yvrBn6Lf3ONSAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA6JvW/R2n8zPU43pczRd636O1fmZ6nG9LmBjrUUfEqqRqVUKpZmxGOaLnqPHu1up3Kxrkajlc6OM661IRVxXyYHopWu/TpPhU4S6sLolJxAyNsrUmFdikeEoWzNVETGE2qU7qT7L8PwlnWhEVcHYamqKVLLK1FmXTEYuVQlmaiQk4rK44rhBRtqRckdqy5+wLa23UdxlRZyScljIUqzbI1Ecku4yQvGUMsrWrKK7yuVSO6UiYd0GWQMdSyNcqriirpRVT3yDrK1c1d4MeEvTz7TLIkDG+ytdne6VD7K1VVVVcY04YbPIZJEgYe4248Z2Om
Z1diFnWVqqiqrsIwvKmU9pkkSBjWytVETGEWYnXiHWVquvYzM4KqauxDJIkDFUsjXZq7yOVDM1sJBEiQLgpIkC4KSJAuCkiQLgxyTIFwUkSBcFJEgXBSRIHMN8r6wZ+i39zjUjbd8r6wZ+i39zjUgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAOib130dq/Mz1ON6NF3rfo7V+Znqcb0BSo1FwVEVNSpJitLobN+5tidCmdUIhAPEyrKfTzzN2kV66Nes2hGYxdupsg9qMRMkQKxFzhS77XXjY+61VWvM4NVWpguflwLuVVcjEqw5ExS7iuS+o9NxNhN1JnSEQCYEEEAmEEIBAJhBCAQCYEAQCYQQgEAmEEIBAJhBCAQCYQQgEAmEEIBAJgQBAJgQBAJgQBAJhBCAcw3yvrBn6Lf3ONSNt3yvrBn6Lf3ONSAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA6JvW/R2n8zPU43rSaLvW/R2n8zPU43rSBR8ymrSYnOVEbLkTHHTOz2mZxSoixhHlAwNeqzFViokasOcNrJpqsiMNGrHamPpLIx3GljMctvOY+CfydKf+tnP6BDcX4Rc+EZd5v5nUQlVZT5xios886C7mOX/FmfuvOVZTcjV4jEXCI/kTCYq6oqZ1mJEpjCGZzXqiQ5EXXEpsMV16pxmU/WZFWpewRt3DXO0bhBG1NLk8iGVMsczBeq6mdK7PaJq6mL5V99Qgzg87XVpRFazaqKpa9V+yzpUDMAAAAAAAAAAAAAAAAAAAAAAAAAAOY75X1gz9Fv7nGpG275P1gz9Fv7nGpAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAdE3rfo7V+Znqcb0aLvW/R2r8zPU43pQKudCwRwiYYpjltDkx5jG9sI2GosLhOjPLaBkSq3WnSEqJhimOKHldTXjLwDVVMuMmPowI46pC0Ey+0i6sPX0D2e3q4RM5TpJSomw8asdh8w2dCSm3T0GR0tdxaSLtlE99Iy6e9Z0qt1p0kpURclTpPKlKEVeBbMYJKYrqI4yz8ykLnKpjn/PrB7evhG606RwiZyhgRqpdTgkxxdCpgv8AJRtOXY0Wo2YmUnTj6ukJ7erhW606QlRFyhSnAtmbqShLKbW5IiAXkSQAqbwkgATIkgATIkgATIkgATIvEACZEkACZEkACZEkACZEkADmO+V9YM/Rb+5xqRtu+V9YM/Rb+5xqQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHRN636O1fmZ6nG9KaJvW/R2r8zPU43p7kRJVURESVVdABUMdWijkhctiwUr21lNGKqyj8lbihSnunQcqIlVsrGGnjZSmjMCzbG1EjFedVUh9ia5XKqu4yznEYaOgLugxL83kuI5VlPsoir6FkvUttJqqjqjUVImdEx2p0oXdq1VLGzUvSume0z3Tys3VoKqpfRIVElclnKNhZ+6NJGtdKqxyql5MkuoqrM5QjVXyER6LounndulRRYWommdkRn0hu6dBf8AY1F1AeiBdPNU3Totc5rnxdiedZwTWvFUv3fRuq7hERGxenCJynUBmui6edu6NJXNa16OVyxhohFdj5EIXdShdV3CIqbNMzHTCgem6IPMzdOgv+xMpTakTPQpk7upcXjpxstuMevADLdF086bpUVmKjVW6roRcYTNfQvQQm6dLiqroR16FVI8BUa70r6FA9N0XTzLupQR13hEyVfImfrTpMj7bSb4T0TCfIoGW6Lp53bp0EzqsTGMVgy0LSyorkYs3YRcFTNJ0gXui6XAFLoulwBS6LpcAUui6XAFLoulwBS6LpcAct3yvrBn6Lf3ONSNt3yvrBn6Lf3ONSAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA6HvW/R2r8zPU43tTQ969YpWpYnFmHkcb01Z0QBWrZ2Pi81Fu5SmXvCdBhTc2ijmqlNEu4pqlIhY8iGWtQR8SqoqZQse+RRbG2IVz1xnF3qAl
LDTx4uDr0poW94WG0q7c6iudNq8+JKWNIi+/OZvZ4R/AWxpM3npzOzwgCG7nUUypNTyashS3PpNppTu3moqrxsfClF9DlTylu5EhEvvwSPCzDrIiqvGck6EUCF3Poyq8G2VmcM5z6SlTcui5E4iJCzhgZHWVFREvvwSMHR0krZkvXrz85icM5Aq/c+i5yuWm1VVZmNOfrI+LaF1W8E2FxVIzie1eksllRFRbz8FnwvWV7iT7dTrAWbYaSORyU2oqZLGyPVgVdubQWJpNwiMNWRZtkRFm8/D8Q7kSFRXOWdazELIFV3OowqcG1JSME1kJubRutarEcjEhL2K689ck9xJPhv63lDrEl1ER70jbrjPXkBZtipJlTanFVuWhZVU9K9IdYaSoiLTaqJMYfaWV6VJo2dGLKOcvOsmcDy/F9Hkm6dGvMs6w0lzptyRMtCZIZwBgWxUlwWm2ObZHqMlKgxk3Wok57YMhAEggASAAAIJAAEASAQBIAA5bvlfWDP0W/ucakbbvlfWDP0W/ucakAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAB0Heyfdo2tUSVRWYeRxvFnqq9JVqtxyU4/uB8Ja+56VEotpu4RUVb6OXKYiFTWfW+Ua28lZ+q/wDuB0e1sYt1Xq5ImFSdPMYlZR8NXqqRdzlMf/1mc9XfFti50rMv/B/9yvyg2rkLL1H/ANwOiNZRR0JUWbqp4WSRiqEoylH0joR04uXQkdBzpN8G1JlQsqf8H/3HyhWvkLL1H/3A6IvAwnzi4JHhL6SrmUlbKVXJtny5eU558oFqmeAss/kf/cn5QrXyFl6j/wC4HRLtGVW+qXpwmPC2eUMSjj84qykYrOer30HO/lCtfIWXqP8A7kJvg2rkLL1H/wBwOivZSurx3IiKswu2VQU0pZpVcuGl2yP5Od/KFa+QsuvwH/3Cb4VrTKhZeo/+4HRGrR0Py25zpJppSTJ6wuGeGGPMc6+UK18hZeo/+4+UK1chZeo/+4HRHMpJM1H4LiiKq4qsaE14FHV7OqRwuSZouOO0598oVr5Cy4/gfz/bK9/1p+72THP5t39gOiNqUWw5Ky5xMynl6S1lZSeio2or+LC4zguEx5FOcJ8PLQk/+PZMVlfm35xH29RanvgWpuLaFlbzU3p/9gOldxtxxXHMllkaiKiTikLl2HN/lHtvJWfqv/uPlHtvJWfqv/uB0hLI3HjPxjGccCHWNqoiKrsNvP2nOPlHtvJWfqv/ALj5R7byVn6r/wC4HR+42zMuTLJYySCe5GxF5+SJnqWZ5zm/yj23krP1X/3Hyj23krP1X/3A6OljSIVzlSdfN2EtsiIs3nqsKmK6zm/yj23krP1X/wBx8o9t5Kz9V/8AcDpHcjYiXJzL5SO4m/af0nOPlHtvJWfqv/uPlHtvJWfqv/uB0dbG26jbz4SdOvAdxNu3ZdHPzZdBzj5R7byVn6r/AO4+Ue28lZ+q/wDuB0hbI2ESXJCRgse+ZelQRirCqs61k5p8o9t5Kz9V/wDcfKPbeSs/Vf8A3A6gQcw+Ue28lZ+q/wDuYvlBt/ieovaBffK+sGfot/c41I9+7O69W21Uq1rt5Go3ipCQiqv8ngAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD//2Q==",
+ "text/html": [
+ "\n",
+ " \n",
+ " "
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "execution_count": 7,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('jPIQbpdTkbM', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c7d3fb42-e3f7-4b3b-a14e-c100c4382491",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "Before working on our model, we need to import the packages and specific functions that we will use to work with our data. \n",
+ "\n",
+ "- **Packages** are collections of prewritten code that others have made, usually organized into smaller units called modules. A package can contain many modules, and each module may contain several functions. \n",
+ "\n",
+ "- **Functions** are sets of instructions that tell the computer how to perform a task, such as how to handle different types of files, which mathematical equations to use to fit our model, or how to display our graphs. \n",
+ "\n",
+ "The code in this notebook is organized in **cells**.\n",
+ "\n",
+ "In the example below we will learn how to execute or \"run\" each of the three cells, so that our code actually takes effect. To run the code in a cell, select the cell and press the \"play\" button on the upper part of the notebook menu. \n",
+ "\n",
+ "**Note**: The lines of green text that are preceded by a \"#\" are called comments. They exist only to explain what each line or chunk of code does and are not actually part of the code."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "5ddada1e-aae9-4b22-87ae-022e0b96a053",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Data Wrangling Imports\n",
+ "import pandas as pd\n",
+ "import numpy as np"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "a24d2226-35ca-451b-85d4-757c93eab78a",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Machine Learning Models Imports\n",
+ "from sklearn import tree\n",
+ "from sklearn.tree import DecisionTreeRegressor "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "6b29cad9-0cb2-40c8-959a-177b86019a3f",
+ "metadata": {
+ "scrolled": true,
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Model Evaluation Imports and Visualization\n",
+ "from matplotlib import pyplot as plt\n",
+ "!pip install graphviz\n",
+ "!conda install -c anaconda graphviz -y\n",
+ "import graphviz\n",
+ "# Quantitative metrics of Model performance\n",
+ "from sklearn.metrics import mean_squared_error"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ba92c612-e851-4c76-8551-4bf12daea579",
+ "metadata": {},
+ "source": [
+ "### Step 2) Loading training data and making sure it looks correct"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "4f7561be-fa34-479e-95d1-62ef4bcdfc29",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('z9dcLYg65uk', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "002bea86-95cb-4076-b05a-aa34bcbceb15",
+ "metadata": {},
+ "source": [
+ "Now that we have our tools, we can examine our dataset again. \n",
+ "\n",
+ "Recall that we are missing the last 18 values in the column \"cases per 100,000\", but we still have a big chunk of complete data (40 rows). This chunk of complete information is often referred to as **training data**.\n",
+ "\n",
+ "![Training-Data.jpg](images/Training-Data.jpg)\n",
+ "\n",
+ "**Training data** is a machine learning term that refers to the dataset used to teach our Decision Tree to make the predictions for our missing values using available data."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "874fc63a-d033-410b-a080-adce8dd7185c",
+ "metadata": {},
+ "source": [
+ "**A)** Let's start by loading our training data into the notebook:\n",
+ "\n",
+ "The data that you need for these lessons are stored in a Google Cloud Storage bucket. Before you begin, these files need to be copied from the bucket to your notebook using the `gsutil` utility. For more information, see [NIH CloudLab's documentation](https://scan.cloud.nih.gov/resources/cloudlab/google-cloud-jumpstart/#cli)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "3ea8f894-d160-40e1-b6a4-6f837a7ca963",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Download the data from Google Cloud Storage with gsutil\n",
+ "!gsutil cp gs://nigms-sandbox/nosi-sfsu/data/* ."
+ ]
+ },
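+ {
+ "cell_type": "markdown",
+ "id": "b7e2a1c4-3f5d-4a68-9c10-2d4e6f8a0b11",
+ "metadata": {},
+ "source": [
+ "As a minimal sketch (an optional check, not part of the original lesson), the cell below verifies that the copy step left the CSV files in the current folder. The file names are the ones passed to `pd.read_csv` later in this notebook; the variable names `expected_files` and `missing_files` are our own."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c8f3b2d5-4a6e-4b79-8d21-3e5f7a9b1c22",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "from pathlib import Path\n",
+ "\n",
+ "# File names that later cells in this notebook read with pd.read_csv\n",
+ "expected_files = [\n",
+ "    \"S2020_training.csv\",\n",
+ "    \"S2020_test_features.csv\",\n",
+ "    \"S2020_test_labels.csv\",\n",
+ "    \"S2021_test_features.csv\",\n",
+ "    \"S2021_test_labels.csv\",\n",
+ "]\n",
+ "\n",
+ "# Collect the files that are not in the current working directory\n",
+ "missing_files = [name for name in expected_files if not Path(name).exists()]\n",
+ "\n",
+ "if missing_files:\n",
+ "    print(\"Missing files, try re-running the gsutil cell above:\", missing_files)\n",
+ "else:\n",
+ "    print(\"All expected data files were found.\")"
+ ]
+ },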
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "caab38bd-bbb1-478b-b17c-8bd15bd07802",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This opens the file that contains the training data, data used to train the algorithm \n",
+ "S2020_training = pd.read_csv(\"S2020_training.csv\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "cbaf2e30-2eaf-4a21-9932-99d7bf8c22ba",
+ "metadata": {},
+ "source": [
+ "**B)** Make sure that your dataset loaded correctly. It should contain the county names and all the data highlighted in green in our last picture:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "3eabc08f-2002-4adb-b380-08c26ef968a6",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This will display the entire dataset \n",
+ "S2020_training"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d2eb357a-859b-446d-bfed-514d0c20f819",
+ "metadata": {},
+ "source": [
+ "**C)** If your dataset is too big to be displayed in full, you can peek at just its first 5 rows."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "1d1b826c-2346-4452-80db-cc34a731856c",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This will display only the first 5 rows of our dataset\n",
+ "S2020_training.head()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c1fafa09-54f7-422b-9f77-07d816beb038",
+ "metadata": {},
+ "source": [
+ "**D)** Here we can see how many rows and columns the complete dataset actually has. In our example we should have (40 rows, 11 columns)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "af0570f1-4334-4be4-90fc-08aa4c425cb0",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This will display the number of rows (not including the column titles) and the number of columns of our dataset\n",
+ "S2020_training.shape"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "5e1e56f1-3454-4d23-8654-44d4a0d29f81",
+ "metadata": {},
+ "source": [
+ "### Step 3) Separate the training dataset into features and labels"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "b261c100-55d5-4d4d-b052-16d033395e30",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('qh8C0QRECWU', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "435f149e-a9fe-4b6b-be5d-3033fb6d84d0",
+ "metadata": {},
+ "source": [
+ "Recall that a Decision Tree is a **supervised** machine learning model, therefore we need to specify clearly what we are trying to predict.\n",
+ "\n",
+ "To do this we need to divide the training data into **labels** and **features**\n",
+ "\n",
+ "![Label-and-Features.jpg](images/Label-and-Features.jpg)\n",
+ "\n",
+ "- The RED outlined column is called a **LABEL**. This is a machine learning term that refers to the data that our model will learn to predict.\n",
+ "\n",
+ "- The BLUE outlined columns are called **FEATURES**, which is the term that refers to the columns we would like to use to predict our chosen LABEL. \n",
+ "\n",
+ "Because the **training data** is complete, we can clearly separate LABEL from FEATURES. Remember that the training data is only the red and blue shaded regions of our dataset. \n",
+ "\n",
+ "We can ignore the rest of the dataset for now.\n",
+ "\n",
+ "**A)** Separate the training data into features and labels:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "7feef3b4-b0b8-41d0-b4b1-c1294753bd3f",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# The label only includes the summer 2020 cases per 100,000\n",
+ "S2020_training_labels = S2020_training[\"cases_per_100000\"]\n",
+ "\n",
+ "# Notice that in this code we are dropping the \"county\" column, because it does not contribute to our predictions, and \"cases_per_100000\", because that is our label\n",
+ "S2020_training_features = S2020_training.drop(columns=[\"county\",\"cases_per_100000\"])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "775e163b-ac56-40c8-a12c-89f13bfbce30",
+ "metadata": {},
+ "source": [
+ "**B)** Run the **LABEL** to check that the separation was correctly performed (you should see 40 rows and just 1 column):"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "49d72928-e153-4617-808d-9fb0140ef3dd",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This code allows you to see what the labels look like as a dataframe, after being separated from the training data\n",
+ "S2020_training_labels = pd.DataFrame(S2020_training_labels,columns = [\"cases_per_100000\"])\n",
+ "\n",
+ "# This code tells you how many rows and columns this dataset has\n",
+ "S2020_training_labels.shape"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "05b69eb8-d593-453b-9786-432dd425c612",
+ "metadata": {},
+ "source": [
+ "**C)** Run the **FEATURES** to check that the separation was correctly performed (you should see all 40 rows but only 9 columns, since we dropped the \"county\" and \"cases_per_100000\" columns):"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "1a75f401-1966-4628-b4fc-ee285bc8a595",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This code shows how many rows and columns the features dataset has\n",
+ "S2020_training_features.shape"
+ ]
+ },
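+ {
+ "cell_type": "markdown",
+ "id": "d9a4c3e6-5b7f-4c8a-9e32-4f6a8b0c2d33",
+ "metadata": {},
+ "source": [
+ "To see the split in isolation, here is a minimal sketch on a made-up three-row table. The column names `population` and `unemployment_rate` and every value are invented for illustration; only `county` and `cases_per_100000` match the names used in the real dataset. All variables start with `toy` so they do not overwrite anything from the lesson."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e0b5d4f7-6c8a-4d9b-8f43-5a7b9c1d3e44",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "\n",
+ "# Made-up data: every value below is invented for illustration\n",
+ "toy = pd.DataFrame({\n",
+ "    \"county\": [\"A\", \"B\", \"C\"],\n",
+ "    \"population\": [1000, 2000, 3000],\n",
+ "    \"unemployment_rate\": [0.10, 0.12, 0.15],\n",
+ "    \"cases_per_100000\": [500.0, 650.0, 800.0],\n",
+ "})\n",
+ "\n",
+ "# The label is the single column we want to predict\n",
+ "toy_labels = toy[\"cases_per_100000\"]\n",
+ "\n",
+ "# The features are everything except the identifier and the label\n",
+ "toy_features = toy.drop(columns=[\"county\", \"cases_per_100000\"])\n",
+ "\n",
+ "print(toy_features.shape)  # (rows, feature columns)\n",
+ "print(toy_labels.shape)    # (rows,)"
+ ]
+ },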
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "3c55a9ca-5d66-4bd2-aaa5-82ce6ee305a6",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "display_quiz('quiz_files/quiz2.json')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6a76b309-f312-425f-a4ac-e2ef7dacbfe0",
+ "metadata": {},
+ "source": [
+ "### Step 4) Create a Decision Tree object and train it"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d8b5abf2-94fe-4e2d-b24a-3b185b08c88e",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('M6gY_JywOys', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e82423a6-36cc-4898-aacb-727f98e9ffc1",
+ "metadata": {},
+ "source": [
+ "After separating our training data into features and labels, we can now create a Decision Tree. \n",
+ "\n",
+ "**A)** Create a Decision Tree object"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "1af78ddd-dc00-4298-9732-94684886a7b7",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This line creates the Decision Tree with your chosen specifications (what is written within the parentheses)\n",
+ "dtr_summer2020 = DecisionTreeRegressor(random_state = 1, max_depth= 3)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9a337b10-ecaf-4b2e-921f-5b830ac4f7a2",
+ "metadata": {},
+ "source": [
+ "**B)** Train our Decision Tree using the training data we separated in the previous step"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ac3b1e35-0609-4f64-bac6-c42537cb3694",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This line trains the decision tree using both the features and the label from our training data\n",
+ "dtr_summer2020 = dtr_summer2020.fit(S2020_training_features,S2020_training_labels)"
+ ]
+ },
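+ {
+ "cell_type": "markdown",
+ "id": "f1c6e5a8-7d9b-4eac-9a54-6b8c0d2e4f55",
+ "metadata": {},
+ "source": [
+ "The settings written within the parentheses control how the tree is built. As a hedged illustration on made-up numbers (the toy arrays and the names `shallow` and `unrestricted` are ours, not part of the lesson), the sketch below suggests how `max_depth` caps the number of questions the tree may ask in a row, while a fixed `random_state` keeps the result repeatable."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "a2d7f6b9-8eac-4fbd-8b65-7c9d1e3f5a66",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "import numpy as np\n",
+ "from sklearn.tree import DecisionTreeRegressor\n",
+ "\n",
+ "# Made-up data: one feature, eight rows, invented for illustration\n",
+ "toy_X = np.arange(8).reshape(-1, 1)\n",
+ "toy_y = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 8.0, 6.0, 9.0])\n",
+ "\n",
+ "# A tree limited to 2 levels of questions versus a tree with no depth limit\n",
+ "shallow = DecisionTreeRegressor(random_state=1, max_depth=2).fit(toy_X, toy_y)\n",
+ "unrestricted = DecisionTreeRegressor(random_state=1).fit(toy_X, toy_y)\n",
+ "\n",
+ "print(\"shallow depth:\", shallow.get_depth(), \"leaves:\", shallow.get_n_leaves())\n",
+ "print(\"unrestricted depth:\", unrestricted.get_depth(), \"leaves:\", unrestricted.get_n_leaves())"
+ ]
+ },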
+ {
+ "cell_type": "markdown",
+ "id": "dbaec858-5a15-4727-b9fb-ac1d6bd08fd8",
+ "metadata": {},
+ "source": [
+ "### Step 5) Visualize our trained Decision Tree"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c8c03608-f9d7-43d0-abd3-757b6196023e",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('cFk6vmfU48w', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "96d80ff1-924c-4aef-ad9d-68634911deb9",
+ "metadata": {},
+ "source": [
+ "Visualize our Decision Tree by graphing it using the following code "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "68789480-9f14-43ab-b49a-145bf462b132",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Initialize tree data object \n",
+ "dtr_summer2020_dot = tree.export_graphviz(dtr_summer2020, out_file=None, \n",
+ " feature_names=S2020_training_features.columns, \n",
+ " filled=False, rounded=True, impurity=False)\n",
+ "\n",
+ "# Draw graph\n",
+ "dtr_graph = graphviz.Source(dtr_summer2020_dot, format=\"png\") \n",
+ "dtr_graph"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4b6af3a1-3ba0-4759-9883-6d03123548ae",
+ "metadata": {},
+ "source": [
+ "### Let's try to understand what our tree learned!\n",
+ "\n",
+ "- **NODES** contain the decision that must be made based on a particular criterion. You can see that nodes have 2 arrows pointing away from them. The arrow to the LEFT is taken when the criterion is satisfied, and the arrow to the RIGHT is taken when the criterion is not satisfied.\n",
+ "\n",
+ "- The **ROOT NODE** is located at the top of the tree. It holds the feature that best splits the data, which is what our model determined to be the most important feature to consider when making our predictions.\n",
+ "\n",
+ "- **LEAVES** contain the final outcome of the decision path. You can see that leaves do not have arrows pointing away from them."
+ ]
+ },
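+ {
+ "cell_type": "markdown",
+ "id": "b3e8a7c0-9fbd-40ce-9c76-8d0e2f4a6b77",
+ "metadata": {},
+ "source": [
+ "The same rules can also be printed as plain text. The sketch below is a minimal illustration on made-up data (the feature name `unemployment_rate` is borrowed from the lesson, the numbers are invented). It uses scikit-learn's `export_text`, where each `<=` line corresponds to a LEFT arrow and each `>` line to a RIGHT arrow."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c4f9b8d1-0ace-41df-8d87-9e1f3a5b7c88",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "import numpy as np\n",
+ "from sklearn.tree import DecisionTreeRegressor, export_text\n",
+ "\n",
+ "# Made-up data: invented unemployment rates and case counts\n",
+ "toy_X = np.array([[0.05], [0.08], [0.11], [0.14], [0.17], [0.20]])\n",
+ "toy_y = np.array([400.0, 450.0, 500.0, 900.0, 950.0, 1000.0])\n",
+ "\n",
+ "toy_tree = DecisionTreeRegressor(random_state=1, max_depth=2).fit(toy_X, toy_y)\n",
+ "\n",
+ "# Print every node as an indented rule; leaves show the predicted value\n",
+ "print(export_text(toy_tree, feature_names=[\"unemployment_rate\"]))"
+ ]
+ },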
+ {
+ "cell_type": "markdown",
+ "id": "9dd48904-36b7-4747-b9e9-8b6c985ff049",
+ "metadata": {},
+ "source": [
+ "### Step 6) Make predictions using Testing data with our trained Decision Tree"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "5d0d1906-bc34-4a59-b05f-99167175ac7c",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('LtD93dB5JzU', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c56def25-4c94-4de1-a0a8-02b5075988eb",
+ "metadata": {},
+ "source": [
+ "We are now ready to make predictions for the counties that had the missing labels.\n",
+ "\n",
+ "**Below is an image showing what constitutes the testing data in our example**\n",
+ "\n",
+ "![Testing-Data.jpg](images/Testing-Data.jpg)\n",
+ " \n",
+ "In machine learning, we usually call the part of the dataset that contains only the FEATURE columns the **testing data**. \n",
+ "\n",
+ "The **testing data** is the dataset that is used to predict the missing values of the LABEL column, based on the rules learned during the training phase.\n",
+ "\n",
+ "Recall that our Decision Tree model has only been taught using the training data (40 counties) and has never seen any of the rows of the testing data (18 counties)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "20da60cd-48c3-4acb-be33-dc4565d3947a",
+ "metadata": {},
+ "source": [
+ "**A)** Let's load the testing data that corresponds to the counties with the missing label and see what it looks like."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d5c75c05-745c-4c0c-bf31-c839e43c9ed5",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This opens the file that contains the testing data features, features = data used to make a prediction\n",
+ "S2020_testing_features = pd.read_csv(\"S2020_test_features.csv\")\n",
+ "\n",
+ "# This lets you see the loaded testing data \n",
+ "S2020_testing_features"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "046ef181-0158-4fb3-a726-42046fa4e713",
+ "metadata": {},
+ "source": [
+ "**B)** Let's drop the county names from the dataset and make our predictions!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c9062cad-cde4-4596-a7e5-08203fd77dd7",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This drops the \"county\" column from our test dataset\n",
+ "S2020_features_test_nocounty = S2020_testing_features.drop(columns=[\"county\"])\n",
+ "\n",
+ "# This uses the tree we created and makes the predictions\n",
+ "S2020_labels_pred = dtr_summer2020.predict(S2020_features_test_nocounty)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "33e5fb42-21b8-4983-ade9-83182ef070a4",
+ "metadata": {},
+ "source": [
+ "**C.1)** Let's look at what labels our model predicted and check how it relates to our Decision Tree:\n",
+ "\n",
+ "![COVID-Decision-Tree.PNG](images/COVID-Decision-Tree.PNG)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "2b43dd45-625b-4495-9cb8-2bd08e459d75",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This turns our predictions (which are currently an array) into a dataframe \n",
+ "S2020_labels_preds_df = pd.DataFrame(S2020_labels_pred, columns=[\"Predicted\"])\n",
+ "\n",
+ "# This line adds the county name back, so that you can see what was predicted for each county\n",
+ "S2020_labels_preds_df = pd.concat([S2020_testing_features[\"county\"].reset_index(drop=True),S2020_labels_preds_df.reset_index(drop=True)],axis=1)\n",
+ "\n",
+ "# This lets us see what was predicted\n",
+ "S2020_labels_preds_df.round(3)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9c5c9ddc-f972-424a-baaf-e5876ff0d847",
+ "metadata": {},
+ "source": [
+ "**C.2)** Why did the model predict 702.806 for San Francisco County?\n",
+ "\n",
+ "Run the cell below and look at the output. Follow the tree as described in the video to see that this county has: \n",
+ "- Unemployment Rate <= 0.123\n",
+ "- Population > 28453.0\n",
+ "- Green_votes_percentage > 0.005\n",
+ "\n",
+ "Feel free to try another county and check for yourself that it follows these rules, by changing the county name in the code below:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ff7dd8fa-a9b3-4570-923a-882f198d53b3",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Loading the testing features for San Francisco County\n",
+ "S2020_testing_features[S2020_testing_features['county']=='San Francisco'] # change 'San Francisco' to any other county in the list above that you are interested in"
+ ]
+ },
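+ {
+ "cell_type": "markdown",
+ "id": "d5a0c9e2-1bdf-42e0-9e98-0f2a4b6c8d99",
+ "metadata": {},
+ "source": [
+ "Following the arrows by hand works well for one county, but scikit-learn can also report the path for you. This is a minimal sketch on made-up data (all names starting with `toy_` and the name `sample` are ours). It uses the tree's `decision_path` and `apply` methods to list the nodes a single row passes through and the leaf it ends in."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e6b1d0f3-2ce0-43f1-8fa9-1a3b5c7d9e00",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "import numpy as np\n",
+ "from sklearn.tree import DecisionTreeRegressor\n",
+ "\n",
+ "# Made-up data: two invented features per row\n",
+ "toy_X = np.array([[0.05, 100], [0.08, 200], [0.11, 300],\n",
+ "                  [0.14, 400], [0.17, 500], [0.20, 600]], dtype=float)\n",
+ "toy_y = np.array([400.0, 450.0, 500.0, 900.0, 950.0, 1000.0])\n",
+ "\n",
+ "toy_tree = DecisionTreeRegressor(random_state=1, max_depth=2).fit(toy_X, toy_y)\n",
+ "\n",
+ "# Pick one row (kept 2-D: one row, two columns) and trace it through the tree\n",
+ "sample = toy_X[[1]]\n",
+ "visited = toy_tree.decision_path(sample).indices  # ids of every node on the path\n",
+ "leaf_id = toy_tree.apply(sample)[0]               # id of the leaf the row ends in\n",
+ "\n",
+ "print(\"nodes visited:\", visited)\n",
+ "print(\"leaf reached:\", leaf_id)\n",
+ "print(\"prediction:\", toy_tree.predict(sample)[0])"
+ ]
+ },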
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f295afb6-142f-4d8e-9ef3-5658916f3624",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "display_quiz('quiz_files/quiz3.json')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "aff22ea7-270b-47dd-ba7f-a2a62ba1cfe5",
+ "metadata": {},
+ "source": [
+ "### Step 7) Let's see how our Decision Tree model performed"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "9a11c706-b8ba-4b5c-bfde-eed545a00248",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('0VK4sLz2wrc', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "a9d8738a-f8b2-4679-9ed5-d7748aaf6bf9",
+ "metadata": {},
+ "source": [
+ "Now that we have predicted the missing labels for Summer 2020 cases, let's see how our model did by comparing it with the actual labels!\n",
+ "\n",
+ "**A)** Let's now reveal our ACTUAL labels by loading them into the notebook"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "3cd40255-1bd5-4d11-8f6f-1b9bd29718c0",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This opens the file that contains the testing data labels, label = what we want to predict\n",
+ "S2020_testing_labels = pd.read_csv(\"S2020_test_labels.csv\")\n",
+ "\n",
+ "# This drops the county column from our label data so that the dataframe has only one column of county names when it is joined with the predicted dataframe\n",
+ "S2020_testing_labels = S2020_testing_labels.drop(columns=[\"county\"])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4037a417-05e3-43e5-b63e-a3562da3c44b",
+ "metadata": {},
+ "source": [
+ "**B)** We can use a bar graph to help us visually inspect how our model performed"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e7cc9c31-b0cd-4bd6-a544-4c0b143d50c8",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This combines our original test labels and our predictions into a single dataframe \n",
+ "pred_vs_test_2020 = pd.concat([S2020_testing_labels.reset_index(drop=True),S2020_labels_preds_df.reset_index(drop=True)],axis=1)\n",
+ "\n",
+ "# Reorganize the order of columns\n",
+ "pred_vs_test_2020 = pred_vs_test_2020.loc[:,[\"county\", \"cases_per_100000\",\"Predicted\"]]\n",
+ "\n",
+ "# This plots the data in a barchart per county\n",
+ "pred_vs_test_plot = pred_vs_test_2020.plot.barh(color={\"Predicted\": \"orange\", \"cases_per_100000\": \"darkblue\"},x=\"county\",figsize=(15,15), yticks=np.arange(0,4000,500))\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0ae29213-1635-4a51-9593-20a59bf2a60e",
+ "metadata": {},
+ "source": [
+ "### Step 8) Let's try using our Summer 2020 tree model to predict 2021 data"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "fee74988-b620-4941-89f1-1d6ebb5ef0cc",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('2r3ZpwM6xDQ', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e6d838dc-e9fe-4721-b866-07068f5126b3",
+ "metadata": {},
+ "source": [
+ "**A)** Let's load the features information for the same 18 counties, but this time for Summer 2021."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "3712f5eb-75ad-4f96-9106-9b51a19c0831",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Importing Summer 2021 data to predict using \"Summer2020 Model\"\n",
+ "S2021_testing_features = pd.read_csv(\"S2021_test_features.csv\")\n",
+ "\n",
+ "# Make predictions for Summer 2021 Data\n",
+ "S2021_labels_pred = dtr_summer2020.predict(S2021_testing_features.drop(columns=[\"county\"]))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "1c8d062d-42b2-419c-a703-a3a4537fdfb0",
+ "metadata": {},
+ "source": [
+ "**B)** Let's now load the actual Summer 2021 data and see how our 2020 Decision Tree model performed this time."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "7212fb89-7ff6-488c-907e-0177ccacb6bf",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Importing labels of Summer 2021 data to check accuracy of \"Summer2020 Model\" predicting Summer2021 Data\n",
+ "S2021_testing_labels = pd.read_csv(\"S2021_test_labels.csv\")\n",
+ "\n",
+ "# This turns our predictions (which are currently an array) into a dataframe \n",
+ "S2021_labels_preds = pd.DataFrame(S2021_labels_pred, columns=[\"Predicted\"])\n",
+ "\n",
+ "# This combines our original test labels and our predictions into a single dataframe \n",
+ "pred_vs_test_2021 = pd.concat([S2021_testing_labels.reset_index(drop=True),S2021_labels_preds.reset_index(drop=True)],axis=1)\n",
+ "\n",
+ "# Visualize performance for Summer 2021 predictions\n",
+ "pred_vs_test_plot = pred_vs_test_2021.plot.barh(color={\"Predicted\": \"orange\", \"cases_per_100000\": \"teal\"},x=\"county\",figsize=(15,15), yticks=np.arange(0,4000,500))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "5cc98d6b-7118-4d3f-a144-a8b8e55d2cde",
+ "metadata": {},
+ "source": [
+ "**C)** Another way to look at the difference in performance between predictions made by the model for 2020 vs 2021 data is to observe their difference in errors.\n",
+ "\n",
+ "We can see that the histogram of errors for 2020 (blue) is closer to 0 overall, ranging from -500 to 500, whereas the histogram of errors for 2021 (orange) is far more spread out, ranging from -1000 to 2500."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "332a67fd-30ec-4acd-bba3-7ce9855a3786",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Create columns holding error between actual rate vs. predicted rate\n",
+ "pred_vs_test_2020['residual'] = pred_vs_test_2020['cases_per_100000'] - pred_vs_test_2020['Predicted']\n",
+ "pred_vs_test_2021['residual'] = pred_vs_test_2021['cases_per_100000'] - pred_vs_test_2021['Predicted']\n",
+ "\n",
+ "# Plot errors on histogram\n",
+ "plt.title('Cases per 100k Prediction Errors')\n",
+ "plt.hist(pred_vs_test_2020['residual'], alpha=0.5, label='2020 data')\n",
+ "plt.hist(pred_vs_test_2021['residual'], alpha=0.5, label='2021 data')\n",
+ "plt.legend(loc='upper right')\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "78323de1-ec64-43d3-a21d-cf93d97db995",
+ "metadata": {},
+ "source": [
+ "**D)** A more formal way to measure the model's performance is to calculate the Root Mean Square Error (RMSE). Feel free to browse the **(Optional) Quant. Comparison of 2020 DT Model Performance for (2020 vs 2021) Data** notebook for more details about this particular metric."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "374a7f45-8bea-4b97-9ba5-12eec94e34a5",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This prints the RMSE value for the performance of the model using 2020 Data\n",
+ "print(f\"RMSE on 2020 test set: {mean_squared_error(pred_vs_test_2020['cases_per_100000'], pred_vs_test_2020['Predicted'], squared=False)}\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "2a4fe185-6b1c-47e0-ba16-4a181d47661c",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# This prints the RMSE value for the performance of the model using 2021 Data\n",
+ "print(f\"RMSE on 2021 test set: {mean_squared_error(pred_vs_test_2021['cases_per_100000'], pred_vs_test_2021['Predicted'], squared=False)}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6ef11332-1228-4fe5-97f0-a4fb5ae924bf",
+ "metadata": {},
+ "source": [
+ "#### Please run the additional cell below to save a CSV copy of the predicted and actual values from our 2020 model for both years (2020 and 2021)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e2ba12ba-dfb4-4b44-98c0-e2fdb0226a86",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Let's save our comparison dataframes as CSV files for a more quantitative analysis\n",
+ "# We will revisit these in the next notebook.\n",
+ "pred_vs_test_2020.to_csv('Model2020pred_vs_test_2020.csv', encoding='utf-8',index=False)\n",
+ "pred_vs_test_2021.to_csv('Model2020pred_vs_test_2021.csv', encoding='utf-8',index=False)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3232e1d1-6190-4dcc-8747-621fd78ccbfd",
+ "metadata": {},
+ "source": [
+ "# Conclusion\n",
+ "\n",
+ "Congratulations on completing the \"Introduction to Machine Learning: Decision Trees\" module! Throughout this notebook, you have gained a foundational understanding of Decision Trees, a fundamental machine learning technique. By working with real-world COVID-19 data from California, you have learned how to: \n",
+ "\n",
+ "1. **Understand Decision Trees:** \n",
+ " Recognize how Decision Trees function as supervised machine learning models, making predictions based on learned decision rules.\n",
+ "2. **Prepare Data:** Load, inspect, and preprocess data to separate it into features and labels, crucial steps for training machine learning models. \n",
+ "3. **Train a Model:** \n",
+ " Create and train a Decision Tree model using the training dataset. \n",
+ "4. **Visualize and Interpret:**\n",
+ " Visualize the trained Decision Tree and understand the decision-making process at each node. \n",
+ "5. **Make Predictions:**\n",
+ " Use the trained model to predict missing values in the dataset and evaluate its performance. \n",
+ "6. **Evaluate Performance:**\n",
+ " Compare the model's predictions to actual values using visualizations and quantitative metrics such as Root Mean Square Error (RMSE). \n",
+ " \n",
+ "This module has equipped you with the skills to apply Decision Trees to various datasets and understand their potential and limitations. Remember, Decision Trees are the foundation for more complex models like Boosted Trees and Random Forests. \n",
+ "\n",
+ "We hope you found this module informative and engaging. Keep experimenting with different datasets and machine learning techniques to further enhance your skills. Happy learning! "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "5f15de83-339b-4d21-94c2-9cd5a5be6a92",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "# Clean up \n",
+ "\n",
+ "To keep your workspace organized, remember to: \n",
+ "\n",
+ "1. Save your work.\n",
+ "2. Close any notebooks and active sessions to avoid extra charges.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b473022e-a62b-4f82-a7cf-565dd2691168",
+ "metadata": {},
+ "source": [
+ "## *Acknowledgments*\n",
+ "\n",
+ "This notebook was created by Lucy Moctezuma Tan, Florentine van Nouhuijs, Lorena Benitez-Rivera (SFSU master's students and CoDE lab members), and Pleuni Pennings (SFSU bio professor).\n",
+ "Special acknowledgment to Faye Orcales for pulling the COVID data tables from government websites.\n"
+ ]
+ }
+ ],
+ "metadata": {
+ "environment": {
+ "kernel": "python3",
+ "name": "common-cpu.m114",
+ "type": "gcloud",
+ "uri": "gcr.io/deeplearning-platform-release/base-cpu:m114"
+ },
+ "kernelspec": {
+ "display_name": "conda_python3",
+ "language": "python",
+ "name": "conda_python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.14"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/2- (Optional) Quant. Comparison of 2020 DT Model Performance for (2020 vs 2021) Data .ipynb b/Google Cloud/2- (Optional) Quant. Comparison of 2020 DT Model Performance for (2020 vs 2021) Data .ipynb
similarity index 85%
rename from 2- (Optional) Quant. Comparison of 2020 DT Model Performance for (2020 vs 2021) Data .ipynb
rename to Google Cloud/2- (Optional) Quant. Comparison of 2020 DT Model Performance for (2020 vs 2021) Data .ipynb
index 44d9bd3..ae85dfd 100644
--- a/2- (Optional) Quant. Comparison of 2020 DT Model Performance for (2020 vs 2021) Data .ipynb
+++ b/Google Cloud/2- (Optional) Quant. Comparison of 2020 DT Model Performance for (2020 vs 2021) Data .ipynb
@@ -5,8 +5,9 @@
"id": "8807397b-a776-4baf-b739-178e87bfbfa4",
"metadata": {},
"source": [
- "# Quantitative Comparison of 2020 Decision Tree Model using (2020 vs 2021) Data Features\n",
+ "# **Quantitative Comparison of 2020 Decision Tree Model using (2020 vs 2021) Data Features**\n",
"\n",
+ "# Overview \n",
"Let's take a look at the bar graphs we have for both Decision Tree models (2020 vs 2021) created in the previous notebook:\n",
"At first glance we can see that in general the model had more accurate predictions in 2020 than in 2021, more of the yellow bars (predictions) are similar to the blue ones (2020 data); whereas in 2021, most of the yellow bars (predictions) are different from the teal colored bars (2021 data). \n",
"\n",
@@ -15,20 +16,94 @@
"![Model-performance-comparison.jpg](images/Model-performance-comparison.jpg)"
]
},
+ {
+ "cell_type": "markdown",
+ "id": "796c3b6f-919e-4225-bcf6-b8a6648e4e62",
+ "metadata": {},
+ "source": [
+ "# Learning Objectives\n",
+ "\n",
+ "- Compare Model Performance\n",
+ " - Evaluate the performance of a decision tree model using data from different years (2020 vs. 2021).\n",
+ " - Interpret bar graphs and other visualizations to understand model accuracy. \n",
+ "- Calculate and Interpret RMSE\n",
+ " - Calculate the Root Mean Square Error (RMSE) to quantify model performance. \n",
+ " - Compare RMSE values to determine which dataset the model performs better on. \n",
+ "- Analyze Data Differences \n",
+ " - Examine summary statistics and distributions of training data from different years.\n",
+ " - Identify how changes in data features (e.g., vaccination rates, unemployment rates) impact model performance.\n",
+ "- Understand Correlations and Trends \n",
+ " - Analyze correlations between features and the target variable (cases per 100K) for different years. \n",
+ " - Use scatterplots and trend lines to visually inspect relationships between variables. \n",
+ "- Identify Causes of Model Performance Changes\n",
+ " - Understand the concept of data drift and how it affects model performance over time. \n",
+ " - Recognize the importance of retraining models to adapt to changes in data distributions and relationships. \n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "dcfb00d9-4ee7-4a9c-8e0b-2b1c02420b5e",
+ "metadata": {},
+ "source": [
+ "# Prerequisites "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "364c2871-9d2b-4ee5-9b32-c1c0ab4d1bfa",
+ "metadata": {},
+ "source": [
+ "***Modules*** \n",
+ "- Learning Module ***Introduction to Machine Learning: Decision Trees***\n",
+ "\n",
+ "***Data Sources***\n",
+ "- Model2020pred_vs_test_2020.csv (module 1)\n",
+ "- Model2020pred_vs_test_2021.csv (module 1)\n",
+ "\n",
+ "***Libraries/Packages***\n",
+ "- Pandas\n",
+ "- NumPy\n",
+ "- Matplotlib\n",
+ "- Seaborn\n",
+ "- Scikit-learn"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "65a13b55-467d-4c48-93d2-2f9e71c5752d",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "# Get Started"
+ ]
+ },
{
"cell_type": "markdown",
"id": "dd0b4ee2-8014-433f-bf5e-4c6bb820000b",
"metadata": {},
"source": [
- "### 1) Import libraries needed to examine the differences in performance of Summer 2020 model"
+ "### Step 1) Import libraries needed to examine the differences in performance of Summer 2020 model"
]
},
{
"cell_type": "code",
- "execution_count": null,
+ "execution_count": 2,
"id": "4dfbffc7-aa04-4703-aacf-c8f7e0da70dd",
- "metadata": {},
- "outputs": [],
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "Matplotlib is building the font cache; this may take a moment.\n",
+ "/home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages/seaborn/_statistics.py:32: UserWarning: A NumPy version >=1.23.5 and <2.3.0 is required for this version of SciPy (detected version 1.22.4)\n",
+ " from scipy.stats import gaussian_kde\n"
+ ]
+ }
+ ],
"source": [
"# Data Wrangling Imports\n",
"import pandas as pd\n",
@@ -48,7 +123,7 @@
"id": "3a60d7c5-ae36-4dc6-b3d9-abef5e2f44a3",
"metadata": {},
"source": [
- "### 2) Let's load the dataframes showing the differences between actual data and predictions for both years\n",
+ "### Step 2) Let's load the dataframes showing the differences between actual data and predictions for both years\n",
"\n",
"**A)** Actual vs Predictions for 2020 data made by our model \n",
"\n",
@@ -96,7 +171,7 @@
"id": "2dcf3bc7-9336-4dd6-ac8f-0178b3fa55ff",
"metadata": {},
"source": [
- "### 3) Calculating the Root Mean Square Error (RMSE)\n",
+ "### Step 3) Calculating the Root Mean Square Error (RMSE)\n",
"\n",
"**Root Mean Square Error (RMSE):** is a measurement that shows us how far apart our Predicted values are from our Actual values on average. The lower it is the better the performance of the model. The RMSE allows us to see the average error using the units of our label (Cases_per_100K).\n",
"\n",
@@ -145,7 +220,7 @@
"tags": []
},
"source": [
- "### 4) Why do you think the accuracy of the model decreased from year 2020 to year 2021?\n",
+ "### Step 4) Why do you think the accuracy of the model decreased from year 2020 to year 2021?\n",
"\n",
"Machine learning models depend highly on the data they were trained on, therefore to understand why the performance decreased, we can take a look at the differences between the training data from year 2020 versus the one from 2021 \n",
"\n",
@@ -402,7 +477,8 @@
"id": "3e5f931f-52d2-46a4-be19-d8456fd0be9e",
"metadata": {},
"source": [
- "### 5) Reasons that our 2020 model performed worse when predicting 2021 Data\n",
+ "# Conclusion\n",
+ "### Reasons that our 2020 model performed worse when predicting 2021 Data\n",
"\n",
"All in all we can see that the training data variables in the years 2020 and 2021 do not have the same relationship with our target variable (cases_per_100k) as showcased by the trend lines we see in our scatterplots. There are different kinds of changes that can happen in real life once our model is deployed, this is why it's important to to retrain our model from time to time. For more information about the different kinds of changes (or drifts) that could occur over time check out this useful [website](https://arize.com/model-drift/?utm_source=google&utm_medium=cpc&utm_campaign=18216725893&utm_content=139136719885&utm_term=data%20drift&utm_term=data%20drift&utm_campaign=Monitor+ML+-+Search&utm_source=adwords&utm_medium=ppc&hsa_acc=9379871348&hsa_cam=18216725893&hsa_grp=139136719885&hsa_ad=620214790996&hsa_src=g&hsa_tgt=kwd-328660210229&hsa_kw=data%20drift&hsa_mt=e&hsa_net=adwords&hsa_ver=3&gclid=CjwKCAjw-L-ZBhB4EiwA76YzOet2ULiqRzKwwxgXsCJhh7NgueokMbk9sBee2XAX4WtP4aaEMxPrIxoCHRsQAvD_BwE).\n",
"\n",
@@ -417,12 +493,23 @@
]
},
{
- "cell_type": "code",
- "execution_count": null,
- "id": "03126bad-6ca9-4493-be0e-bcfef6958a00",
+ "cell_type": "markdown",
+ "id": "275552df-52a9-4416-b2ca-065db3bd251d",
"metadata": {},
- "outputs": [],
- "source": []
+ "source": [
+ "# Clean up\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "57f1df8d-310a-4a37-b5b0-29a97300239a",
+ "metadata": {},
+ "source": [
+ "To keep your workspace organized remember to: \n",
+ "\n",
+ "1. Save your work. \n",
+ "2. Close any notebooks and active sessions to avoid extra charges. "
+ ]
}
],
"metadata": {
@@ -433,9 +520,9 @@
"uri": "gcr.io/deeplearning-platform-release/base-cpu:m108"
},
"kernelspec": {
- "display_name": "Python 3",
+ "display_name": "conda_python3",
"language": "python",
- "name": "python3"
+ "name": "conda_python3"
},
"language_info": {
"codemirror_mode": {
@@ -447,7 +534,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.7.12"
+ "version": "3.10.14"
}
},
"nbformat": 4,
diff --git a/Google Cloud/3- Practice.ipynb b/Google Cloud/3- Practice.ipynb
new file mode 100644
index 0000000..d3867e4
--- /dev/null
+++ b/Google Cloud/3- Practice.ipynb
@@ -0,0 +1,318 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "423e9402-a82b-4834-a3a0-931a2c685ba7",
+ "metadata": {},
+ "source": [
+ "# **Practice: Let's make a NEW Decision Tree for Summer 2021 and improve our predictions!**\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "302b6319-f7a7-4d82-a9d9-8d1ffb16858f",
+ "metadata": {},
+ "source": [
+ "# Overview \n",
+ "In this module, you will put into practice what you have learned in the ***Introduction to Machine Learning: Decision Trees*** module by creating, training, and evaluating a decision tree model using Summer 2021 data, enhancing your ability to adapt machine learning models to new datasets and assess their performance. \n",
+ "\n",
+ "In order to expedite the making of the NEW Decision Tree, we can skip a few steps, and only copy-paste the required lines of code.\n",
+ "\n",
+ "* You DON'T need to copy-paste the comments from the original code (the green text that is preceded by \"#\"). \n",
+ "* Instead, follow the instructions written as comments in the following exercise to create a NEW Decision Tree for Summer 2021 data.\n",
+ "\n",
+ "### **Walkthrough Solution:**\n",
+ "If you feel stuck on this exercise, feel free to follow the video walkthrough below by **Florentine**."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "536f933f-a925-4fe9-945b-87c48fb98ecb",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDBoYFhsaGBoeHRsfIiomIyIhIiolJigoMCkyMC0oLS01PVBCNThLOS0tRWFFS1NWW1xbMkJlbWRYbFBZW1cBERISGRYYLhsbLVc/NT1XV1dXV1dXV1dXV2NXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV11XXVdXV1deXVddV1dXV//AABEIAWgB4AMBIgACEQEDEQH/xAAbAAEAAwEBAQEAAAAAAAAAAAAAAgMEAQUGB//EAEAQAAICAAMFBAcFBwQCAwEAAAABAhEDEiEEEzFBUQVhcZEUIlKBodHwFSMyU7EzQnKSweHxBkNiohayY3PSJP/EABcBAQEBAQAAAAAAAAAAAAAAAAABAgP/xAAfEQEBAQADAQACAwAAAAAAAAAAARECEiExIkEDMmH/2gAMAwEAAhEDEQA/APz8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHtYfZMZZqy+rxuT+BN9iKr9Tg3+N8iMYt8FZLCwnJ1FfFLnXFgRxOx1Gry6utJPz8Dq7FtJqMdVf4tfAji+paejTpoi27rI76VqEWy7Eq7jGkm/x9PeRl2OlVpK+HrHHCaV7uVacuvA5Uvy5eTCox7Ni1N5V6it+t+nUv7I7A9LxJwhKMMqu5X1Kmp6fdy1dLR8UIRxM3qRxIt+zcScpbPCIfZDvESyvJNxfrU20+KT5HfsWfJRfhNfXMjuZSf4Jt+DbvyI7r/jLQsCPZjbrS7a/FwaaWr95P7Hl7K/mRBYd/usLD/wCLCJR7JbWijxaazrkc+ynTdR0bT9bXQ48L/iwsPnTCq47EnCU6VRaT1116dTX2b2J6S5qMoxyK3mdX4FO6fsv4HYxktUpL30WZvqX/AByHZbcM6y1dfi14tcPd8UTfYsk69T3TRDdf8X8BuX7LIK8fYd20pLirVO9CrcR6GlYT9mRxw7mBn3Eehds3Z+9bUa0TerrgT3Tq8rORhfCMgqv0JZM9LLmy8dbq+BLA7P3jko16sXJ3JR0XiyW7/wCMju7fsyAzbiPQbiPQ0ShXFNEdO8Ip3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvCqdxHoNxHoXad407wKdxHoNxHoXad407wincR6DcR6GhQ0unS5nVh2nJRlS4sDP6PGr/rqFgR6fE14ez5vw29L9yLcLs+c8RYUYt4jdKKriMN9x572eK/yWw2KLhKd1XLXXhz9/wADT6E87g01JXalpVau78GSxOzpRxN3JVKk+KejVp2u5gtx56wI9DRgbDD0jCw5VJSxIJ03TTkk6fvL9o2F4cssuJZsWz1tOCnx3mG/+yaCTlLNj6P/AMY2P8p/zz+Y/wDGNj/Kf88/me1QoK8X/wAY2P8AKf8APP5nH/pjY/yn/PP5nt0KCvDf+mdj/Kf88/mRf+mtj/K/7z+Z7ckeVt/auR5cOOdri+QXGd/6b2T8r/vL5kH/AKd2T8p/zy+Zs2HtFYukllkacQmpZjyf/H9k/K/7y+Z1f6f2Pnhf95fM0vFZxMD5pacGdhNxdp17j0NtwcOMZKMFFJLLNSbblzjV+PkZNk2jdybyqVqqfCuaM8OfebF58bxZpwT4slKU3LO8STl
VX3G3B2+McNQeEm1+9pfHjw4rTyLPtKNVuk9Kt8b9rhVm2WDfYlJb2VLgFjYi4YsjdLtGDb+6StNXfrJutVy0ohPbVJweWKUHdVblrrb4fAKywxsRO94746rnws7HaMVXWLLx5+7zfmzRhbThxgoOCxK/e4PV38ifpuDX7CN1T18Pl4gYli4i4YsiDzPXOze9rwadYC104/H66EfS8PJW6WfKkpXw9WuHmwMDg3+8ztP22ejHbcOvWwszXBOqWnBacNDj2zC/IT0p6/2/wBgSftsisP8A5M9B7XhZ29yqpJK6rjb99kpbZgv/AGF42EedkftsNP22bcPacJRSeDbpK759eHv+B3aNrw5p5cLLbt17/r3hWHK/afkMj9tnoT2vB1rBV6Vfhr46lO2Y8MSScMPJpw+vADM03xmzmV+0y7ExE0lSVfXz8ysIjlftsKDXCbJADnre2xT9tnQBCWHfGTZHcLq/ItAFW4XV+Q3C6vyLQBVuF1+A3C6/AtAFW4XX4DcLq/ItAVVuF1fkNwur8i0AVbhdX5DcLr8C0BFW4XX4DcLr8C0AVbhdfgNwuvwLQBVuF1+A3C6/A9TD2TBez53ifeZZPLnjxTdLLx4alMdli0msWFtK09KdXQGTK8uXM66UFBqLjmdPiq+uiNktjSdPFha5amaSpvnQDDbg04umuDJ4ONKE1OEmpq6kuOvMux9kUIRlni26tKUXVrub7zr2OOv30PMaYpljzeI8Rybm2220nd8bvxOz2mcp55SuVJXS4JUlS7tDmFhKWJGDmoxclFz5JXWbw5l/amywwcXJCTksqeri3F+y8rav5gsZ8bGeI7k9S3YXe0YLb/3If+yNsNjw91ema67+C1v3mLYdNpwv/th/7Irj/F/Jx5bx4z4+6oULOh2co4yRxkV5nbO17rCdOm9EeAsRR4mn/U2Pei/dMOkldcjNdeL1cFp0v3qtMuwu08Obyt5ZdHz8GU7IllhNxpxT17jxNs1jDE9pP9WSX05x7z4nUzHsGJeDDwo0ZjTi+dNGFtTjHLkjJXfrKzOmX4e04azXCLTdpXwXS+JVWen/APxYfH2e6ivB2lwzerGWartWhtO0Yc6yQjCujuyOHjxTi8sXSaa43x11+tALVtzX+3h/ylePj569SMa9lUTxtrwpJqOFGLeiea6M6d8NQNj7RbdvDhxvhr5lONtOdVkiudpa8SjMuqGZdV5gaI7VSSyQdKtVqWenafssO+uX+hjzLqhmXVAXYO0OGb1U1Limiz01/lw/lMuZdRa6gWvHedzSS7louFE57XcXHdwV9EZ8y6oZl1QF+FtTikssXXC1rxsjLGuedRS7lwKsy6oZl1QFzx7d5ESe0r2EZ8y6rzGddUBf6Qq/AvgceMqSyIpzLqvMZl1XmBo9IX5cR6Svy0Z8y6oZl1QEpyt3VHDmZdUMy6oDoOZl1QzLqB0HM66oZl1QHQRc11QzrqgJA5nXVDOuqAWdOZl1XmMy6oDoOZl1QzLqgOg5mXVDMuqAWdOZl1R2M4rjT94HVFvhFvwQyPo/I17LtMIQS30oO22kk1fC/IlHHwZSxN5jSpuOqSealVvTyAxZH7L8iNlqxIdV8P8A8le8jlrS743rXToBwWTwJYeeO8twv1lFq2uissntcZSk3CCTqorRRS4Jc/eBQC1Y8PYj/MRnixfBJe8AsaVVbroW7A/v8H/7If8AsirBxYxdyUZdzbX6DBl68WnqpKvMJJJ8feRxLJZjxZShq47TJJXo9W6b/sWJxuvS5fXvKj18x5va23ShBqDrv5maWJrXpEmqbtWtU+HvV/Aw7dg4dxUtqbi+LXJ8upL8ajy8STlGTk7t8wk4JRnajyly8C70LBcH/wD1puvVVVb79eHz7qM0ZSw5OLedV1zL4mZPGu2Xx6UsdQwko3K078Cc9k3mzUuMdV79TzXtKk9E11s9zYHSrk0vNGOXla3fry+zMSsOujZreJ3lfa2wywJ+qrhiXJV
8UY8PF5VWnWzWsYorRnqdi9m7Fi7JOePNLETlmbxMrw0l6jjH962eWWYeFFxbbprw/wAs2jDgSivxxb4cH5osxMTDa9WDi+t3XD+/mXUAKt5hWrw3yvV1yv8Ar5nYzwqejUsujt1dLl5mzB2HEms0UmvFL64DF2GcIuTqlxpp19WBjji4WmbDba6Nq9PnZHeYWn3b4a+s+P1Zvwez8ScVKKVPhqlw/wAEvszF6JdbfDp+oHkSat1ouRw9XE2GcYuTSpd6/QYew4koZ0ll6tpc6A8pMHsLs7Eb5V1vTWvmiuexzTiqXrcOV6J1r4oDzFNrm/MZ31fmet9nYmmi1rmudfNHV2biPlH+Zd3zA8jO+r8xvJdX5nqrs/EcsiScsubiuF1xOrs7Ed0o6cfWX1zQHk531fmM76vzPSxNllGcYOrlXPTV1xLPs+fd3d/fYHk531fmM76vzPTw9inJtJL1W09VxXEkuz8TovNd/wAgPKzvq/MZ31fmev8AZmJ/x7/WWnH5MhHYMRwU6WVq7ckuvyA8yM5WtXx6nN4+r8z1vs7E5qK4/vLlfyK8DZJYibilp1aQHnbx9X5nN5L2n5nsPsvFWrUa15p/ocXZeL7Mf5o/MDyd5L2n5nN5L2n5nqPYZqcYercrrXTTqxi7BOEXKSjS46rS6+YHl55dX5jO+r8z1cHs7ExIqUVGndXJL9fed+zcS1Go23S9ZcQPJzvq/MZ31fmemtim5uKytqnxVancTYJxi5NRpK7TT6fNAeXnfV+Z3eS6vzPVfZ2Ikm1FXVesudfM5PYJxi5NRpK+Kvly94Hl7yXV+ZzO+r8z08LYZzipRSpulbSt6fMYmwzjVpes0lqnq1oB5md9X5jO+r8z1/s3E6R/mXd8yGHsGJJNpLRtO2lquIHlNnD2F2diNXUa65kVR2aTm4KrTp6pLjXPvYHmA9p9l4qVtJKr4pkfs3Eukk+POuF9fADxwerDYpyckkvVdPVHcXYZwVtKrrRp82v6AeSD2MTs7EjxUf5l3/JkcbYZwTcsunRp/oB5IPXwtgnOKcaprnp1/wDyd+zsS6qObpmXf8mB45fCTWHJrRq2vI9KPZmI+Ci64+sutcTDjr1Jfwv9AMP2jjfmP4D7RxvzH8DKANX2ljfmP4EJ7biy4zbKABaton7R30vE9p/ApAFq2ia4SL49q46qsRqvAxgYPQxu29pxIqM8aTS4Wl8jM9txH++/gUAmD2joBRZg4TnJRXFujssGm1zTr3ktlw8zpcW0kWPCoDsuzMRfu/FHJ7DOMW2mlz1X1zLIbNKS0Oz2SaTbTpAY8ne/rkaY9mSavTzIbstwsCUvwvh3gQxOzZRi5OqXeZ3DxNs9jmo2+Hj4FG7QFG7RJJ1WZ10vQt3aG7QFG7Q3aL92N2BzC2eUotJvLa0vTyLPsufd5ncPZ5SVx6rmWehT63XfwAy4+yPDdS+BWoVqmzVjbO4upcSvdoCqUbq23XC2R3SL92hu0BRu0aJ7JJt5pW+9tvg3/RnN2i97HLXu7+4Cp9my7n7/AOxm3aN/oM/pmfdgUbtEo4WZpX3It3Z2GDbSXFgdw9ik7inxdNXxo5idnSim21p0ZdDYpu64+PfQnsc1Ft8Fx18AMO7Q3aL92hu0BU46VbpcFegw8P1llbTvR8C3do7HCtpLmBL7OnK23etW3b/Q5Ps6UVmdV4/2L47FiPh1q7+vpCexTSd8lrqBha0q3S5XocjGrptXxrmXbtDdoCjdo0x2Oc1HW1Tq3wqiO7RdDZJNKuadagQfZc1q68/7GeeDTau6dG30Kd1/Xx+RTLBptPigM7gjm7Rfu0N2gKpRurbdcO4lg4TzLK2m+d0T3ZKGDbSXECMdib4tLxfj3dzO4uwSjFydUu8uWxS+n4/IYuxyirk/iBh3aG7Ro3aG7QGfdor2tvLK228r4u+Rr3aMu2qoy/hf6AeIAAAAAAAAAAAAA9pHTiOgW4Mqss3hDZ55ZKVXTTOylbb
6uwJrGfL9S7EhipNyTy9btFeBu6e8zXpWUsxsSDi0p4jfST0Ao3ncdjjNcG14MrLcDaHC6Sd1x7gOb5/TObzuLcTa3KLjS18frkZwJ7wbwgAJ7zuG87iAAsWM1w0943rO4WO4qkk9U9e4ue3yfGMfiBQ8Vvjqc3hLGxnNptJV08SoCe8G8IACe87ju+f0ys1LbpL92Px6V1Aq38ur8yO8ND26XSPx+ZkAnvDqxCs7GVNPo7Anvn9Mb58OXiT9Klrw1d/p8juJtkpRytRqq0QFW87hvO4gAJ7zuG8IHU6YE9++vxLMXa5SUU/3Y5VWmnf1OvbZdEtb+vIi9rllcajTVcAK953DeEABPedx1YzX+Ss0Ye1yilSWia58wK98/pnHily26Sd0vq/mUYk80m3zdgd3ncN4QAE953BYrIEsOeWSa5AS376vzDxm+P6lvpslGkkuOqu9f8jG2yU4uLSSfiBTvO4bzuIACe87jLtjuMv4X+heZ9q/DL+F/oB4wAAAAAAAAAAAAD2jpw6BLDrMs3C9fA5OrdXV6X05FmAouSzaRtW1q65nZxVuuF6eAFcKtXdXr4cy7EjhZW4zk3yTRZhQwmvWbT7lf9DuJhYKi8s23yTjVgYi3Z8l/eN13CkXYccOvWu+7/AEZ7rK8rldaX10/uZjbiQwaeVtvl9V4megKgW0KAqBbSFIBg5K9du7XlzL62f2pfH5EcKOG08zp2vLnyLFDB11f17gM+07u/u22q59Sk1Y8cNNbttrvKqAqFltHaApNP3PWX0v8FdGlQwdbb7vqgK5rAp05Xy+qMpucMGuMvr3GagKiUKzLNwvUnRKCVrNw5gWQWBzcuPfwv5Fc91TyuV0qv3f3NEIYH70pe5d/gRx4YKXqOTff4+HiBiBdQoCmzsatXwvXwLaOxStXw5gWQWBrcpcdOlaf3I4iwdcrlw0vqXRw8HW5Pjpx4eRzEw8GnllK6005+QGGxZdQoCk0YSwqWZyunfjen9SNF+HDCpZm1o7pc+XICD3F8Z118/hwM+JWZ5eF6G5Yez+1Ly/sZppW64cgKLFl1CgKSWHWZZuHMsolhqNrNouYHVuaesm/LrX9Dk91keVyz3onwqy5Qwa/FK/Dy5EZwwsryt5r0Xd5AYxZdQoCmyGP+CX8L/Q00Z9q/DL+F/oB4wAAAAAAAAAAAAD2wABLDrMk3SvV9wm6bp2r0fUngYblJRStt0jsoU2mtVoBHCyu80mtNGW4mFBRbji5muVVzK8q6GjF2fDSuOIn3VqBjtk8FRb9eTR2l0LobJKUVJR0feutAQxIYajccRt1wrw/uUWap7JJRzNKvFFNICuxb6lmVCl0ArtiyzKugpdAO4Si080mna8uZY4YSaW8k11/tRGGA5K0udFnoU/ZXmgM+NlTqEm1195CzRi4Dg6kqZXS6AV2L7yykKXQCvMzTHDwr1xWlpyfvKqXQvjsc22lFWuVoCO7wtfvH8/gZrZrexzX7q80UUugFdksOnJJulerJZV0OxhbSS1YF0MLCrXFad/C/Aji4eGotxxG5dPpHY7JJ8Iri1xXIYmySirlFJeK60BltjMWZV0GVdAK77zsXqrdK9SeVdDqhbSS1YFscPC1vFfH4af3OYmHhJPLitutFT1ZJbFOryquHFHJ7JKKtxVeKAy5n1GZ9SzKugyroBXm7zRhQw2lmxGtHfc+RXlRbDZJSqo8Va1XIDscPB1vFkvcZ8SlJqLtXozU9hmnTir8V9ciiUKdNAVX3i2WZV0FLoBXbJYdOSUnS5sllXQlDDzOktQJxw8LnitceXfp8CGNHDS9Wbk/wBCa2WT5LzXf8mJ7JKKzOKrxXWgM1vqLfUsyroMq6AV2yvH/BL+F/oaMq6FG1fgl/C/0A8YAAAAAAAAAAAAB7YAAnhyrW6LIY2WWZPXXj3leFKpKSV00zjTbt8wNa26l+GHkcxNszJpxgr6Ip2fEcHeVS7n4p/0J4mNmTW6gu9LUCvMiUc
drRSaXiynKyzBm4O8qfiBKWO3xk372RzIsxtoc1WSK0rTxT/oZ8r6AWZkMyK8rGV9AJ50MyIZX0GV9ALY41cJNeB30h+2/NnMLFcFWVPVPXuLvTH+XD66AUyxb4tvx1OZkd2jEeI08qVKtCvK+gEs6O5kV5X0GVgWZkS3745nfiynKzVHaqbe7jqBW9ob4yfmyGdGl7a/y4mPKwJ50dU11K8rJYdxknV0+AFm+ftPzZyWO3xk34tsnHaWrqC+WtnZ7Q5Rcci1rXnpXyApzoZ0QysZX0AnnR1TXUrys7FNNOuDAuW0y9uXmyLx29HJv3sujtbV/dx42QxNolJNZUrVae7694FWdHc6K8r6DK+gFmZEljtaKT82U5WaMLaMqXqJ0mte8CL2hvjN+bIud6tl8tsv/bivDwM2JcpN1Vu6A7mQzIhlYysCeZHViVqnTK8r6EsO4yTq6AnvnwzPzDxnVZnXS2WR2lr9xc/6/MY20uUXHJFX048QKc6GZEMrGVgTzop2l+pL+F/oTysr2hepL+F/oB44AAAAAAAAAAAAD2wcOgW7PiKMlJpOndPmdlNNt6K3dEMKVSTccyTunwfccnq26q3dLgu4DYtshX7OL7yOJtEGmlCK70Rw8aCSTwU6S1vi+85iY0XFpYKi3wafACvMupbg46hdxjK64man0LsPFqNPDT72u8C7E2xSi1kir5mfMizE2hOLSwoq+a4rh8jPlfQCzMhmRXlYpgWZkMyK8rFPoBpwsdRTTinqnqWvb1f7OFVVcjLhypNOCeqdvuLVtEaS3MdPiBzHx1N2ko9yK8yGNPM7UVFdEiun0AszIZl1K6fQZWBZmXU0elx9iPL9P7mPKzVHaIq/uYv68ALHtsfy4GXMi/0qP5EPr3GTKwLMy6ksPESaejopyvoSho02rp8ANUdrir9SL1v43RzE2qMk1kgu9ceXyOQ2mKVbmL1b1rrw4EMXHzJpQjG+iXdzruAhmQzIrysZWBZmR2M0mnxoqysZX0A2w22K/cjxs5La4tNZI6/DQreOvyo8b5dKrh7yMsZNNbuK70tfrQCOZDMivKxlYFmZF+HtUYpepF0nx7zJlZow8ZRSvCTpNa1rr4AXPbY/lw+r+ZnniJtvhb4EntCv9lHhw8+7v+BRPWTajVvguC7gJ5kMy6leV9BlYFmZEsPESaej7inKyWHpJNq65Aa47ZFKt3F+JHF2pSVKMV5EVtCUa3Ub11aT/ocxcdSi0sKMX1VXx8PqwK8yGZFeV9BlfQCzMijan6kv4X+hPK+hXjr1Jfwv9APHAAAAAAAAAAAAAe0dAAu2eajK2lJdH4E8bEUpNpKKfJcinCklJNrMugxWnJuKyp8F0A2ekYXPDXnRHExsNppYdPk8z0M2BKMZXKOZdCzExMNxqOG0+TsCNo0YO0wjGnhxk+tmGi+GJFJepb5t+OgF+JtOG4tLDjF9UZrLJ40HClhJS9ozUBbYsqoAW2LKqAGrCxYxWsU9UXy2zDa/YxXv/sY8GcUvWhm1Rf6RhfkgQxsWMncYqK6X3ldndpnGUrhDIq4FNAW2LKqFAW2aVtMF/tx+HTwMNGlYuHr918eGj+vcBb6VC/2ca6Uu/nXh5Gay54+F+V+hkoC2yUJpNN6roUUSjo02rV8ANkNpgrvDT+vA5PaIOLSw0m+emnwKd7HX7tavT4f38yU8aDg0sOpPg+gFdiyqhQFtnYySafHXgU0djo02rV8ANkdpgn+zi1d613d3d8Q9ohla3cbfB9CEcbDV/dXrfHw0OTxsNxaWFTriBCzllVCgLbNGHtEElcE6T6amKjRh4sElmw7aTvv14gaHteHT+6j8PkZpyTbeit8OhZ6Rh0/ufB3wM03bbqrfBcu4CdoWiqhQFtonhzSkm6a6GeiWG0pJtWugG2O04aX7NPy6+BDEx4NSSgk3VPoV76CWmGvfr17u9eR3FxoOLUcPK70fTuArsWVUKAtsp2r8Ev4X+hKiWVbrGtc
IP9GB4IAAAAAAAAAAAAD2wcOgXbPFN03Xf7ieNBRk0nmS5lOEm3STb6IlNuLakmmuQFmHhuT9VWacfDiotrDxIvvVJGbDeJFZoqST5oliY+LlqWfLztP4sCo0+jw/MXw+Zj3hPDTl+GLfgBfLBik2sRNrl14cPP4FBKWFNK3Fpd5VvAJghnGcCYIZxnA0YWEpLWSWq4lz2WFP76Pl/cyRTktIt60TeDP2GB3Fgo1Urtf1Kzk7i6kmn3kd4BMEN4N4BM1Q2WL44kVounzMW8LVhT9hgXzwIJv7xOui7vEzE9zP2X8CneATJQjbSbq+ZVvDsZW6S1YGuGzxfHES1rl148RibPBRbWKm+iKI4U3wgzk4SircWkBwEM4zgTOxVtK67yvOdUr0S1A2x2WLX7WPHhp8yt4EcraxF4e7xKo4U3ooPjXvEsKaVuDSAiCG8GcCZow8CLSbxErT6afEyZyyMJOqi3eqA0+iw/Nj8P6sz4iqTSdpPiN1P2JFcpU6apoCQIbwbwCZLDinJJul1Kt4djK3SVsDVuI1riK/8nMbBhFWp5u5V8ypYU/Zf1/gTw5xVuLS6gRBDeDOBMo2r8Mv4X+hZnKtpdwl/C/0A8YAAAAAAAAAAAAB7R0WLA7GTTtOmJSbdt2zl9wsCyGPOPCTR2W0zccrk3HoVWLA4ThNxdxdPuIoWBOeNKSqUm0QFiwAFi+4DgO2LAnDFlHg6JekT9p9CqxYHZTbq23WmpwWL7gAFiwCi3wNOHLFkri29a463xM1nVNrg68GwLMKWJKXqtt8frzJPYsT2enNcyhSrhoS30vafmwLPRMT2fihDZ5/iS4Pja0d1+pXvZdX5s4sRrg2vewL8SWLhunJpvXiVTxpSVOTaIuV8dfM5YACxYAJ0LFgW+kT9piW0zaacm0+NlViwBw7YsAWRx5pJKTVFdiwLfScT25eZXJtu3q2csWAOHbFgcJQk07XE5YsCz0iftP6/wAiePOSpybXQrsWAAFgWYeDKSk4q1FW+4px4PdSlWlNX30TUuPfx4leO/Ul4P8AQJN3144ACgAAAAAAAAAA+he0XiTm4p5r05K2dltEWq3UV4FO07Fi4ONusZ5HV3yarRp/ApxllrLiZrvlVUxu+pmNq2pJprChpyrR8OPl8Tvpi/Kh7keZnfUZ31A9DDx4x/2oy/ibKb1Mud9Tu8fUDVdu6S8CJn3j6jO+oGgGfO+ozvqBoBn3j6jePqBoBn3j6lmzxliTjBOnKSSb4avmBYDR2h2ZibPFSliQknLL6rbfPu7ijZdmlipvPGKTSTlerfLRMtln1OPKcpscBXGE3PJ+9eWu+6Z6PbHYmLskITniQmpyyrI3o6vW0iNMQL9l7K2jGwZY0F6kdL8Cns/ZcTaMRYeG1mab1dLQXybTHAWdodn4+zOKxUlmuqafCr/Us2fsvExMPPGcc2VyUL1aXMnGzlNnxOVnD+zOCzszYMXapuOE1cY5m5SUUlwv4nO0tgxdlxFDFatxUllaaabaT+DKuIAslsGMsHffuaeOvDuK9k2fExpNR5atvgl3i+fTjO3kAR2rCnhTcJvVdHaL8fYMTDw885JNVcXebX3d5eMvL4nL8fKqB3ZNlxMW3F0lVt8FfAhi4M44u6TzSzKKrm3VV5mdTtNz9pA29qdh4+y4axJzhKLaTyt6N+K1WnEz7HsOJjfg8NXWvQqz34qBxYM3iLCX482Wn1NHaXZmLsyi5yTUua6lk2ab+lAJ7NsWLiQc4tJa1bpulenUyub7yC8GfO+pzO+oXGkGbO+o3j6k1ro0gzKcuTZKp5stPNdVzvpQ06LwVbRh4mFLLiRlCXGmq06nMRTi6b1GnVcTw8TK7pPxKp7NjRlGLi7lwVo4sDFbkqfqVmtpVbpXY1OrT6RreSH8vH6ok9q0/BDyPPlKSbTtNaNc7XI0bpV+3jxS9zq3x5W/5TXZOrrdts5tWJmw3olUGtFx0595OGxynNQwcSOI2rdOqWn
G+HEy7Zhzw3KE9HXW001o0yL1eYAAgAAAAAAAAAAPUxe1ZYmIsTFm8SSVevbVdDj7Qg01kgrXKL8zzAB6a2/D/Lh5MoePG+PwZjAGzfx6/Bjfx6/BmMAbN/Hr8GN/Hr8GYwBs38evwY38evwZjAGzfx6/Bjfx6/BmMAbN/Hr8GTwdsUJxmmri01adWjAAPY2ztp40cs8qV36sWtfpleB2m8OLjCSim7bypu601a5f1PLBbbfrPHjOMyN+HtijJSUtVrqmW7V2rPGpYmJKSWqTuk+tHlgjT18HtmcMN4cMRqD4pWV7J2k8GefDlUqaunwfE8wC+zKPX23tZ42VTm5KN5W071q78itdoLJWZp1Wif4elnmATz4lkv17PZnbUtlnKeFlblHK1OLaq0+TXQ52p21La8RYmLlTUVFKEWlSbfO+rPHAV6su1pPCWE5+qlXDWulleBtqhK1JpPjpfwPOAvv04/j7G/F2tTk5OVt9xPE27MtZyk3xtP3anmgstnwvt2vSwNuULWdpPikuPTQg9sWbPmqVp2r0a4MwAzidZuvd2/t+ePhxw5yjlTzNRhluWur68e7jwOdn9sLBupyj0qClr11aPDAyLffr0vT6msRSeZO7rn1L+0O2pbRlU5aR5KLWvxPGBZcmRM/b1tn7VeHCUYzaT5Vfc9eRl38ev6mMBWzfx6/A5vo9fgZAFlxr30eo30evwMgJi9q2LHj1/U7DalFqUZNNVTVpquFGIDDtXobRt8sWWbFxJTlVXJuTrpb8WQxNrzu5SbdVrZiAw7N3prtSzvNHRPW1XQlHb2m2pu5VfHWuFnngYdm/B21Qblo20/xJvjz8TR9r63lw+FfhfWzyAMTXq/aslNThJQklXqp6rvu7M+0bXvHKUpXJrp3aIxApoAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/2Q==",
+ "text/html": [
+ "\n",
+ " \n",
+ " "
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "execution_count": 1,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('eHI4wMjSGuU', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "108151d8-743e-49fb-ac0c-c81680a998bc",
+ "metadata": {},
+ "source": [
+ "# Learning Objectives \n",
+ "- Import and Prepare Data \n",
+ " - Import necessary Python libraries. \n",
+ " - Load and separate the 2021 training dataset into features and labels. \n",
+ "- Create and Train a Decision Tree Model\n",
+ " - Initialize and train a Decision Tree model using the 2021 training data. \n",
+ "- Visualize and Interpret the Decision Tree\n",
+ " - Generate and interpret visual representations of the trained Decision Tree model. \n",
+ "- Make Predictions and Evaluate Accuracy\n",
+ " - Use the trained model to make predictions on 2021 testing data.\n",
+ " - Compare predicted values with actual values to assess model accuracy. \n",
+ "- Calculate and Compare RMSE \n",
+ " - Calculate the Root Mean Square Error (RMSE) for the 2021 model. \n",
+ " - Compare the RMSE of the 2021 model with the 2020 model to evaluate improvements. "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9d8356f9-d9e2-44e5-8463-f140b038611a",
+ "metadata": {},
+ "source": [
+ "# Prerequisites \n",
+ "***Modules*** \n",
+ "\n",
+ "- Learning Module ***Introduction to Machine Learning: Decision Trees***\n",
+ "\n",
+ "***Data Sources***\n",
+ "\n",
+ "- [COVID cases data (California Health and Human Services Agency)](https://data.chhs.ca.gov/dataset/covid-19-time-series-metrics-by-county-and-state/resource/046cdd2b-31e5-4d34-9ed3-b48cdbc4be7a)\n",
+ "- [COVID vaccination data (Los Angeles Times)](https://github.com/datadesk/california-coronavirus-data)\n",
+ "- [Unemployment data (California Employment Development Dept.)](https://data.edd.ca.gov/Labor-Force-and-Unemployment-Rates/Local-Area-Unemployment-StatisticsdecisionLAUS-/e6gw-gvii)\n",
+ "- [Election data (Harvard University)](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ)\n",
+ "\n",
+ "***Libraries/Packages***\n",
+ "\n",
+ "- Pandas\n",
+ "- NumPy\n",
+ "- Matplotlib\n",
+ "- Seaborn\n",
+ "- Scikit-learn"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "735d2e88-9015-4711-ab57-87fa6b2ea661",
+ "metadata": {},
+ "source": [
+ "# Get Started\n",
+ "Copy-paste the required lines of code from ***Introduction to Machine Learning: Decision Trees*** for each section below. "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b1e1a46d-6e6c-4386-a0ca-0fcb19009e26",
+ "metadata": {},
+ "source": [
+ "## **1) Repeat Step 1 (Importing Necessary Packages)**"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ce366147-fc2c-416d-8f28-325bacbb28e3",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "54218111-757c-4b75-80aa-2754ffd916df",
+ "metadata": {},
+ "source": [
+ "## **2) Repeat Step 2A (Loading 2021 Training Data)**\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021 every time you see it, including in the links!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "5812834c-28ec-492e-bebf-0e5cc34d4270",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d98ca761-0f7c-4f5e-acf4-05331b993b22",
+ "metadata": {},
+ "source": [
+ "## **3) Repeat Step 3A (Separate Training Data into LABEL and FEATURES)**\n",
+ "SKIP:\n",
+ "- Steps 3B and 3C, since they only show what the labels look like once they are separated from the main training data.\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "966fee45-4f0d-4f98-8819-a9543a6ab0bc",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "26d25871-caa2-41b5-b800-6224932ec759",
+ "metadata": {},
+ "source": [
+ "## **4) Repeat steps 4A and 4B (Create your Decision Tree and Train it!)**\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "14883b4f-d547-4de2-b15e-1602f1c279ae",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d73e8261-7466-4845-b847-1088e2610828",
+ "metadata": {},
+ "source": [
+ "## **5) Repeat step 5 (Visualize your 2021 Decision Tree)**\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "b4eb80ad-bc43-4405-9dd4-c671933d6597",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f7c646d7-46c5-475c-a446-5decacb0f6df",
+ "metadata": {},
+ "source": [
+ "## **6) Repeat step 6A, 6B, 6C (Load Testing Data and make your Predictions)**\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "8dcb9a83-f172-4194-b2aa-0edcc534a191",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "fbefce65-467c-46d0-abde-d02662c1d071",
+ "metadata": {},
+ "source": [
+ "## **7) Repeat step 7A, 7B (Check the Accuracy of the Predictions of the new Model Created)**\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c3c6d248-22da-4ddf-b270-bdb5092863fd",
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "id": "5b90d2e6-520b-4dd7-810d-85ae7586b009",
+ "metadata": {},
+ "source": [
+ "## **8) Extra: (Calculate RMSE and create Aggregate error histograms)** \n",
+ "\n",
+ "Compare the performance of the model you just created in this practice session with that of the old 2020 model by calculating the RMSE for both and plotting an aggregate errors histogram that shows both models."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2b5c7f98-e48e-4a7e-a1dc-954a528bbec0",
+ "metadata": {
+ "jp-MarkdownHeadingCollapsed": true,
+ "tags": []
+ },
+ "source": [
+ "# Conclusion \n",
+ "In this practice, you successfully: \n",
+ "\n",
+ "1. **Imported and Prepared Data:** Loaded the 2021 training dataset and separated it into features and labels. \n",
+ "2. **Created and Trained a Decision Tree Model:** Initialized and trained a decision tree model using the 2021 data.\n",
+ "3. **Visualized and Interpreted the Decision Tree:** Generated and interpreted visual representations of the trained model. \n",
+ "4. **Made Predictions and Evaluated Accuracy:** Predicted outcomes using the 2021 testing data and assessed model accuracy. \n",
+ "5. **Calculated and Compared RMSE:** Calculated the RMSE for the 2021 model and compared it with the 2020 model.\n",
+ "\n",
+ "By completing this module, you have reinforced your understanding of decision trees and gained practical experience in adapting machine learning models to new data. This practice not only enhances your technical skills but also prepares you for real-world applications where models need to be continuously updated and evaluated. Keep exploring and refining your models to achieve even better predictions and insights! "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "12602949-b5ef-4ba2-89ff-a5617507e366",
+ "metadata": {},
+ "source": [
+ "# Clean up\n",
+ "\n",
+ "To keep your workspace organized, remember to: \n",
+ "\n",
+ "1. Save your work.\n",
+ "2. Close any notebooks and active sessions to avoid extra charges."
+ ]
+ }
+ ],
+ "metadata": {
+ "environment": {
+ "kernel": "python3",
+ "name": "common-cpu.m108",
+ "type": "gcloud",
+ "uri": "gcr.io/deeplearning-platform-release/base-cpu:m108"
+ },
+ "kernelspec": {
+ "display_name": "conda_python3",
+ "language": "python",
+ "name": "conda_python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.14"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/Google Cloud/4- Practice - Answer Key.ipynb b/Google Cloud/4- Practice - Answer Key.ipynb
new file mode 100644
index 0000000..716b722
--- /dev/null
+++ b/Google Cloud/4- Practice - Answer Key.ipynb
@@ -0,0 +1,498 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "328b4b3b-e1b3-4490-8f76-0241ed6c3b5c",
+ "metadata": {},
+ "source": [
+ "# **Practice Answer Key: Let's make a NEW Decision Tree for Summer 2021 and improve our predictions!**\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d096f58c-168f-45bf-86f6-c8430077e862",
+ "metadata": {},
+ "source": [
+ "# Overview \n",
+ "This module contains the answer key to ***Practice: Let's make a NEW Decision Tree for Summer 2021 and improve our predictions!***.\n",
+ "\n",
+ "In order to expedite the making of the NEW Decision Tree, we can skip a few steps, and only copy-paste the required lines of code.\n",
+ "\n",
+ "* You DON'T need to copy-paste the comments from the original code (the green text that is preceded by \"#\"). \n",
+ "* Instead, follow the instructions written as comments in the exercise below to create a NEW Decision Tree for the Summer 2021 data.\n",
+ "\n",
+ "### **Walkthrough Solution:**\n",
+ "If you feel stuck on this exercise, feel free to follow the video walkthrough below by **Florentine**.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "72283446-4a79-480f-8ec7-d72dcf6f7a83",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDBoYFhsaGBoeHRsfIiomIyIhIiolJigoMCkyMC0oLS01PVBCNThLOS0tRWFFS1NWW1xbMkJlbWRYbFBZW1cBERISGRYYLhsbLVc/NT1XV1dXV1dXV1dXV2NXV1dXV1dXV1dXV1dXV1dXV1dXV1dXV11XXVdXV1deXVddV1dXV//AABEIAWgB4AMBIgACEQEDEQH/xAAbAAEAAwEBAQEAAAAAAAAAAAAAAgMEAQUGB//EAEAQAAICAAMFBAcFBwQCAwEAAAABAhEDEiEEEzFBUQVhcZEUIlKBodHwFSMyU7EzQnKSweHxBkNiohayY3PSJP/EABcBAQEBAQAAAAAAAAAAAAAAAAABAgP/xAAfEQEBAQADAQACAwAAAAAAAAAAARECEiExIkEDMmH/2gAMAwEAAhEDEQA/APz8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHtYfZMZZqy+rxuT+BN9iKr9Tg3+N8iMYt8FZLCwnJ1FfFLnXFgRxOx1Gry6utJPz8Dq7FtJqMdVf4tfAji+paejTpoi27rI76VqEWy7Eq7jGkm/x9PeRl2OlVpK+HrHHCaV7uVacuvA5Uvy5eTCox7Ni1N5V6it+t+nUv7I7A9LxJwhKMMqu5X1Kmp6fdy1dLR8UIRxM3qRxIt+zcScpbPCIfZDvESyvJNxfrU20+KT5HfsWfJRfhNfXMjuZSf4Jt+DbvyI7r/jLQsCPZjbrS7a/FwaaWr95P7Hl7K/mRBYd/usLD/wCLCJR7JbWijxaazrkc+ynTdR0bT9bXQ48L/iwsPnTCq47EnCU6VRaT1116dTX2b2J6S5qMoxyK3mdX4FO6fsv4HYxktUpL30WZvqX/AByHZbcM6y1dfi14tcPd8UTfYsk69T3TRDdf8X8BuX7LIK8fYd20pLirVO9CrcR6GlYT9mRxw7mBn3Eehds3Z+9bUa0TerrgT3Tq8rORhfCMgqv0JZM9LLmy8dbq+BLA7P3jko16sXJ3JR0XiyW7/wCMju7fsyAzbiPQbiPQ0ShXFNEdO8Ip3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvAp3Eeg3Eehdp3jTvCqdxHoNxHoXad407wKdxHoNxHoXad407wincR6DcR6GhQ0unS5nVh2nJRlS4sDP6PGr/rqFgR6fE14ez5vw29L9yLcLs+c8RYUYt4jdKKriMN9x572eK/yWw2KLhKd1XLXXhz9/wADT6E87g01JXalpVau78GSxOzpRxN3JVKk+KejVp2u5gtx56wI9DRgbDD0jCw5VJSxIJ03TTkk6fvL9o2F4cssuJZsWz1tOCnx3mG/+yaCTlLNj6P/AMY2P8p/zz+Y/wDGNj/Kf88/me1QoK8X/wAY2P8AKf8APP5nH/pjY/yn/PP5nt0KCvDf+mdj/Kf88/mRf+mtj/K/7z+Z7ckeVt/auR5cOOdri+QXGd/6b2T8r/vL5kH/AKd2T8p/zy+Zs2HtFYukllkacQmpZjyf/H9k/K/7y+Z1f6f2Pnhf95fM0vFZxMD5pacGdhNxdp17j0NtwcOMZKMFFJLLNSbblzjV+PkZNk2jdybyqVqqfCuaM8OfebF58bxZpwT4slKU3LO8STl
VX3G3B2+McNQeEm1+9pfHjw4rTyLPtKNVuk9Kt8b9rhVm2WDfYlJb2VLgFjYi4YsjdLtGDb+6StNXfrJutVy0ohPbVJweWKUHdVblrrb4fAKywxsRO94746rnws7HaMVXWLLx5+7zfmzRhbThxgoOCxK/e4PV38ifpuDX7CN1T18Pl4gYli4i4YsiDzPXOze9rwadYC104/H66EfS8PJW6WfKkpXw9WuHmwMDg3+8ztP22ejHbcOvWwszXBOqWnBacNDj2zC/IT0p6/2/wBgSftsisP8A5M9B7XhZ29yqpJK6rjb99kpbZgv/AGF42EedkftsNP22bcPacJRSeDbpK759eHv+B3aNrw5p5cLLbt17/r3hWHK/afkMj9tnoT2vB1rBV6Vfhr46lO2Y8MSScMPJpw+vADM03xmzmV+0y7ExE0lSVfXz8ysIjlftsKDXCbJADnre2xT9tnQBCWHfGTZHcLq/ItAFW4XV+Q3C6vyLQBVuF1+A3C6/AtAFW4XX4DcLq/ItAVVuF1fkNwur8i0AVbhdX5DcLr8C0BFW4XX4DcLr8C0AVbhdfgNwuvwLQBVuF1+A3C6/A9TD2TBez53ifeZZPLnjxTdLLx4alMdli0msWFtK09KdXQGTK8uXM66UFBqLjmdPiq+uiNktjSdPFha5amaSpvnQDDbg04umuDJ4ONKE1OEmpq6kuOvMux9kUIRlni26tKUXVrub7zr2OOv30PMaYpljzeI8Rybm2220nd8bvxOz2mcp55SuVJXS4JUlS7tDmFhKWJGDmoxclFz5JXWbw5l/amywwcXJCTksqeri3F+y8rav5gsZ8bGeI7k9S3YXe0YLb/3If+yNsNjw91ema67+C1v3mLYdNpwv/th/7Irj/F/Jx5bx4z4+6oULOh2co4yRxkV5nbO17rCdOm9EeAsRR4mn/U2Pei/dMOkldcjNdeL1cFp0v3qtMuwu08Obyt5ZdHz8GU7IllhNxpxT17jxNs1jDE9pP9WSX05x7z4nUzHsGJeDDwo0ZjTi+dNGFtTjHLkjJXfrKzOmX4e04azXCLTdpXwXS+JVWen/APxYfH2e6ivB2lwzerGWartWhtO0Yc6yQjCujuyOHjxTi8sXSaa43x11+tALVtzX+3h/ylePj569SMa9lUTxtrwpJqOFGLeiea6M6d8NQNj7RbdvDhxvhr5lONtOdVkiudpa8SjMuqGZdV5gaI7VSSyQdKtVqWenafssO+uX+hjzLqhmXVAXYO0OGb1U1Limiz01/lw/lMuZdRa6gWvHedzSS7louFE57XcXHdwV9EZ8y6oZl1QF+FtTikssXXC1rxsjLGuedRS7lwKsy6oZl1QFzx7d5ESe0r2EZ8y6rzGddUBf6Qq/AvgceMqSyIpzLqvMZl1XmBo9IX5cR6Svy0Z8y6oZl1QEpyt3VHDmZdUMy6oDoOZl1QzLqB0HM66oZl1QHQRc11QzrqgJA5nXVDOuqAWdOZl1XmMy6oDoOZl1QzLqgOg5mXVDMuqAWdOZl1R2M4rjT94HVFvhFvwQyPo/I17LtMIQS30oO22kk1fC/IlHHwZSxN5jSpuOqSealVvTyAxZH7L8iNlqxIdV8P8A8le8jlrS743rXToBwWTwJYeeO8twv1lFq2uissntcZSk3CCTqorRRS4Jc/eBQC1Y8PYj/MRnixfBJe8AsaVVbroW7A/v8H/7If8AsirBxYxdyUZdzbX6DBl68WnqpKvMJJJ8feRxLJZjxZShq47TJJXo9W6b/sWJxuvS5fXvKj18x5va23ShBqDrv5maWJrXpEmqbtWtU+HvV/Aw7dg4dxUtqbi+LXJ8upL8ajy8STlGTk7t8wk4JRnajyly8C70LBcH/wD1puvVVVb79eHz7qM0ZSw5OLedV1zL4mZPGu2Xx6UsdQwko3K078Cc9k3mzUuMdV79TzXtKk9E11s9zYHSrk0vNGOXla3fry+zMSsOujZreJ3lfa2wywJ+qrhiXJV
8UY8PF5VWnWzWsYorRnqdi9m7Fi7JOePNLETlmbxMrw0l6jjH962eWWYeFFxbbprw/wAs2jDgSivxxb4cH5osxMTDa9WDi+t3XD+/mXUAKt5hWrw3yvV1yv8Ar5nYzwqejUsujt1dLl5mzB2HEms0UmvFL64DF2GcIuTqlxpp19WBjji4WmbDba6Nq9PnZHeYWn3b4a+s+P1Zvwez8ScVKKVPhqlw/wAEvszF6JdbfDp+oHkSat1ouRw9XE2GcYuTSpd6/QYew4koZ0ll6tpc6A8pMHsLs7Eb5V1vTWvmiuexzTiqXrcOV6J1r4oDzFNrm/MZ31fmet9nYmmi1rmudfNHV2biPlH+Zd3zA8jO+r8xvJdX5nqrs/EcsiScsubiuF1xOrs7Ed0o6cfWX1zQHk531fmM76vzPSxNllGcYOrlXPTV1xLPs+fd3d/fYHk531fmM76vzPTw9inJtJL1W09VxXEkuz8TovNd/wAgPKzvq/MZ31fmev8AZmJ/x7/WWnH5MhHYMRwU6WVq7ckuvyA8yM5WtXx6nN4+r8z1vs7E5qK4/vLlfyK8DZJYibilp1aQHnbx9X5nN5L2n5nsPsvFWrUa15p/ocXZeL7Mf5o/MDyd5L2n5nN5L2n5nqPYZqcYercrrXTTqxi7BOEXKSjS46rS6+YHl55dX5jO+r8z1cHs7ExIqUVGndXJL9fed+zcS1Go23S9ZcQPJzvq/MZ31fmemtim5uKytqnxVancTYJxi5NRpK7TT6fNAeXnfV+Z3eS6vzPVfZ2Ikm1FXVesudfM5PYJxi5NRpK+Kvly94Hl7yXV+ZzO+r8z08LYZzipRSpulbSt6fMYmwzjVpes0lqnq1oB5md9X5jO+r8z1/s3E6R/mXd8yGHsGJJNpLRtO2lquIHlNnD2F2diNXUa65kVR2aTm4KrTp6pLjXPvYHmA9p9l4qVtJKr4pkfs3Eukk+POuF9fADxwerDYpyckkvVdPVHcXYZwVtKrrRp82v6AeSD2MTs7EjxUf5l3/JkcbYZwTcsunRp/oB5IPXwtgnOKcaprnp1/wDyd+zsS6qObpmXf8mB45fCTWHJrRq2vI9KPZmI+Ci64+sutcTDjr1Jfwv9AMP2jjfmP4D7RxvzH8DKANX2ljfmP4EJ7biy4zbKABaton7R30vE9p/ApAFq2ia4SL49q46qsRqvAxgYPQxu29pxIqM8aTS4Wl8jM9txH++/gUAmD2joBRZg4TnJRXFujssGm1zTr3ktlw8zpcW0kWPCoDsuzMRfu/FHJ7DOMW2mlz1X1zLIbNKS0Oz2SaTbTpAY8ne/rkaY9mSavTzIbstwsCUvwvh3gQxOzZRi5OqXeZ3DxNs9jmo2+Hj4FG7QFG7RJJ1WZ10vQt3aG7QFG7Q3aL92N2BzC2eUotJvLa0vTyLPsufd5ncPZ5SVx6rmWehT63XfwAy4+yPDdS+BWoVqmzVjbO4upcSvdoCqUbq23XC2R3SL92hu0BRu0aJ7JJt5pW+9tvg3/RnN2i97HLXu7+4Cp9my7n7/AOxm3aN/oM/pmfdgUbtEo4WZpX3It3Z2GDbSXFgdw9ik7inxdNXxo5idnSim21p0ZdDYpu64+PfQnsc1Ft8Fx18AMO7Q3aL92hu0BU46VbpcFegw8P1llbTvR8C3do7HCtpLmBL7OnK23etW3b/Q5Ps6UVmdV4/2L47FiPh1q7+vpCexTSd8lrqBha0q3S5XocjGrptXxrmXbtDdoCjdo0x2Oc1HW1Tq3wqiO7RdDZJNKuadagQfZc1q68/7GeeDTau6dG30Kd1/Xx+RTLBptPigM7gjm7Rfu0N2gKpRurbdcO4lg4TzLK2m+d0T3ZKGDbSXECMdib4tLxfj3dzO4uwSjFydUu8uWxS+n4/IYuxyirk/iBh3aG7Ro3aG7QGfdor2tvLK228r4u+Rr3aMu2qoy/hf6AeIAAAAAAAAAAAAA9pHTiOgW4Mqss3hDZ55ZKVXTTOylbb
6uwJrGfL9S7EhipNyTy9btFeBu6e8zXpWUsxsSDi0p4jfST0Ao3ncdjjNcG14MrLcDaHC6Sd1x7gOb5/TObzuLcTa3KLjS18frkZwJ7wbwgAJ7zuG87iAAsWM1w0943rO4WO4qkk9U9e4ue3yfGMfiBQ8Vvjqc3hLGxnNptJV08SoCe8G8IACe87ju+f0ys1LbpL92Px6V1Aq38ur8yO8ND26XSPx+ZkAnvDqxCs7GVNPo7Anvn9Mb58OXiT9Klrw1d/p8juJtkpRytRqq0QFW87hvO4gAJ7zuG8IHU6YE9++vxLMXa5SUU/3Y5VWmnf1OvbZdEtb+vIi9rllcajTVcAK953DeEABPedx1YzX+Ss0Ye1yilSWia58wK98/pnHily26Sd0vq/mUYk80m3zdgd3ncN4QAE953BYrIEsOeWSa5AS376vzDxm+P6lvpslGkkuOqu9f8jG2yU4uLSSfiBTvO4bzuIACe87jLtjuMv4X+heZ9q/DL+F/oB4wAAAAAAAAAAAAD2jpw6BLDrMs3C9fA5OrdXV6X05FmAouSzaRtW1q65nZxVuuF6eAFcKtXdXr4cy7EjhZW4zk3yTRZhQwmvWbT7lf9DuJhYKi8s23yTjVgYi3Z8l/eN13CkXYccOvWu+7/AEZ7rK8rldaX10/uZjbiQwaeVtvl9V4megKgW0KAqBbSFIBg5K9du7XlzL62f2pfH5EcKOG08zp2vLnyLFDB11f17gM+07u/u22q59Sk1Y8cNNbttrvKqAqFltHaApNP3PWX0v8FdGlQwdbb7vqgK5rAp05Xy+qMpucMGuMvr3GagKiUKzLNwvUnRKCVrNw5gWQWBzcuPfwv5Fc91TyuV0qv3f3NEIYH70pe5d/gRx4YKXqOTff4+HiBiBdQoCmzsatXwvXwLaOxStXw5gWQWBrcpcdOlaf3I4iwdcrlw0vqXRw8HW5Pjpx4eRzEw8GnllK6005+QGGxZdQoCk0YSwqWZyunfjen9SNF+HDCpZm1o7pc+XICD3F8Z118/hwM+JWZ5eF6G5Yez+1Ly/sZppW64cgKLFl1CgKSWHWZZuHMsolhqNrNouYHVuaesm/LrX9Dk91keVyz3onwqy5Qwa/FK/Dy5EZwwsryt5r0Xd5AYxZdQoCmyGP+CX8L/Q00Z9q/DL+F/oB4wAAAAAAAAAAAAD2wABLDrMk3SvV9wm6bp2r0fUngYblJRStt0jsoU2mtVoBHCyu80mtNGW4mFBRbji5muVVzK8q6GjF2fDSuOIn3VqBjtk8FRb9eTR2l0LobJKUVJR0feutAQxIYajccRt1wrw/uUWap7JJRzNKvFFNICuxb6lmVCl0ArtiyzKugpdAO4Si080mna8uZY4YSaW8k11/tRGGA5K0udFnoU/ZXmgM+NlTqEm1195CzRi4Dg6kqZXS6AV2L7yykKXQCvMzTHDwr1xWlpyfvKqXQvjsc22lFWuVoCO7wtfvH8/gZrZrexzX7q80UUugFdksOnJJulerJZV0OxhbSS1YF0MLCrXFad/C/Aji4eGotxxG5dPpHY7JJ8Iri1xXIYmySirlFJeK60BltjMWZV0GVdAK77zsXqrdK9SeVdDqhbSS1YFscPC1vFfH4af3OYmHhJPLitutFT1ZJbFOryquHFHJ7JKKtxVeKAy5n1GZ9SzKugyroBXm7zRhQw2lmxGtHfc+RXlRbDZJSqo8Va1XIDscPB1vFkvcZ8SlJqLtXozU9hmnTir8V9ciiUKdNAVX3i2WZV0FLoBXbJYdOSUnS5sllXQlDDzOktQJxw8LnitceXfp8CGNHDS9Wbk/wBCa2WT5LzXf8mJ7JKKzOKrxXWgM1vqLfUsyroMq6AV2yvH/BL+F/oaMq6FG1fgl/C/0A8YAAAAAAAAAAAAB7YAAnhyrW6LIY2WWZPXXj3leFKpKSV00zjTbt8wNa26l+GHkcxNszJpxgr6Ip2fEcHeVS7n4p/0J4mNmTW6gu9LUCvMiUc
drRSaXiynKyzBm4O8qfiBKWO3xk372RzIsxtoc1WSK0rTxT/oZ8r6AWZkMyK8rGV9AJ50MyIZX0GV9ALY41cJNeB30h+2/NnMLFcFWVPVPXuLvTH+XD66AUyxb4tvx1OZkd2jEeI08qVKtCvK+gEs6O5kV5X0GVgWZkS3745nfiynKzVHaqbe7jqBW9ob4yfmyGdGl7a/y4mPKwJ50dU11K8rJYdxknV0+AFm+ftPzZyWO3xk34tsnHaWrqC+WtnZ7Q5Rcci1rXnpXyApzoZ0QysZX0AnnR1TXUrys7FNNOuDAuW0y9uXmyLx29HJv3sujtbV/dx42QxNolJNZUrVae7694FWdHc6K8r6DK+gFmZEljtaKT82U5WaMLaMqXqJ0mte8CL2hvjN+bIud6tl8tsv/bivDwM2JcpN1Vu6A7mQzIhlYysCeZHViVqnTK8r6EsO4yTq6AnvnwzPzDxnVZnXS2WR2lr9xc/6/MY20uUXHJFX048QKc6GZEMrGVgTzop2l+pL+F/oTysr2hepL+F/oB44AAAAAAAAAAAAD2wcOgW7PiKMlJpOndPmdlNNt6K3dEMKVSTccyTunwfccnq26q3dLgu4DYtshX7OL7yOJtEGmlCK70Rw8aCSTwU6S1vi+85iY0XFpYKi3wafACvMupbg46hdxjK64man0LsPFqNPDT72u8C7E2xSi1kir5mfMizE2hOLSwoq+a4rh8jPlfQCzMhmRXlYpgWZkMyK8rFPoBpwsdRTTinqnqWvb1f7OFVVcjLhypNOCeqdvuLVtEaS3MdPiBzHx1N2ko9yK8yGNPM7UVFdEiun0AszIZl1K6fQZWBZmXU0elx9iPL9P7mPKzVHaIq/uYv68ALHtsfy4GXMi/0qP5EPr3GTKwLMy6ksPESaejopyvoSho02rp8ANUdrir9SL1v43RzE2qMk1kgu9ceXyOQ2mKVbmL1b1rrw4EMXHzJpQjG+iXdzruAhmQzIrysZWBZmR2M0mnxoqysZX0A2w22K/cjxs5La4tNZI6/DQreOvyo8b5dKrh7yMsZNNbuK70tfrQCOZDMivKxlYFmZF+HtUYpepF0nx7zJlZow8ZRSvCTpNa1rr4AXPbY/lw+r+ZnniJtvhb4EntCv9lHhw8+7v+BRPWTajVvguC7gJ5kMy6leV9BlYFmZEsPESaej7inKyWHpJNq65Aa47ZFKt3F+JHF2pSVKMV5EVtCUa3Ub11aT/ocxcdSi0sKMX1VXx8PqwK8yGZFeV9BlfQCzMijan6kv4X+hPK+hXjr1Jfwv9APHAAAAAAAAAAAAAe0dAAu2eajK2lJdH4E8bEUpNpKKfJcinCklJNrMugxWnJuKyp8F0A2ekYXPDXnRHExsNppYdPk8z0M2BKMZXKOZdCzExMNxqOG0+TsCNo0YO0wjGnhxk+tmGi+GJFJepb5t+OgF+JtOG4tLDjF9UZrLJ40HClhJS9ozUBbYsqoAW2LKqAGrCxYxWsU9UXy2zDa/YxXv/sY8GcUvWhm1Rf6RhfkgQxsWMncYqK6X3ldndpnGUrhDIq4FNAW2LKqFAW2aVtMF/tx+HTwMNGlYuHr918eGj+vcBb6VC/2ca6Uu/nXh5Gay54+F+V+hkoC2yUJpNN6roUUSjo02rV8ANkNpgrvDT+vA5PaIOLSw0m+emnwKd7HX7tavT4f38yU8aDg0sOpPg+gFdiyqhQFtnYySafHXgU0djo02rV8ANkdpgn+zi1d613d3d8Q9ohla3cbfB9CEcbDV/dXrfHw0OTxsNxaWFTriBCzllVCgLbNGHtEElcE6T6amKjRh4sElmw7aTvv14gaHteHT+6j8PkZpyTbeit8OhZ6Rh0/ufB3wM03bbqrfBcu4CdoWiqhQFtonhzSkm6a6GeiWG0pJtWugG2O04aX7NPy6+BDEx4NSSgk3VPoV76CWmGvfr17u9eR3FxoOLUcPK70fTuArsWVUKAtsp2r8Ev4X+hKiWVbrGtc
IP9GB4IAAAAAAAAAAAAD2wcOgXbPFN03Xf7ieNBRk0nmS5lOEm3STb6IlNuLakmmuQFmHhuT9VWacfDiotrDxIvvVJGbDeJFZoqST5oliY+LlqWfLztP4sCo0+jw/MXw+Zj3hPDTl+GLfgBfLBik2sRNrl14cPP4FBKWFNK3Fpd5VvAJghnGcCYIZxnA0YWEpLWSWq4lz2WFP76Pl/cyRTktIt60TeDP2GB3Fgo1Urtf1Kzk7i6kmn3kd4BMEN4N4BM1Q2WL44kVounzMW8LVhT9hgXzwIJv7xOui7vEzE9zP2X8CneATJQjbSbq+ZVvDsZW6S1YGuGzxfHES1rl148RibPBRbWKm+iKI4U3wgzk4SircWkBwEM4zgTOxVtK67yvOdUr0S1A2x2WLX7WPHhp8yt4EcraxF4e7xKo4U3ooPjXvEsKaVuDSAiCG8GcCZow8CLSbxErT6afEyZyyMJOqi3eqA0+iw/Nj8P6sz4iqTSdpPiN1P2JFcpU6apoCQIbwbwCZLDinJJul1Kt4djK3SVsDVuI1riK/8nMbBhFWp5u5V8ypYU/Zf1/gTw5xVuLS6gRBDeDOBMo2r8Mv4X+hZnKtpdwl/C/0A8YAAAAAAAAAAAAB7R0WLA7GTTtOmJSbdt2zl9wsCyGPOPCTR2W0zccrk3HoVWLA4ThNxdxdPuIoWBOeNKSqUm0QFiwAFi+4DgO2LAnDFlHg6JekT9p9CqxYHZTbq23WmpwWL7gAFiwCi3wNOHLFkri29a463xM1nVNrg68GwLMKWJKXqtt8frzJPYsT2enNcyhSrhoS30vafmwLPRMT2fihDZ5/iS4Pja0d1+pXvZdX5s4sRrg2vewL8SWLhunJpvXiVTxpSVOTaIuV8dfM5YACxYAJ0LFgW+kT9piW0zaacm0+NlViwBw7YsAWRx5pJKTVFdiwLfScT25eZXJtu3q2csWAOHbFgcJQk07XE5YsCz0iftP6/wAiePOSpybXQrsWAAFgWYeDKSk4q1FW+4px4PdSlWlNX30TUuPfx4leO/Ul4P8AQJN3144ACgAAAAAAAAAA+he0XiTm4p5r05K2dltEWq3UV4FO07Fi4ONusZ5HV3yarRp/ApxllrLiZrvlVUxu+pmNq2pJprChpyrR8OPl8Tvpi/Kh7keZnfUZ31A9DDx4x/2oy/ibKb1Mud9Tu8fUDVdu6S8CJn3j6jO+oGgGfO+ozvqBoBn3j6jePqBoBn3j6lmzxliTjBOnKSSb4avmBYDR2h2ZibPFSliQknLL6rbfPu7ijZdmlipvPGKTSTlerfLRMtln1OPKcpscBXGE3PJ+9eWu+6Z6PbHYmLskITniQmpyyrI3o6vW0iNMQL9l7K2jGwZY0F6kdL8Cns/ZcTaMRYeG1mab1dLQXybTHAWdodn4+zOKxUlmuqafCr/Us2fsvExMPPGcc2VyUL1aXMnGzlNnxOVnD+zOCzszYMXapuOE1cY5m5SUUlwv4nO0tgxdlxFDFatxUllaaabaT+DKuIAslsGMsHffuaeOvDuK9k2fExpNR5atvgl3i+fTjO3kAR2rCnhTcJvVdHaL8fYMTDw885JNVcXebX3d5eMvL4nL8fKqB3ZNlxMW3F0lVt8FfAhi4M44u6TzSzKKrm3VV5mdTtNz9pA29qdh4+y4axJzhKLaTyt6N+K1WnEz7HsOJjfg8NXWvQqz34qBxYM3iLCX482Wn1NHaXZmLsyi5yTUua6lk2ab+lAJ7NsWLiQc4tJa1bpulenUyub7yC8GfO+pzO+oXGkGbO+o3j6k1ro0gzKcuTZKp5stPNdVzvpQ06LwVbRh4mFLLiRlCXGmq06nMRTi6b1GnVcTw8TK7pPxKp7NjRlGLi7lwVo4sDFbkqfqVmtpVbpXY1OrT6RreSH8vH6ok9q0/BDyPPlKSbTtNaNc7XI0bpV+3jxS9zq3x5W/5TXZOrrdts5tWJmw3olUGtFx0595OGxynNQwcSOI2rdOqWn
G+HEy7Zhzw3KE9HXW001o0yL1eYAAgAAAAAAAAAAPUxe1ZYmIsTFm8SSVevbVdDj7Qg01kgrXKL8zzAB6a2/D/Lh5MoePG+PwZjAGzfx6/Bjfx6/BmMAbN/Hr8GN/Hr8GYwBs38evwY38evwZjAGzfx6/Bjfx6/BmMAbN/Hr8GTwdsUJxmmri01adWjAAPY2ztp40cs8qV36sWtfpleB2m8OLjCSim7bypu601a5f1PLBbbfrPHjOMyN+HtijJSUtVrqmW7V2rPGpYmJKSWqTuk+tHlgjT18HtmcMN4cMRqD4pWV7J2k8GefDlUqaunwfE8wC+zKPX23tZ42VTm5KN5W071q78itdoLJWZp1Wif4elnmATz4lkv17PZnbUtlnKeFlblHK1OLaq0+TXQ52p21La8RYmLlTUVFKEWlSbfO+rPHAV6su1pPCWE5+qlXDWulleBtqhK1JpPjpfwPOAvv04/j7G/F2tTk5OVt9xPE27MtZyk3xtP3anmgstnwvt2vSwNuULWdpPikuPTQg9sWbPmqVp2r0a4MwAzidZuvd2/t+ePhxw5yjlTzNRhluWur68e7jwOdn9sLBupyj0qClr11aPDAyLffr0vT6msRSeZO7rn1L+0O2pbRlU5aR5KLWvxPGBZcmRM/b1tn7VeHCUYzaT5Vfc9eRl38ev6mMBWzfx6/A5vo9fgZAFlxr30eo30evwMgJi9q2LHj1/U7DalFqUZNNVTVpquFGIDDtXobRt8sWWbFxJTlVXJuTrpb8WQxNrzu5SbdVrZiAw7N3prtSzvNHRPW1XQlHb2m2pu5VfHWuFnngYdm/B21Qblo20/xJvjz8TR9r63lw+FfhfWzyAMTXq/aslNThJQklXqp6rvu7M+0bXvHKUpXJrp3aIxApoAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/2Q==",
+ "text/html": [
+ "\n",
+ " \n",
+ " "
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "execution_count": 1,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('eHI4wMjSGuU', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3de0a143-4cc7-4e8d-aa5b-c10052a01d1c",
+ "metadata": {},
+ "source": [
+ "# Learning Objectives \n",
+ "- Import and Prepare Data \n",
+ " - Import necessary Python libraries. \n",
+ " - Load and separate the 2021 training dataset into features and labels. \n",
+ "- Create and Train a Decision Tree Model\n",
+ " - Initialize and train a Decision Tree model using the 2021 training data. \n",
+ "- Visualize and Interpret the Decision Tree\n",
+ " - Generate and interpret visual representations of the trained Decision Tree model. \n",
+ "- Make Predictions and Evaluate Accuracy\n",
+ " - Use the trained model to make predictions on 2021 testing data.\n",
+ " - Compare predicted values with actual values to assess model accuracy. \n",
+ "- Calculate and Compare RMSE \n",
+ " - Calculate the Root Mean Square Error (RMSE) for the 2021 model. \n",
+ " - Compare the RMSE of the 2021 model with the 2020 model to evaluate improvements. "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "92d71251-8758-46aa-b8e3-d47df89220e8",
+ "metadata": {},
+ "source": [
+ "# Prerequisites \n",
+ "***Modules*** \n",
+ "\n",
+ "- Learning Module ***Introduction to Machine Learning: Decision Trees***\n",
+ "\n",
+ "***Data Sources***\n",
+ "\n",
+ "- [COVID cases data (California Health and Human Services Agency)](https://data.chhs.ca.gov/dataset/covid-19-time-series-metrics-by-county-and-state/resource/046cdd2b-31e5-4d34-9ed3-b48cdbc4be7a)\n",
+ "- [COVID vaccination data (Los Angeles Times)](https://github.com/datadesk/california-coronavirus-data)\n",
+ "- [Unemployment data (California Employment Development Dept.)](https://data.edd.ca.gov/Labor-Force-and-Unemployment-Rates/Local-Area-Unemployment-StatisticsdecisionLAUS-/e6gw-gvii)\n",
+ "- [Election data (Harvard University)](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ)\n",
+ "\n",
+ "***Libraries/Packages***\n",
+ "\n",
+ "- Pandas\n",
+ "- NumPy\n",
+ "- Matplotlib\n",
+ "- Seaborn\n",
+ "- Scikit-learn"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "564c5c91-57d0-442a-8ece-8719aebde237",
+ "metadata": {},
+ "source": [
+ "# Get Started\n",
+ "Copy-paste the required lines of code from ***Introduction to Machine Learning: Decision Trees*** for each section below. "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b8848232-eb1d-430b-b854-2077de1862fd",
+ "metadata": {},
+ "source": [
+ "## **1) Repeat Step 1 (Importing Necessary Packages)**"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "fa4b42e6-ac15-4f05-be80-43593f07b5d2",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Data Wrangling Imports\n",
+ "import pandas as pd\n",
+ "import numpy as np\n",
+ "\n",
+ "# Machine Learning Models Imports\n",
+ "from sklearn import tree\n",
+ "from sklearn.tree import DecisionTreeRegressor\n",
+ "\n",
+ "# Model Evaluation Imports and Visualization\n",
+ "from matplotlib import pyplot as plt\n",
+ "!pip install graphviz\n",
+ "import graphviz\n",
+ "\n",
+ "# Quantitative metrics of Model performance\n",
+ "from sklearn.metrics import mean_squared_error"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6d623983-10a3-4f42-bebd-4045e2c29e98",
+ "metadata": {},
+ "source": [
+ "## **2) Repeat Step 2A (Loading 2021 Training Data)**\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021 every time you see it, including in the links!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "de7771a0-9d83-4d94-8976-9595b83de3a2",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from Step 2A that will load our Summer 2021 training data\n",
+ "S2021_training= pd.read_csv(\"S2021_training.csv\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "497c33d3-6611-476b-99e1-1e9309105962",
+ "metadata": {},
+ "source": [
+ "## **3) Repeat Step 3A (Separate Training Data into LABEL and FEATURES)**\n",
+ "SKIP:\n",
+ "- Steps 3B and 3C, since they only show what the labels look like once they are separated from the main training data.\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e12b600c-c282-4d2f-99b4-d3a6f78ab4fe",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from Step 3A that separates the FEATURES & LABEL from the training data \n",
+ "S2021_training_labels = S2021_training[\"cases_per_100000\"]\n",
+ "S2021_training_features = S2021_training.drop(columns=[\"county\",\"cases_per_100000\"])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b44cea28-b4c6-4aa0-b82a-9f4c6edecf3d",
+ "metadata": {},
+ "source": [
+ "## **4) Repeat steps 4A and 4B (Create your Decision Tree and Train it!)**\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "b2ab294d-1f15-4adb-9e96-f3122fd146bb",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from Step 4A that will allow us to create our NEW Decision Tree\n",
+ "dtr_summer2021 = DecisionTreeRegressor(random_state = 1, max_depth= 3)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "85b2db25-ee63-4ab7-8452-fcb52a013160",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from step 4B that will train our NEW Decision Tree\n",
+ "dtr_summer2021 = dtr_summer2021.fit(S2021_training_features,S2021_training_labels)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "070c4b35-1119-4e9b-ae6d-bd00c1734f5b",
+ "metadata": {},
+ "source": [
+ "## **5) Repeat step 5 (Visualize your 2021 Decision Tree)**\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f9e07c94-5d4d-45b5-96d2-625fe9693c82",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from step 5 that will let you see the NEW 2021 Decision Tree\n",
+ "dtr_summer2021_dot = tree.export_graphviz(dtr_summer2021, out_file=None, \n",
+ " feature_names=S2021_training_features.columns, \n",
+ " filled=False, rounded=True, impurity=False)\n",
+ "\n",
+ "# Draw graph\n",
+ "dtr_graph = graphviz.Source(dtr_summer2021_dot, format=\"png\") \n",
+ "dtr_graph"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e23a2bbb-591c-47f6-ad24-c413ca634cba",
+ "metadata": {},
+ "source": [
+ "## **6) Repeat step 6A, 6B, 6C.1 (Load Testing Data and make your Predictions)**\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d4ce9399-0aa8-48d8-a540-3e8cbcdf7af6",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from step 6A to load and see your Summer 2021 testing data\n",
+ "S2021_testing_features = pd.read_csv(\"S2021_test_features.csv\")\n",
+ "S2021_testing_features"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "541994e6-e8a8-4c89-b8a7-7baae502bb8a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from step 6B to drop the county out of the testing data and make your predictions!\n",
+ "S2021_features_test_nocounty = S2021_testing_features.drop(columns=[\"county\"])\n",
+ "S2021_labels_pred = dtr_summer2021.predict(S2021_features_test_nocounty)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "66fceb09-7728-48ef-912f-4d6161d96f61",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from step 6C.1 to look at the labels that our new model has predicted\n",
+ "S2021_labels_preds_df = pd.DataFrame(S2021_labels_pred, columns=[\"Predicted\"])\n",
+ "S2021_labels_preds_df = pd.concat([S2021_testing_features[\"county\"].reset_index(drop=True),S2021_labels_preds_df.reset_index(drop=True)],axis=1)\n",
+ "S2021_labels_preds_df.round(3)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "188fd7b3-ad8c-4aae-a835-7dbb2be8bfe8",
+ "metadata": {},
+ "source": [
+ "## **7) Repeat step 7A, 7B (Check the Accuracy of the Predictions of the new Model Created)**\n",
+ "\n",
+ "##### **NOTE: When you copy-paste code, don't forget to change 2020 into 2021, every time you see it!!** "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "3e355d76-dabe-489a-8c81-0d839235d183",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from Step 7A to load our ACTUAL 2021 labels and drop the county since it's not part of the labels per se\n",
+ "S2021_testing_labels = pd.read_csv(\"S2021_test_labels.csv\")\n",
+ "S2021_testing_labels = S2021_testing_labels.drop(columns=[\"county\"])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ffd2bed9-c1bd-4c30-b869-cd8cf3c18870",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Copy-paste the code from Step 7B to make a bar graph and inspect the Accuracy of your new 2021 Decision Tree model\n",
+ "pred_vs_test_2021 = pd.concat([S2021_testing_labels.reset_index(drop=True),S2021_labels_preds_df.reset_index(drop=True)],axis=1)\n",
+ "pred_vs_test_2021 = pred_vs_test_2021.loc[:,[\"county\", \"cases_per_100000\",\"Predicted\"]]\n",
+ "pred_vs_test_plot = pred_vs_test_2021.plot.barh(color={\"Predicted\": \"hotpink\", \"cases_per_100000\": \"teal\"},x=\"county\",figsize=(15,15), yticks=np.arange(0,4000,500))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4f751ca0-3db5-4f55-8acd-74afe0be5d9b",
+ "metadata": {},
+ "source": [
+ "### **Walkthrough Solution:**\n",
+ "If you feel stuck on this exercise, feel free to follow the video walkthrough below by **Florentine**."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "951a16d6-d07a-4a83-8e6c-61f256ace1d7",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#Run the command below to watch the video\n",
+ "from IPython.display import YouTubeVideo\n",
+ "\n",
+ "YouTubeVideo('eHI4wMjSGuU', width=800, height=400)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "45d7f1cf-2f56-4a9f-a3e2-5b1923c47066",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "## **8) Extra: (Calculate RMSE and create Aggregate errors histograms)** \n",
+ "\n",
+ "Compare the performance of the model you just created in this practice session with that of the old 2020 model by calculating the RMSE for both and plotting an aggregate errors histogram that shows both models."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "623e4e38-9baf-416b-92d1-ffa1312fb20a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Creating residual for our new 2021 model\n",
+ "pred_vs_test_2021['residual'] = pred_vs_test_2021['cases_per_100000'] - pred_vs_test_2021['Predicted']\n",
+ "\n",
+ "# Inspect the new model's table, which now includes the residual column\n",
+ "New_model = pred_vs_test_2021\n",
+ "New_model"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "745a6b98-f86f-4e3f-908e-b5565871c6c2",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Load the old (2020) model's predictions on the 2021 test data\n",
+ "Old_model = pd.read_csv(\"Model2020pred_vs_test_2021.csv\")\n",
+ "Old_model"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "da31b101-37f8-400b-ac4e-631d6b6af428",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Plot histogram of error aggregates for both the old and new model\n",
+ "plt.title('Cases per 100k Prediction Errors')\n",
+ "plt.hist(New_model['residual'], alpha=0.5, label='Model 2021')\n",
+ "plt.hist(Old_model['residual'], alpha=0.5, label='Model 2020')\n",
+ "plt.legend(loc='upper right')\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "82bf9a78-4062-4640-8561-76c623ede7bf",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# This calculates the RMSE for Model 2020 (OLD MODEL)\n",
+ "print(f\"RMSE for Model 2020: {np.sqrt(mean_squared_error(Old_model['cases_per_100000'], Old_model['Predicted']))}\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "abeda9a5-5fa0-4544-8586-14ec531448ad",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# This calculates the RMSE for Model 2021 (NEW MODEL)\n",
+ "print(f\"RMSE for Model 2021: {np.sqrt(mean_squared_error(New_model['cases_per_100000'], New_model['Predicted']))}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ce454920-68ea-4a48-85f9-7bb57f021b3f",
+ "metadata": {},
+ "source": [
+ "# Conclusion \n",
+ "In this practice, you successfully: \n",
+ "\n",
+ "1. **Imported and Prepared Data:** Loaded the 2021 training dataset and separated it into features and labels. \n",
+ "2. **Created and Trained a Decision Tree Model:** Initialized and trained a decision tree model using the 2021 data.\n",
+ "3. **Visualized and Interpreted the Decision Tree:** Generated and interpreted a visual representation of the trained model. \n",
+ "4. **Made Predictions and Evaluated Accuracy:** Predicted outcomes using the 2021 testing data and assessed model accuracy. \n",
+ "5. **Calculated and Compared RMSE:** Calculated the RMSE for the 2021 model and compared it with the 2020 model.\n",
+ "\n",
+ "By completing this module, you have reinforced your understanding of decision trees and gained practical experience in adapting machine learning models to new data. This practice not only enhances your technical skills but also prepares you for real-world applications where models need to be continuously updated and evaluated. Keep exploring and refining your models to achieve even better predictions and insights! "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c71ade49-289b-4211-b75c-1d98c0af452f",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "# Clean up\n",
+ "\n",
+ "To keep your workspace organized, remember to: \n",
+ "\n",
+ "1. Save your work.\n",
+ "2. Close any notebooks and active sessions to avoid extra charges."
+ ]
+ }
+ ],
+ "metadata": {
+ "environment": {
+ "kernel": "python3",
+ "name": "common-cpu.m108",
+ "type": "gcloud",
+ "uri": "gcr.io/deeplearning-platform-release/base-cpu:m108"
+ },
+ "kernelspec": {
+ "display_name": "conda_python3",
+ "language": "python",
+ "name": "conda_python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.14"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
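Review note on the RMSE cells in section 8 of the notebook above: they report RMSE through scikit-learn's `mean_squared_error`. Some scikit-learn releases accept `squared=False` and, to my knowledge, newer ones (1.6 onward) do not, so a version-independent definition can be handy for cross-checking. The sketch below is a library-free illustration of what the number measures, not the notebook's own method. The function name `rmse` and the sample values are hypothetical and do not come from the dataset.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared residual."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("need at least one value")
    # Residual = actual - predicted, matching the 'residual' column in the notebook.
    squared_residuals = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_residuals) / len(squared_residuals))


# Made-up cases-per-100k values for three counties (illustration only).
actual = [1000.0, 1500.0, 2000.0]
predicted = [1100.0, 1400.0, 2300.0]
print(f"RMSE: {rmse(actual, predicted):.1f}")
```

Because the residuals are squared before averaging, one county with a large miss raises RMSE more than several counties with small misses, which is worth keeping in mind when comparing the 2020 and 2021 models.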
diff --git a/Google Cloud/README.md b/Google Cloud/README.md
new file mode 100644
index 0000000..404cee7
--- /dev/null
+++ b/Google Cloud/README.md
@@ -0,0 +1,101 @@
+## Introduction to Machine Learning for COVID Predictions for Google Cloud
+---------------------------------
+
+## Contents
+
++ [Overview](#overview)
++ [Background](#background)
++ [Before Starting](#before-starting)
++ [Getting Started](#getting-started)
++ [Software Requirements](#software-requirements)
++ [Architecture Design](#architecture-design)
++ [Data](#data)
++ [Funding](#funding)
+
+## **Overview**
+
+This module teaches you how to create a simple Decision Tree using a structured dataset. In addition to the overview given in this README, you will find four Jupyter notebooks. The second notebook is optional.
+- **1- Intro to Machine Learning: Decision Trees**: This notebook provides a basic introduction to Machine Learning concepts, steps for creating and understanding a Decision Tree model, making predictions with it, and intuitively evaluating its performance.
+
+- **2- (Optional) Quant. Comparison of 2020 DT Model Performance for (2020 vs 2021) Data**: This notebook is optional, for students who would like to know a bit more about how to evaluate model performance quantitatively, and offers an introduction to why machine learning models require retraining from time to time.
+
+- **3- Practice**: This notebook provides a way to practice and test what you have learned from the first notebook. It includes basic instructions outlining every step discussed in the first notebook. Students are free to either copy and modify the code from the first notebook or they can choose to write it themselves.
+
+- **4- Practice - Answer Key**: This notebook provides the answers and explanation to the previous Practice exercise notebook. Check this notebook only after you have tried to complete the previous exercise yourself.
+
+This module will cost you about $1.00 to run, assuming you tear down all resources upon completion.
+
+
+## Background
+This module is geared towards beginners and does not require prior knowledge of a specific scientific discipline. The module is divided into four Jupyter notebooks (the second is optional), as outlined at the beginning of this document. In addition to the notebooks, there are videos containing brief explanations of basic machine learning concepts and of what the code does in each step of the notebook. Below is an outline of the videos contained in each notebook with their respective links. These videos are also embedded in the notebooks.
+
+### 1- Introduction To Machine Learning: Decision Trees (10 video clips)
+
+- [Introduction Video by Lorena Benitez](https://youtu.be/e3tGQykFC5M)
+- [Objectives of Exercise](https://youtu.be/_kAjJ8rJwfU)
+- [Step 1: Importing necessary packages into Google Colab](https://youtu.be/jPIQbpdTkbM)
+- [Step 2: Loading training data and making sure it looks correct](https://youtu.be/z9dcLYg65uk)
+- [Step 3: Separate the training dataset into features and labels](https://youtu.be/qh8C0QRECWU)
+- [Step 4: Create a decision tree object and train it](https://youtu.be/M6gY_JywOys)
+- [Step 5: Visualize our trained decision tree](https://youtu.be/cFk6vmfU48w)
+- [Step 6: Make predictions using testing data with our trained decision tree](https://youtu.be/LtD93dB5JzU)
+- [Step 7: Let's see how our decision tree model performed](https://youtu.be/0VK4sLz2wrc)
+- [Step 8: Let's try using our summer 2020 tree model to predict 2021 data](https://youtu.be/2r3ZpwM6xDQ)
+
+### 2- (Optional) Quant. Comparison of 2020 DT Model Performance for (2020 vs 2021) Data
+
+### 3- Practice Exercise (1 video clip)
+- [Walkthrough Solution](https://youtu.be/eHI4wMjSGuU)
+### 4- Practice Exercise - Answer Key (1 video clip)
+- [Walkthrough Solution](https://youtu.be/eHI4wMjSGuU)
+
+## Before Starting
+
+Included is a tutorial in the form of Jupyter notebooks. The main purpose of the tutorial is to help beginners without much coding experience familiarize themselves with fundamental machine learning concepts using health data (a COVID dataset). It is also meant to be extended to other kinds of structured data. The tutorial walks step by step through the process of creating a Decision Tree and interpreting it, and aims to provide an intuitive understanding of how machine learning model performance is evaluated. To access this module on Google Cloud Platform, you will need a Google Cloud Platform account; the module runs within Vertex AI Workbench. For more technical information about Google Cloud Platform, please click on [this link.](https://github.com/STRIDES/NIHCloudLabGCP)
+
+## **Getting Started**
+
+**1)** Please click on the link for steps to open your GCP project: [How to open your GCP Project](https://github.com/STRIDES/NIHCloudLabGCP/blob/main/docs/open_project_intramural.md).
+
+**2)** Follow the steps highlighted [here](https://github.com/STRIDES/NIHCloudLabGCP/blob/main/docs/vertexai.md) to create a new instance notebook in Vertex AI. Follow steps 1-8 and be especially careful to enable idle shutdown as highlighted in [step 7](https://github.com/STRIDES/NIHCloudLabGCP/blob/main/docs/vertexai.md#:~:text=On%20the%20same%20page%2C%20click%20Enable%20Idle%20Shutdown%20and%20specify%20the%20idle%20minutes%20for%20shutdown.%20This%20means%2C%20if%20you%20close%20your%20browser%20and%20walk%20away%20without%20stopping%20your%20instance%2C%20it%20will%20shutdown%20automatically%20after%20this%20many%20minutes.%20We%20recommend%2030%20minutes.). For this module you should select Debian 11 and Python 3 in the Environment tab in [step 5](https://github.com/STRIDES/NIHCloudLabGCP/blob/main/docs/vertexai.md#:~:text=On%20the%20Environment%20tab%2C). In [step 6](https://github.com/STRIDES/NIHCloudLabGCP/blob/main/docs/vertexai.md#:~:text=GPU%20use.-,Under%20Machine%20type,-select%20your%20desired) in the Machine type tab, select n1-standard-4 from the dropdown box.
+
+**3)** Now you will need to download the tutorial files from GitHub. The easiest way to do this is to clone the repository from NIGMS into your Vertex AI notebook. This can be done by using the `Git` menu in JupyterLab and selecting the clone option. To clone this repository, use the Git command `git clone https://github.com/NIGMS/Introduction-to-Data-Science-for-Biology.git` in the dropdown menu option in Jupyter notebook. Please make sure you only enter the link for the repository that you want to clone. Cloning will download our tutorial files into a folder called `Introduction-to-Data-Science-for-Biology`. Other bioinformatics-related learning modules are available in the [NIGMS Repository](https://github.com/NIGMS).
+
+**IMPORTANT NOTE**
+
+After you are done with the module, make sure to close the tab that appeared when you clicked **OPEN JUPYTERLAB**, then check the box next to the name of the notebook you created in [step 3](https://github.com/STRIDES/NIHCloudLabGCP/blob/main/docs/vertexai.md#:~:text=Click%20Create%20New-,Select,-Advanced%20Options%20at). Then click on **STOP** at the top of the Workbench menu. Wait and make sure that the icon next to your notebook is grayed out.
+
+## **Software Requirements**
+
+Software requirements are satisfied by using a pre-made Google Cloud Platform environment Workbench Notebook. The notebook environment used is named **"Python 3 with Intel® MKL"**; it is listed during Step 3 for accessing our module. Software requirements are described in Step 1 of the notebook **"Intro to Machine Learning Decision Trees"**.
+
+
+## **Architecture Design**
+
+Submodule 1 and Submodule 3 download CSV files stored in a Google Cloud Storage bucket to the Workbench notebook, then output additional CSV files that students can use if they want to work on the optional Submodule 2. Below is a diagram that illustrates our workflow:
+
+![Architecture-diagram.PNG](images/Architecture-diagram.PNG)
+
+## **Data**
+All data in this module were originally sourced from the following sites:
+
+- [COVID cases data (California Health and Human Services Agency)](https://data.chhs.ca.gov/dataset/covid-19-time-series-metrics-by-county-and-state/resource/046cdd2b-31e5-4d34-9ed3-b48cdbc4be7a)
+- [COVID vaccination data (Los Angeles Times)](https://github.com/datadesk/california-coronavirus-data)
+- [Unemployment data (California Employment Development Dept.)](https://data.edd.ca.gov/Labor-Force-and-Unemployment-Rates/Local-Area-Unemployment-StatisticsdecisionLAUS-/e6gw-gvii)
+- [Election data (Harvard University)](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ)
+
+We subsequently picked only certain variables of interest, cleaned and created a composite dataset for the years 2020 and 2021 from the sources listed above. **We manipulated the variable named "Unemployment_rate" by using the 2020 rates in both the 2020 and 2021 Datasets**. We then separated these datasets into training, validation, and testing sets for each of these years to streamline the tutorials. Finally, we stored them in our group's [SFSU GitHub repository](https://github.com/MarcMachineLearning/Introduction-to-Machine-Learning/tree/main/Datasets).
+
+## **Funding**
+
+- SFSU/UCSF M.S. Bridges to the Doctorate Program: cloud-based learning modules supplement (T32GM142515)
+- Demystifying Machine Learning and Best Data Practices Workshop Series for Underrepresented STEM Undergraduate and MS Researchers bound for PhD Training Programs (T34-GM008574)
+- The creation of this training module was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number 3T32GM142515-01S1
+
+## **License for Data**
+
+Text and materials are licensed under a Creative Commons CC-BY-NC-SA license. The license allows you to copy, remix and redistribute any of our publicly available materials, under the condition that you attribute the work (details in the license) and do not make profits from it. More information is available [here](https://tilburgsciencehub.com/about/#license).
+
+![Creative commons license](https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png)
+
+This work is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-nc-sa/4.0/)
diff --git a/Google Cloud/images/Architecture-diagram.PNG b/Google Cloud/images/Architecture-diagram.PNG
new file mode 100644
index 0000000..4fdd5ae
Binary files /dev/null and b/Google Cloud/images/Architecture-diagram.PNG differ
diff --git a/Google Cloud/images/COVID-Decision-Tree.PNG b/Google Cloud/images/COVID-Decision-Tree.PNG
new file mode 100644
index 0000000..c62d0f1
Binary files /dev/null and b/Google Cloud/images/COVID-Decision-Tree.PNG differ
diff --git a/Google Cloud/images/Clone-a-Repository.png b/Google Cloud/images/Clone-a-Repository.png
new file mode 100644
index 0000000..9fc2772
Binary files /dev/null and b/Google Cloud/images/Clone-a-Repository.png differ
diff --git a/Google Cloud/images/Features-for-Prediction.jpg b/Google Cloud/images/Features-for-Prediction.jpg
new file mode 100644
index 0000000..8ec63b3
Binary files /dev/null and b/Google Cloud/images/Features-for-Prediction.jpg differ
diff --git a/Google Cloud/images/GCP-New-notebook.png b/Google Cloud/images/GCP-New-notebook.png
new file mode 100644
index 0000000..38d75b8
Binary files /dev/null and b/Google Cloud/images/GCP-New-notebook.png differ
diff --git a/Google Cloud/images/General-Decision-Tree.png b/Google Cloud/images/General-Decision-Tree.png
new file mode 100644
index 0000000..b8d6289
Binary files /dev/null and b/Google Cloud/images/General-Decision-Tree.png differ
diff --git a/Google Cloud/images/Jupiterlab-terminal.png b/Google Cloud/images/Jupiterlab-terminal.png
new file mode 100644
index 0000000..0cede92
Binary files /dev/null and b/Google Cloud/images/Jupiterlab-terminal.png differ
diff --git a/Google Cloud/images/Label-and-Features.jpg b/Google Cloud/images/Label-and-Features.jpg
new file mode 100644
index 0000000..2ac88d9
Binary files /dev/null and b/Google Cloud/images/Label-and-Features.jpg differ
diff --git a/Google Cloud/images/Model-performance-comparison.jpg b/Google Cloud/images/Model-performance-comparison.jpg
new file mode 100644
index 0000000..15c40e1
Binary files /dev/null and b/Google Cloud/images/Model-performance-comparison.jpg differ
diff --git a/Google Cloud/images/New-Notebook-config1.png b/Google Cloud/images/New-Notebook-config1.png
new file mode 100644
index 0000000..1cb163d
Binary files /dev/null and b/Google Cloud/images/New-Notebook-config1.png differ
diff --git a/Google Cloud/images/New-Notebook-config2.png b/Google Cloud/images/New-Notebook-config2.png
new file mode 100644
index 0000000..997b353
Binary files /dev/null and b/Google Cloud/images/New-Notebook-config2.png differ
diff --git a/Google Cloud/images/New-Notebook-config3.png b/Google Cloud/images/New-Notebook-config3.png
new file mode 100644
index 0000000..cbcd892
Binary files /dev/null and b/Google Cloud/images/New-Notebook-config3.png differ
diff --git a/Google Cloud/images/Shutdown-machine.png b/Google Cloud/images/Shutdown-machine.png
new file mode 100644
index 0000000..2978916
Binary files /dev/null and b/Google Cloud/images/Shutdown-machine.png differ
diff --git a/Google Cloud/images/Summer-2020-model-performance-comparison.jpg b/Google Cloud/images/Summer-2020-model-performance-comparison.jpg
new file mode 100644
index 0000000..15c40e1
Binary files /dev/null and b/Google Cloud/images/Summer-2020-model-performance-comparison.jpg differ
diff --git a/Google Cloud/images/Testing-Data.jpg b/Google Cloud/images/Testing-Data.jpg
new file mode 100644
index 0000000..e8ad190
Binary files /dev/null and b/Google Cloud/images/Testing-Data.jpg differ
diff --git a/Google Cloud/images/Training-Data.jpg b/Google Cloud/images/Training-Data.jpg
new file mode 100644
index 0000000..066d6ac
Binary files /dev/null and b/Google Cloud/images/Training-Data.jpg differ
diff --git a/Google Cloud/quiz_files/quiz1.json b/Google Cloud/quiz_files/quiz1.json
new file mode 100644
index 0000000..25b874b
--- /dev/null
+++ b/Google Cloud/quiz_files/quiz1.json
@@ -0,0 +1,28 @@
+[
+ {
+ "question": "Which of the following statements is the best description of the project you are working on here?",
+ "type": "multiple_choice",
+ "answers": [
+ {
+ "answer": "We are creating a model to better understand what determines the number of COVID cases in a county.",
+ "correct": false,
+ "feedback": "While we may gain insights into what is correlated with COVID cases in a county, we will not know for certain what determines it. Please try again."
+ },
+ {
+ "answer": "We are creating a model to be able to predict the number of COVID cases in a county.",
+ "correct": true,
+ "feedback": "Correct. Given data such as population, vaccination rates, and other information, our Decision Tree will predict the number of COVID cases for that county."
+ },
+ {
+ "answer": "We are learning about the biology of COVID transmission.",
+ "correct": false,
+ "feedback": "There is no knowledge of disease transmission in this project. Please try again."
+ },
+ {
+ "answer": "We are trying to determine whether SF county had more or fewer COVID cases than LA county.",
+ "correct": false,
+ "feedback": "We can verify this from the datasets themselves but this is not our aim in this project. Please try again."
+ }
+ ]
+ }
+]
\ No newline at end of file
diff --git a/Google Cloud/quiz_files/quiz2.json b/Google Cloud/quiz_files/quiz2.json
new file mode 100644
index 0000000..89e94cf
--- /dev/null
+++ b/Google Cloud/quiz_files/quiz2.json
@@ -0,0 +1,18 @@
+[
+ {
+ "question": "Which of these statements is true?",
+ "type": "multiple_choice",
+ "answers": [
+ {
+ "answer": "We use the features to predict the labels.",
+ "correct": true,
+ "feedback": "Correct. In our example, features include data like population and vaccination rate, while the label is COVID cases per 100k people."
+ },
+ {
+ "answer": "We use the labels to predict the features.",
+ "correct": false,
+ "feedback": "Labels are not used to predict features. Please try again."
+ }
+ ]
+ }
+]
\ No newline at end of file
diff --git a/Google Cloud/quiz_files/quiz3.json b/Google Cloud/quiz_files/quiz3.json
new file mode 100644
index 0000000..1f9cb4e
--- /dev/null
+++ b/Google Cloud/quiz_files/quiz3.json
@@ -0,0 +1,28 @@
+[
+ {
+ "question": "Choose the correct statement.",
+ "type": "multiple_choice",
+ "answers": [
+ {
+ "answer": "We use the test data to create the training data and the training data to train the Decision Tree.",
+ "correct": false,
+ "feedback": "Training data is not derived from test data. Please try again."
+ },
+ {
+ "answer": "We use the test data to train the Decision Tree and the training data to determine how well the model performs.",
+ "correct": false,
+ "feedback": "Test data is not used to train the Decision Tree nor is training data used to evaluate the model. Please try again."
+ },
+ {
+ "answer": "We use the training data to train the Decision Tree and the test data to determine how well the model performs.",
+ "correct": true,
+ "feedback": "Correct. "
+ },
+ {
+ "answer": "We use the training data to create the test data and the test data to train the Decision Tree.",
+ "correct": false,
+ "feedback": "Test data is not derived from training data nor is test data used to train the Decision Tree. Please try again."
+ }
+ ]
+ }
+]
\ No newline at end of file
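Review note on the three quiz files added above: each is a JSON list of questions, and each question carries an `answers` list whose entries have `answer`, `correct`, and `feedback` keys. A small sanity check along the following lines could catch a question that ends up with no correct answer or with blank feedback. This is a minimal sketch under stated assumptions: the rule "exactly one correct answer per multiple-choice question" is inferred from the three files rather than from any documented schema, and the function name `check_quiz` and the shortened sample text are hypothetical.

```python
import json


def check_quiz(questions):
    """Return a list of problems found in a parsed quiz file (empty if none)."""
    problems = []
    for i, question in enumerate(questions):
        answers = question.get("answers", [])
        # Assumption: a multiple-choice question has exactly one correct answer.
        n_correct = sum(1 for a in answers if a.get("correct") is True)
        if n_correct != 1:
            problems.append(
                f"question {i}: expected exactly 1 correct answer, found {n_correct}"
            )
        for j, a in enumerate(answers):
            if not a.get("feedback", "").strip():
                problems.append(f"question {i}, answer {j}: empty feedback")
    return problems


# Inline sample mirroring the shape of quiz2.json (feedback text shortened).
sample = json.loads("""
[
  {
    "question": "Which of these statements is true?",
    "type": "multiple_choice",
    "answers": [
      {"answer": "We use the features to predict the labels.",
       "correct": true, "feedback": "Correct."},
      {"answer": "We use the labels to predict the features.",
       "correct": false, "feedback": "Please try again."}
    ]
  }
]
""")
print(check_quiz(sample))
```

To check a real file, the same function can be given the result of `json.load` on the opened quiz file.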
diff --git a/README.md b/README.md
index d6a01a4..b555fa1 100644
--- a/README.md
+++ b/README.md
@@ -3,45 +3,22 @@
# San Francisco State University
## Introduction to Machine Learning for COVID Predictions
---------------------------------
-This module teaches you how to create a simple Decision Tree using a structured dataset. In addition to the overview given in this README you will find the four Jupyter notebooks. The second notebook is optional.
-- **1- Intro to Machine Learning: Decision Trees**: This notebook provides a basic introduction to Machine Learning concepts, steps for creating and understanding a Decision Tree model, making predictions with it, and intuitively evaluating its performance.
-
-- **2- (Optional) Quant. Comparison of 2020 DT Model Performance for (2020 vs 2021) Data**: This notebook is optional, for students who would like to know a bit more about how to evaluate model performance quantitatively, and offers an introduction to why machine learning models require retraining from time to time.
-
-- **3- Practice**: This notebook provides a way to practice and test what you have learned from the first notebook. It includes basic instructions outlining every step discussed in the first notebook. Students are free to either copy and modify the code from the first notebook or they can choose to write it themselves.
-
-- **4- Practice - Answer Key**: This notebook provides the answers and explanation to the previous Practice exercise notebook. Check this notebook only after you have tried to complete the previous exercise yourself.
-This module will cost you about $1.00 to run, assuming you tear down all resources upon completion.
+## Contents
-## Overview of Page Contents
-
-+ [Getting Started](#getting-started)
+ [Overview](#overview)
++ [Getting Started](#getting-started)
+ [Software Requirements](#software-requirements)
-+ [Workflow Diagrams](#workflow-diagrams)
+ [Data](#data)
+ [Funding](#funding)
-## **Getting Started**
-
-Included is a tutorial in the form of Jupyter notebooks. The main purpose of the tutorial is to help beginners without much coding experience to familiarize themselves with basic fundamental concepts within machine learning using health data (COVID dataset). It is also meant to be extended to other kinds of structured data. The tutorial walks through step by step the process of creating a Decision Tree and interpreting it. This module intends to provide an intuitive understanding of how machine learning model performance is evaluated. In order to get to this module from the Google Cloud Platform, you will need to have access to a Google Cloud Platform account, this module is located within Vertex AI Workbench. For more technical information about Google Cloud Platform please click on the following link: [NIH Cloud Lab README](https://github.com/STRIDES/NIHCloudLabGCP)
-
-**1)** Please click on the link for steps to open your GCP project: [How to open your GCP Project](https://github.com/STRIDES/NIHCloudLabGCP/blob/main/docs/open_project_intramural.md).
-
-**2)** Follow the steps highlighted [here](https://github.com/STRIDES/NIHCloudLabGCP/blob/main/docs/vertexai.md) to create a new user-managed notebook in Vertex AI. Follow steps 1-8 and be especially careful to enable idle shutdown as highlighted in step 7. For this module you should select Debian 11 and Python 3 in the Environment tab in step 5. In step 6 in the Machine type tab, select n1-standard-4 from the dropdown box.
-**3)** Now you will need to download the tutorial files from GitHub. The easiest way to do this would be to clone the repository from NIGMS into your Vertex AI notebook. This can be done by using the `Git` menu in JupyterLab, and selecting the clone option. To clone this repository, use the Git command `git clone https://github.com/NIGMS/Introduction-to-Data-Science-for-Biology.git` in the dropdown menu option in Jupyter notebook. Please make sure you only enter the link for the repository that you want to clone. There are other bioinformatics related learning modules available in the [NIGMS Repository](https://github.com/NIGMS). This will download our tutorial files into a folder called `Introduction-to-Data-Science-for-Biology`.
-
-
-**IMPORTANT NOTE**
-
-Make sure that after you are done with the module, close the tab that appeared when you clicked **OPEN JUPYTERLAB**, then check the box next to the name of the notebook you created in step 3. Then click on **STOP** at the top of the Workbench menu. Wait and make sure that the icon next to your notebook is grayed out.
-
-## **Overview**
+## **Overview**
This module is geared towards beginners and does not require prior knowledge on a specific scientific discipline. The module is divided into three Jupyter notebooks as outlined at the beginning of this document. In addition to the notebooks mentioned, there are videos containing brief explanations about basic concepts in machine learning and what the code does in each step of the notebook. Below is an outline of the videos contained in each notebook with their respective links. These videos are already attached to the notebook.
+This module offers two computing pathways: [AWS (Amazon Web Services)](https://github.com/NIGMS/Introduction-to-Data-Science-for-Biology/tree/AWS%26GCP/AWS) or [GCP (Google Cloud Platform)](https://github.com/NIGMS/Introduction-to-Data-Science-for-Biology/tree/AWS%26GCP/Google%20Cloud). Users can choose their preferred cloud service to run the Jupyter notebooks, ensuring flexibility and accessibility based on their existing infrastructure or familiarity. Detailed instructions for setting up and using either AWS or GCP for this module are provided in the corresponding folders of this repository.
+
### 1- Introduction To Machine Learning: Decision Trees (10 video clips)
- [Introduction Video by Lorena Benitez](https://youtu.be/e3tGQykFC5M)
@@ -62,18 +39,47 @@ This module is geared towards beginners and does not require prior knowledge on
### 4- Practice Exercise - Answer Key (1 video clip )
- [Walkthrough Solution](https://youtu.be/eHI4wMjSGuU)
+This module teaches you how to create a simple Decision Tree using a structured dataset. In addition to the overview given in this README, you will find four Jupyter notebooks. The second notebook is optional.
+
+- **1- Intro to Machine Learning: Decision Trees**: This notebook provides a basic introduction to Machine Learning concepts, steps for creating and understanding a Decision Tree model, making predictions with it, and intuitively evaluating its performance.
+
+- **2- (Optional) Quant. Comparison of 2020 DT Model Performance for (2020 vs 2021) Data**: This notebook is optional, for students who would like to know a bit more about how to evaluate model performance quantitatively, and offers an introduction to why machine learning models require retraining from time to time.
+
+- **3- Practice**: This notebook provides a way to practice and test what you have learned from the first notebook. It includes basic instructions outlining every step discussed in the first notebook. Students are free to either copy and modify the code from the first notebook or they can choose to write it themselves.
+
+- **4- Practice - Answer Key**: This notebook provides the answers and explanation to the previous Practice exercise notebook. Check this notebook only after you have tried to complete the previous exercise yourself.
-## **Software Requirements**
-Software requirements are satisfied by using a pre-made Google Cloud Platform environment Workbench Notebook. The notebook environment used is named **"Python 3 with IntelĀ® MKL"** ; and it is listed during Step 3 for accessing our module. In addition all package requirements are installed by following the instructions Step 1 of the notebook **"Intro to Machine Learning Decision Trees".**
+## **Getting Started**
+
+Included is a tutorial in the form of Jupyter notebooks. The main purpose of the tutorial is to help beginners without much coding experience to familiarize themselves with basic fundamental concepts within machine learning using health data (COVID dataset). It is also meant to be extended to other kinds of structured data. The tutorial walks through step by step the process of creating a Decision Tree and interpreting it. This module intends to provide an intuitive understanding of how machine learning model performance is evaluated.
+
+To access this module, you will need to choose the corresponding folder (AWS or GCP) within the repository. Further instructions are contained within each folder to guide you through the setup process.
+
+For ***Google Cloud Platform (GCP):***
+1. Detailed instructions for setting up and using GCP for this module are provided within the `Google Cloud` folder in the repository.
+2. Follow the steps to create a new user-managed notebook in Vertex AI Workbench, ensuring you select the appropriate configurations as outlined in the GCP instructions.
+3. Clone the repository from NIGMS into your Vertex AI notebook using the Git command:
+ git clone https://github.com/NIGMS/Introduction-to-Data-Science-for-Biology.git
+ This will download our tutorial files into a folder called Introduction-to-Data-Science-for-Biology
+
+For ***Amazon Web Services (AWS):***
+1. Detailed instructions for setting up and using AWS for this module are provided within the AWS folder in the repository.
+2. Follow the steps to create a new Jupyter notebook instance in Amazon SageMaker, ensuring you select the appropriate configurations as outlined in the AWS instructions.
+3. Clone the repository from NIGMS into your SageMaker notebook using the Git command:
+ git clone https://github.com/NIGMS/Introduction-to-Data-Science-for-Biology.git
+ This will download our tutorial files into a folder called Introduction-to-Data-Science-for-Biology.
-## **Workflow Diagrams**
+Please refer to the specific instructions within each folder (AWS or GCP) for more detailed setup guidance.
-Submodule 1 and Submodule 3 will download CSV files stored in a Google Cloud Storage bucket to the Workbench notebook, then it will output additional CSV files that will be used optionally if students want to work on the (optional) Submodule 2. Below is a diagram that illustrates our workflow:
-![Architecture-diagram.PNG](images/Architecture-diagram.PNG)
+## **Software Requirements**
+Please refer to the specific software requirements within each folder (AWS or GCP) for more detailed setup guidance. In addition, all package requirements are installed by following the instructions in Step 1 of the notebook **"Intro to Machine Learning Decision Trees".**
+
+
## **Data**
+
All original data from this module was originally sourced from the following sites:
- [COVID cases data (California Health and Human Services Agency)](https://data.chhs.ca.gov/dataset/covid-19-time-series-metrics-by-county-and-state/resource/046cdd2b-31e5-4d34-9ed3-b48cdbc4be7a)
@@ -81,7 +87,6 @@ All original data from this module was originally sourced from the following sit
- [Unemployment data (California Employment Development Dept.)](https://data.edd.ca.gov/Labor-Force-and-Unemployment-Rates/Local-Area-Unemployment-StatisticsdecisionLAUS-/e6gw-gvii)
- [Election data (Harvard University)](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ)
-We subsequently picked only certain variables of interest, cleaned and created a composite dataset for the years 2020 and 2021 from the sources listed above. **We manipulated the variable named "Unemployment_rate" by using the 2020 rates in both the 2020 and 2021 Datasets**. We then separated these datasets into training, validation, and testing sets for each of these years to streamline the tutorials. Finally, we stored them in our group's [SFSU GitHub repository](https://github.com/MarcMachineLearning/Introduction-to-Machine-Learning/tree/main/Datasets).
## **Funding**
diff --git a/images/Architecture-diagram.PNG b/images/Architecture-diagram.PNG
deleted file mode 100644
index b172fc9..0000000
Binary files a/images/Architecture-diagram.PNG and /dev/null differ