From a9af15c0d6a8fe0abd7a6d3fa2b685a24012f474 Mon Sep 17 00:00:00 2001
From: Juanita Gomez
Date: Thu, 27 Feb 2020 15:54:41 -0500
Subject: [PATCH] Update README.md

---
 README.md | 30 +++++++++++++-----------------
 1 file changed, 13 insertions(+), 17 deletions(-)

diff --git a/README.md b/README.md
index 77e81ac..542d3bc 100644
--- a/README.md
+++ b/README.md
@@ -11,7 +11,7 @@ git clone https://github.com/juanis2112/Spyder-Workshop
 ```
 
 ## Importing Libraries and Data
-The first thing we need to do before starting our work, is importing the libraries necessary for our analysis and saving the data in a way that is easy to explore.
+The first thing we need to do before starting our work, is importing the libraries necessary for our analysis and saving the data in a way that it's easy to explore.
 
 1. Import the libraries matplotlib and pandas.
 2. Save the data from the csv file in a Pandas DataFrame using the command
@@ -78,7 +78,7 @@ weather_data_ordered.plot(
 ```
 10. Open the Plots Pane to view your plot.
 11. Now try plotting Temperature (C) V.S the Date using only the data from 2006 which corresponds to the first 8759 rows of the DataFrame.
-12. Plot temperature and humidity V.S. the Date in the same plot, using the following command.
+12. Plot temperature and humidity V.S. the Date in the same plot, using the following command
 
 ```python
 weather_data_ordered.plot(subplots=True, x= 'Formatted Date', y= ['Temperature (C)', 'Humidity'],figsize=(15, 8))
@@ -109,14 +109,12 @@ to plot the values of the variables for each year.
 
 Now, we want to evaluate the relationships between the variables in our data set. For this, we have written a function in the file 'utils.py' which should be in the same folder of your workshop.
 
-16. Import the function
+16. Import the function from this file, to be able to use it
 
 ```python
 from utils import (aggregate_by_year, plot_correlations)
 ```
 
-from this file, to be able to use it.
-
 17. Plot the correlations between the variables using the command
 
 ```python
@@ -131,7 +129,7 @@ from utils import (aggregate_by_year, plot_correlations, plot_color_gradients)
 ```
 which is also in the utils.py file which will help you plot the colormap gradient to be able to interpret your correlations plot.
 
-20. Plot the colormap gradient using the following commands.
+20. Plot the colormap gradient using the following commands
 
 ```python
 cmap_category, cmap_list = ('Plot gradients convention', ['viridis', ])
@@ -139,20 +137,20 @@ cmap_category, cmap_list = ('Plot gradients convention', ['viridis', ])
 plot_color_gradients(cmap_category, cmap_list)
 ```
 
-21. Calculate the correlations between the different variables in our data set usgin the following command
+21. Calculate the correlations between the different variables in our data set using the following command
 
 ```python
 weather_correlations = weather_data_ordered.corr()
 ```
 
 22. Open the variable `weather_correlations`in the Variable Explorer.
-23. Use the following command
+23. Use the following command in the console to get the correlation between the Humidity and Temperature
 
 ```python
 weather_data_ordered['Temperature (C)'].corr(weather_data_ordered['Humidity'])
 ```
 
-in the console to get the correlation between the Humidity and Temperature. Verify it has the same value in the correlations DataFrame.
+Verify it has the same value in the correlations DataFrame.
 
 24. Try calculating correlations between different variables and comparing them with the ones in the data frame.
 
@@ -161,14 +159,14 @@ Finally, we want to use our data to construct a model that allows us predicting
 
 We are going to use scikit-learn which is a python library that contains tools to explore data and build different types of predictive models. We will use two functions for this task which need to be imported.
 
-25. Use the following command to import the necessary libraries for our data modeling.
+25. Use the following command to import the necessary libraries for our data modeling
 
 ```python
 from sklearn.model_selection import train_test_split
 from sklearn import linear_model
 ```
 
-A classic way to make a predictive model is to subdivide the total set of data into two sets: training and test. The training data will help us to train our predictive model, while the test data will play the role of future observations and give us an idea is how good our prediction is.
+A classic way to make a predictive model is to subdivide the total set of data into two sets: training and test. The training data will help us to train our predictive model, while the test data will play the role of future observations and give us an idea of how good our prediction is.
 
 26. Use the follwing command
 
@@ -194,7 +192,7 @@ regresion.fit(X_train.values.reshape(-1, 1), Y_train.values.reshape(-1, 1))
 print(regresion.intercept_, regresion.coef_)
 ```
 
-and save them in variables so we can use them with:
+and save them in variables so we can use them with
 
 ```python
 beta_0 = regresion.intercept_[0]
@@ -218,18 +216,16 @@ plt.legend()
 plt.show()
 ```
 
-31. Using the coefficients found in our model, predict the temperature for a given level of humidity using the `predicted_temperature` function available in 'utils'.
+31. Using the coefficients found in our model, predict the temperature for a given level of humidity using the `predicted_temperature` function available in 'utils.py'.
 
-Finally, we can numerically evaluate how good was our model predicted. For this we will use the `explained_variance_score` metric available in sklearn.metrics. This metric is calculated as 1-(Var(Y_real-Y_model)/Var(Y_real)) which means that the closer the value is to 1, the better our model.
+Finally, we can numerically evaluate how good was our model predicted. For this we will use the `explained_variance_score` metric available in sklearn.metrics. This metric is calculated as `$$1-(Var(Y_real-Y_model)/Var(Y_real))$$` which means that the closer the value is to 1, the better our model.
 
-32. Use the following command
+32. Use the following command to import the function that evaluates how good is our model
 
 ```python
 from sklearn.metrics import explained_variance_score
 ```
 
-to import the function that evaluates how good is our model.
-
 33. Calculate the explained variance score and print it using the following
 
 ```python