- Code the elbow method algorithm to find the best value for k. Use a range from 1 to 11. (5 points)
- Visually identify the optimal value for k by plotting a line chart of all the inertia values computed with the different values of k. (5 points)
- Answer the following question: What’s the best value for k? (5 points)
- Initialize the K-means model with four clusters by using the best value for k. (1 point)
- Fit the K-means model by using the original data. (1 point)
- Predict the clusters for grouping the cryptocurrencies by using the original data. Review the resulting array of cluster values. (3 points)
- Create a copy of the original data, and then add a new column of the predicted clusters. (1 point)
- Using pandas’ plot, create a scatter plot by setting x="price_change_percentage_24h" and y="price_change_percentage_7d". (4 points)
- Create a PCA model instance, and set n_components=3. (1 point)
- Use the PCA model to reduce the features to three principal components, then review the first five rows of the DataFrame. (2 points)
- Get the explained variance to determine how much information can be attributed to each principal component. (2 points)
- Answer the following question: What’s the total explained variance of the three principal components? (3 points)
- Create a new DataFrame with the PCA data. Be sure to set the coin_id index from the original DataFrame as the index for the new DataFrame. Review the resulting DataFrame. (2 points)
- Code the elbow method algorithm, and use the PCA data to find the best value for k. Use a range from 1 to 11. (2 points)
- Visually identify the optimal value for k by plotting a line chart of all the inertia values computed with the different values of k. (5 points)
- Answer the following questions: What’s the best value for k when using the PCA data? Does it differ from the best value for k that you found by using the original data? (3 points)
- Initialize the K-means model with four clusters by using the best value for k. (1 point)
- Fit the K-means model by using the PCA data. (1 point)
- Predict the clusters for grouping the cryptocurrencies by using the PCA data. Review the resulting array of cluster values. (3 points)
- Create a copy of the DataFrame with the PCA data, and then add a new column to store the predicted clusters. (1 point)
- Using pandas’ plot, create a scatter plot by setting x="PC1" and y="PC2". (4 points)
- Create a DataFrame that shows the weights of each feature (column) for each principal component by using the columns from the original scaled DataFrame as the index. (10 points)
- Answer the following question: Which features have the strongest positive or negative influence on each component? (5 points)
- Place imports at the top of the file, just after any module comments and docstrings, and before module globals and constants. (3 points)
- Name functions and variables with lowercase characters, with words separated by underscores. (2 points)
- Follow DRY (Don't Repeat Yourself) principles, creating maintainable and reusable code. (3 points)
- Use concise logic and creative engineering where possible. (2 points)
- Submit a link to a GitHub repository that’s cloned to your local machine and that contains your files. (4 points)
- Use the command line to add your files to the repository. (3 points)
- Include an appropriate commit message with each commit. (3 points)
- Comment the code well, with concise, relevant notes that other developers can understand. (10 points)
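
The clustering and PCA steps listed above can be sketched roughly as follows. This is a minimal illustration, not a reference solution: the data is synthetic, the `price_change_percentage_14d` and `price_change_percentage_30d` columns and the coin names are assumptions added so that three principal components can be computed, and the helper names (`elbow_inertias`, `add_clusters`, `plot_elbow_and_clusters`) are hypothetical. The value `k = 4` is taken from the rubric wording rather than derived from this data.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cryptocurrency data (assumption: the real columns differ).
rng = np.random.default_rng(0)
feature_names = [
    "price_change_percentage_24h",
    "price_change_percentage_7d",
    "price_change_percentage_14d",  # hypothetical extra feature
    "price_change_percentage_30d",  # hypothetical extra feature
]
centers = rng.normal(scale=5, size=(4, 4))
points = np.vstack([c + rng.normal(scale=0.5, size=(10, 4)) for c in centers])
coin_index = pd.Index([f"coin_{i}" for i in range(len(points))], name="coin_id")
raw_df = pd.DataFrame(points, columns=feature_names, index=coin_index)

# Scale the features so that each one contributes comparably to the distances.
scaled_df = pd.DataFrame(
    StandardScaler().fit_transform(raw_df), columns=feature_names, index=coin_index
)


def elbow_inertias(data, k_values=range(1, 11)):
    """Fit K-means for each k and return the inertia values as a DataFrame."""
    inertias = []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=0)
        model.fit(data)
        inertias.append(model.inertia_)
    return pd.DataFrame({"k": list(k_values), "inertia": inertias})


def add_clusters(data, k, column="cluster"):
    """Fit K-means, predict labels, and return a copy of data with a label column."""
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    model.fit(data)
    clustered = data.copy()
    clustered[column] = model.predict(data)
    return clustered


def plot_elbow_and_clusters(elbow_df, clustered, x, y):
    """Draw the elbow curve and a cluster scatter plot (requires matplotlib)."""
    elbow_df.plot.line(x="k", y="inertia", title="Elbow curve")
    clustered.plot.scatter(x=x, y=y, c="cluster", colormap="viridis")


# Elbow method and clustering on the original (scaled) data.
elbow_df = elbow_inertias(scaled_df)
clustered_df = add_clusters(scaled_df, k=4)

# PCA: reduce to three components and keep coin_id as the index.
pca = PCA(n_components=3)
pc_names = ["PC1", "PC2", "PC3"]
pca_df = pd.DataFrame(pca.fit_transform(scaled_df), columns=pc_names, index=coin_index)
total_explained_variance = pca.explained_variance_ratio_.sum()

# Elbow method and clustering on the PCA data, reusing the same helpers.
elbow_pca_df = elbow_inertias(pca_df)
clustered_pca_df = add_clusters(pca_df, k=4)

# Feature weights per component, indexed by the original feature names.
weights_df = pd.DataFrame(pca.components_.T, columns=pc_names, index=scaled_df.columns)
```

Calling `plot_elbow_and_clusters(elbow_df, clustered_df, "price_change_percentage_24h", "price_change_percentage_7d")` or `plot_elbow_and_clusters(elbow_pca_df, clustered_pca_df, "PC1", "PC2")` would produce the two chart types the rubric asks for. Reusing the same helpers for the original and PCA data is one way to address the DRY item, and `range(1, 11)` is read here as k = 1 through 10.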