
Analyze a dataset of cryptocurrencies to uncover hidden trends that can lead to great investment opportunities. Used unsupervised algorithms. Worked primarily with K-means algorithm.


g626s/Cryptocurrencies

Cryptocurrencies

Analysis for clients who are preparing to get into the cryptocurrency market.

Resources:

Dataset:
Software and IDE:
  • Python
  • Jupyter Notebook
  • Libraries:
    • Pandas
    • Sklearn
    • hvPlot
  • Unsupervised Machine Learning

Project Overview

Accountability Accounting, a prominent investment bank, is interested in offering a new cryptocurrency investment portfolio to its customers. Because the cryptocurrency market is highly saturated and volatile, we created a report that covers which cryptocurrencies are on the trading market and how they could be grouped to create a classification system for this new investment. The initial data was not ideal, so it had to be processed to fit the machine learning models. Since there is no known output, we used unsupervised learning. To group the cryptocurrencies, we applied a clustering algorithm. Lastly, we used data visualizations to share our findings.

The technical analysis for this project consisted of four steps:

  • Preprocessing the Data for PCA
  • Reducing Data Dimensions Using PCA
  • Clustering Cryptocurrencies Using K-means
  • Visualizing Cryptocurrencies Results

Results and Methods

Preprocessing the Data for PCA

Using Pandas, we preprocessed the dataset so that PCA could be performed on it. The crypto_data.csv file was retrieved from CryptoCompare. We kept only the cryptocurrencies that are being traded, then dropped the IsTrading column and removed every row with at least one null value. Next, we filtered the crypto_df DataFrame so that it only has rows where coins have been mined. After filtering, we created a new DataFrame that holds only the cryptocurrency names and uses the crypto_df index as its index. A crucial step was then to remove the CoinName column from crypto_df, since it is not used by the clustering algorithm. We used the get_dummies() method to create variables for the two text features, Algorithm and ProofType, and stored the resulting data in a new DataFrame named X. Lastly, we used the StandardScaler fit_transform() function to standardize the features in X.
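The preprocessing steps above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the small DataFrame is made-up stand-in data for crypto_data.csv, and the TotalCoinsMined and TotalCoinSupply column names, the names_df variable, and the X_scaled variable are assumptions introduced for the example.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Made-up stand-in for crypto_data.csv (values are illustrative only).
# TotalCoinsMined / TotalCoinSupply are assumed column names.
crypto_df = pd.DataFrame(
    {
        "CoinName": ["AlphaCoin", "BetaCoin", "GammaCoin", "DeltaCoin",
                     "EpsilonCoin", "ZetaCoin", "EtaCoin"],
        "Algorithm": ["SHA-256", "Scrypt", "SHA-256", "X11",
                      "Scrypt", "X11", "Scrypt"],
        "IsTrading": [True, True, False, True, True, True, True],
        "ProofType": ["PoW", "PoS", "PoW", "PoW/PoS", "PoW", "PoW", "PoW/PoS"],
        "TotalCoinsMined": [1.0e6, 5.0e5, 2.0e6, float("nan"), 0.0, 3.0e6, 7.5e5],
        "TotalCoinSupply": [2.1e7, 1.0e6, 5.0e6, 1.0e7, 1.0e6, 8.0e6, 2.0e6],
    },
    index=["ALP", "BET", "GAM", "DEL", "EPS", "ZET", "ETA"],
)

# Keep only coins that are being traded, then drop the now-constant column.
crypto_df = crypto_df[crypto_df["IsTrading"]].drop(columns="IsTrading")

# Remove rows with at least one null value.
crypto_df = crypto_df.dropna()

# Keep only coins that have actually been mined.
crypto_df = crypto_df[crypto_df["TotalCoinsMined"] > 0]

# Store the coin names separately, sharing the crypto_df index.
names_df = crypto_df[["CoinName"]]

# CoinName is a label, not a feature, so drop it before clustering.
crypto_df = crypto_df.drop(columns="CoinName")

# One-hot encode the two text features; dtype=float keeps X fully numeric.
X = pd.get_dummies(crypto_df, columns=["Algorithm", "ProofType"], dtype=float)

# Standardize every feature to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)
```

Keeping the names in a separate DataFrame with the same index makes it possible to join them back onto the clustering results later.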


Reducing Data Dimensions Using PCA

Using the Principal Component Analysis (PCA) algorithm, we reduced the dimensions of the X DataFrame to three principal components and placed these components in a new DataFrame.
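A minimal sketch of this reduction step is shown below. It runs on random stand-in data rather than the real scaled features, and the column labels "PC 1" to "PC 3" and the index values are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Random stand-in for the standardized feature matrix (20 coins, 6 features).
rng = np.random.default_rng(0)
index = [f"COIN{i}" for i in range(20)]  # hypothetical coin tickers
X_scaled = rng.normal(size=(20, 6))

# Project the features onto the three directions of greatest variance.
pca = PCA(n_components=3)
principal_components = pca.fit_transform(X_scaled)

# Keep the original index so rows can be matched back to coin names.
pcs_df = pd.DataFrame(
    principal_components,
    columns=["PC 1", "PC 2", "PC 3"],  # assumed column labels
    index=index,
)

# Fraction of the total variance retained by the three components.
explained = pca.explained_variance_ratio_.sum()
```

Checking `explained` is a quick way to judge how much information three components preserve before clustering on them.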


Clustering Cryptocurrencies Using K-means

Using our knowledge of the K-means algorithm, we created an elbow curve using hvPlot to find the best value for K from the pcs_df DataFrame that was created previously. Then, we ran the K-means algorithm to predict the K clusters for the cryptocurrencies’ data.
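The elbow search and the final clustering could look roughly like the sketch below. The data is synthetic (generated with make_blobs), the range of 1 to 10 for K and the final choice of K = 4 are assumptions for illustration, since the text does not state which value was selected, and the hvPlot call is left as a comment so the example runs without hvPlot installed.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for pcs_df: 60 points in 3 dimensions.
points, _ = make_blobs(n_samples=60, n_features=3, centers=4, random_state=0)
pcs_df = pd.DataFrame(points, columns=["PC 1", "PC 2", "PC 3"])

# Fit K-means for each candidate K and record the inertia
# (sum of squared distances of points to their nearest centroid).
k_values = list(range(1, 11))
inertias = []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    model.fit(pcs_df)
    inertias.append(model.inertia_)

elbow_df = pd.DataFrame({"k": k_values, "inertia": inertias})

# With hvPlot installed, the elbow curve can be drawn like this:
#   import hvplot.pandas
#   elbow_df.hvplot.line(x="k", y="inertia", xticks=k_values, title="Elbow Curve")

# Assumed K for this synthetic data; read the real value off the elbow curve.
best_k = 4
model = KMeans(n_clusters=best_k, n_init=10, random_state=0)
predictions = model.fit_predict(pcs_df)
```

The "elbow" is the value of K after which inertia stops dropping sharply; adding clusters beyond that point gives diminishing returns.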


Visualizing Cryptocurrencies Results

Using scatter plots built with Plotly Express and hvPlot, we visualized the distinct groups along the three principal components we created previously. We then created a table with all the currently tradable cryptocurrencies using the hvplot.table() function.
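One way to prepare and plot the results is sketched below. The data is made up, and the clustered_df name, the "Class" column, the "PC 1" to "PC 3" labels, and both helper functions are hypothetical names introduced for the example. The plotting libraries are imported inside the helpers so the data-assembly part runs even without Plotly or hvPlot installed.

```python
import pandas as pd

# Made-up inputs standing in for the outputs of the earlier steps.
index = ["ALP", "BET", "ZET", "ETA"]
crypto_df = pd.DataFrame(
    {"Algorithm": ["SHA-256", "Scrypt", "X11", "Scrypt"],
     "ProofType": ["PoW", "PoS", "PoW", "PoW/PoS"]},
    index=index,
)
pcs_df = pd.DataFrame(
    [[0.1, -1.2, 0.3], [1.4, 0.2, -0.5], [-0.9, 0.8, 0.7], [-0.6, 0.2, -0.5]],
    columns=["PC 1", "PC 2", "PC 3"],
    index=index,
)
names_df = pd.DataFrame(
    {"CoinName": ["AlphaCoin", "BetaCoin", "ZetaCoin", "EtaCoin"]}, index=index
)
predictions = [0, 1, 0, 2]  # illustrative cluster labels

# Join features, principal components, and names on the shared index,
# then attach the predicted cluster as a "Class" column.
clustered_df = pd.concat([crypto_df, pcs_df, names_df], axis=1)
clustered_df["Class"] = predictions


def plot_clusters_3d(df):
    """3D scatter of the principal components, colored by cluster."""
    import plotly.express as px  # optional dependency

    return px.scatter_3d(
        df, x="PC 1", y="PC 2", z="PC 3",
        color="Class", symbol="Class",
        hover_name="CoinName", hover_data=["Algorithm"],
    )


def coin_table(df):
    """Sortable table of the tradable coins and their cluster."""
    import hvplot.pandas  # noqa: F401  (registers the .hvplot accessor)

    return df.hvplot.table(
        columns=["CoinName", "Algorithm", "ProofType", "Class"],
        sortable=True, selectable=True,
    )
```

In a notebook, calling `plot_clusters_3d(clustered_df).show()` or displaying `coin_table(clustered_df)` as the last expression of a cell would render the figure and the table.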

