Cryptocurrency Returns Predictor

An Application of Random Forest!

Project Description

Introduction

Objective: Project for my internship at Research Center VERA, Ca' Foscari University of Venice.
Abstract: Use sentiment-based features to predict cryptocurrency returns. Models used: Random Forest Classifier, Random Forest Regressor, and VAR time-series model. Analysis timeframe: 28/11/2014 - 25/07/2020.
Status: Completed.

Methods Used

Random Forests (Regressor & Classifier)
Principal Component Analysis
Vector Autoregression (VAR) model
Sentiment Indicators (retrieved from my graduation thesis)

Dependencies

Python 3
numpy==1.18.5
pandas==1.0.5
scikit-learn==0.23.2
statsmodels==0.12.0
plotly==4.9.0

Interesting Results to Keep You Reading

Backtesting strategies based on 3 models:

Generate trading signals: go long when the predicted return > 0, go short when the predicted return < 0, and stay out of the market otherwise.
Test period (25% of the dataset): 05/03/2019 - 25/07/2020
The RF Classifier strategy significantly outperforms both of the other model-based strategies, as well as the simple buy-and-hold strategy.
Download the interactive version.
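The signal rule above can be sketched in a few lines. This is only an illustration, not the project's actual backtesting code: the function names and the sample numbers are hypothetical, and the sketch assumes no transaction costs and that each day's position earns that day's realised return.

```python
import numpy as np

def trading_signals(predicted_returns):
    """Map predicted returns to positions: +1 long, -1 short, 0 wait."""
    return np.sign(np.asarray(predicted_returns, dtype=float))

def strategy_returns(predicted_returns, actual_returns):
    """Daily strategy return = position * realised return (no costs assumed)."""
    signals = trading_signals(predicted_returns)
    return signals * np.asarray(actual_returns, dtype=float)

# Hypothetical numbers, for illustration only.
predicted = [0.012, -0.004, 0.0, 0.007]
actual = [0.020, 0.010, -0.030, -0.005]

daily = strategy_returns(predicted, actual)
mean_bps = daily.mean() * 1e4  # 1 basis point = 0.01%
print(f"Mean daily return: {mean_bps:.1f} bps")
```

A buy-and-hold benchmark under the same assumptions is simply the mean of `actual`.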

Getting Started

How to Run

Clone this repo:
git clone https://github.com/dang-trung/crypto-return-predictor
Create your environment (virtualenv):
virtualenv -p python3 venv
source venv/bin/activate (bash) or venv\Scripts\activate (Windows)
(venv) cd crypto-return-predictor
(venv) pip install -e .

Or (conda):
conda env create -f environment.yml
conda activate crypto-return-predictor
Run in terminal:
python -m crypto_return_predictor

Dependent Variable/Target

Cryptocurrency market returns, computed using the market index CRIX (retrieved here). For how the index is constructed, see Trimborn & Härdle (2018) or the authors' website.
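As a rough sketch of how a return series and the three classes used later in the confusion matrices (Negative, Unchanged, Positive) could be derived from index levels: the index values below are made up, and the use of simple rather than log returns is an assumption, since this README does not say which one the project uses.

```python
import pandas as pd

# Hypothetical index levels; the real CRIX series is not reproduced here.
crix = pd.Series(
    [1000.0, 1010.0, 1010.0, 999.9],
    index=pd.date_range("2020-01-01", periods=4, freq="D"),
)

# Simple daily returns (assumption: the project may use log returns instead).
returns = crix.pct_change().dropna()

def label(r):
    """Classify a return as Positive, Negative, or Unchanged."""
    if r > 0:
        return "Positive"
    if r < 0:
        return "Negative"
    return "Unchanged"

# Class labels that a classifier could be trained to predict.
labels = returns.map(label)
```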

Sentiment Measures

Sentiment scores of StockTwits messages, Reddit submissions, and Reddit comments
- Computed using dictionary-based sentiment analysis with the crypto-specific lexicon by Chen et al. (2019), retrieved from the lead author's personal page.
- StockTwits messages are retrieved through the StockTwits Public API; Reddit data are retrieved using the PushShift.io Reddit API.
Message volume of StockTwits messages, Reddit submissions, and Reddit comments
Market volatility index VCRIX (retrieved here; see Kolesnikova (2018) for how the index is constructed)
Market trading volume (retrieved using the Nomics Public API)

Read more on how I retrieve these sentiment measures in my graduation thesis or its GitHub repo.
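For readers unfamiliar with dictionary-based sentiment analysis, here is a minimal sketch of the idea. It is not the thesis code: the toy lexicon and its weights are invented (the Chen et al. (2019) lexicon is not reproduced here), and averaging the matched weights per message and then per day is an assumed aggregation scheme.

```python
import re

# Toy lexicon with made-up weights; NOT the Chen et al. (2019) lexicon.
LEXICON = {"moon": 1.0, "bullish": 0.8, "hodl": 0.5, "dump": -0.9, "scam": -1.0}

def sentiment_score(message, lexicon=LEXICON):
    """Average the lexicon weights of the tokens found in the message.

    Returns 0.0 when no token matches (an assumption of this sketch).
    """
    tokens = re.findall(r"[a-z']+", message.lower())
    weights = [lexicon[t] for t in tokens if t in lexicon]
    return sum(weights) / len(weights) if weights else 0.0

def daily_sentiment(messages):
    """Average message scores per day; `messages` holds (date, text) pairs."""
    by_day = {}
    for day, text in messages:
        by_day.setdefault(day, []).append(sentiment_score(text))
    return {day: sum(s) / len(s) for day, s in by_day.items()}

# Invented messages, for illustration only.
sample = [
    ("2020-07-24", "BTC to the moon, very bullish"),
    ("2020-07-24", "this looks like a scam"),
    ("2020-07-25", "nothing to report"),
]
scores = daily_sentiment(sample)
```

The message-volume measures would be the per-day message counts from the same grouping.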

Feature Selection

For the VAR model: lagged values of the first principal component of all 9 sentiment measures (up to 5 lags).
For the Random Forests: lagged values of the sentiment measures (up to 5 lags).
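Both feature sets can be sketched with pandas and NumPy. The helper names, column names, and random data below are placeholders, and standardising the measures before extracting the first principal component is an assumption. The component is computed here with a plain SVD; scikit-learn's PCA would be the more usual route.

```python
import numpy as np
import pandas as pd

def add_lags(df, max_lag=5):
    """Return a frame with columns `<name>_lag<k>` for k = 1..max_lag."""
    lagged = {
        f"{col}_lag{k}": df[col].shift(k)
        for col in df.columns
        for k in range(1, max_lag + 1)
    }
    # The first `max_lag` rows have incomplete lags and are dropped.
    return pd.DataFrame(lagged, index=df.index).dropna()

def first_principal_component(df):
    """Project standardised columns onto their first principal component."""
    z = (df - df.mean()) / df.std(ddof=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(z.to_numpy(), full_matrices=False)
    return pd.Series(z.to_numpy() @ vt[0], index=df.index, name="pc1")

# Random placeholder data standing in for the sentiment measures.
rng = np.random.default_rng(0)
measures = pd.DataFrame(
    rng.normal(size=(50, 3)),
    columns=["st_sentiment", "reddit_sentiment", "volume"],
    index=pd.date_range("2020-01-01", periods=50, freq="D"),
)

# Lagged measures as Random Forest inputs.
rf_features = add_lags(measures, max_lag=5)

# Lagged first principal component as the VAR input.
pc1 = first_principal_component(measures)
var_features = add_lags(pc1.to_frame(), max_lag=5)
```

In a real train/test split, the standardisation and the principal axis would normally be estimated on the training period only, to avoid look-ahead into the test period.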

Results (Test Period)

Ordered by backtested daily returns (highest to lowest):

Random Forest Classifier:

Accuracy: 61.86%
Confusion matrix:

                        Actual
Predicted     Negative  Unchanged  Positive
Negative           145          0        97
Unchanged            1          0         0
Positive            96          0       170

Backtesting daily returns: ~91bps

VAR(5):

Accuracy: 54.62%
Confusion matrix:

                        Actual
Predicted     Negative  Unchanged  Positive
Negative            57          0       185
Unchanged            0          0         1
Positive            45          0       221

Backtesting daily returns: ~48bps

Random Forest Regressor:

Accuracy: 56.19%
Confusion matrix:

                        Actual
Predicted     Negative  Unchanged  Positive
Negative           222          0        20
Unchanged            1          0         0
Positive           202          0        64

Backtesting daily returns: ~19bps (just slightly better than holding the CRIX index)
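The accuracy figures follow from the confusion matrices: correct predictions sit on the diagonal, so accuracy is the diagonal sum divided by the total count. A small sketch using the Random Forest Regressor matrix above (the function name is illustrative):

```python
def accuracy(matrix):
    """Share of observations on the diagonal of a square confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Random Forest Regressor matrix from the table above
# (class order: Negative, Unchanged, Positive).
rf_regressor = [
    [222, 0, 20],
    [1, 0, 0],
    [202, 0, 64],
]

acc = accuracy(rf_regressor)
print(f"{acc:.2%}")  # 286 correct out of 509 -> 56.19%
```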
