
Spotify Analysis

An exploration of my personal Spotify listening habits

Table of Contents

Description
Installation
Usage
Roadmap
License

Description

This project explores my personal Spotify listening habits. It includes a data pipeline that extracts data from the Genius and Spotify APIs for further analysis. Raw and intermediate data are stored in an S3 data lake, where they are further processed and loaded.

Installation

  1. Clone this repository.
git clone https://github.com/vatdaell/spotify-analysis.git
  2. Install all the Python packages.
pip install -r src/ETL/requirements.txt -r src/ReportGenerator/requirements.txt
  3. Set up an AWS account with an S3 bucket.

  4. Create a Spotify developer account and create an application.

  5. Create a Genius account and generate a client token.

  6. Set up a MySQL database.

  7. Create a .env file in the project directory and fill in the details:

S3_BUCKET=bucket_name
SPOTIPY_CLIENT_ID=clientid
SPOTIPY_CLIENT_SECRET=secret
SPOTIPY_REDIRECT_URI=redirect_uri
TABLE_NAME=recent_plays_table_name
GENIUS_ACCESS_TOKEN=genius_access_token
MYSQL_HOST=mysql_host
MYSQL_PORT=port
MYSQL_USER=user
MYSQL_PASS=password
MYSQL_DB=dbname
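
Before running the pipelines, it can help to confirm that every key listed above is actually set. The following is a minimal sketch using only the Python standard library. The helper names `parse_env`, `missing_keys`, and `check_env_file` are hypothetical and not part of this repository, and the project itself may load the file differently (for example with python-dotenv).

```python
from pathlib import Path

# Keys listed in the .env template above.
REQUIRED_KEYS = [
    "S3_BUCKET",
    "SPOTIPY_CLIENT_ID",
    "SPOTIPY_CLIENT_SECRET",
    "SPOTIPY_REDIRECT_URI",
    "TABLE_NAME",
    "GENIUS_ACCESS_TOKEN",
    "MYSQL_HOST",
    "MYSQL_PORT",
    "MYSQL_USER",
    "MYSQL_PASS",
    "MYSQL_DB",
]


def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


def check_env_file(path=".env"):
    """Read the .env file at `path` and return any missing keys (hypothetical helper)."""
    return missing_keys(parse_env(Path(path).read_text()))
```

Calling `check_env_file()` from the project directory returns an empty list when all eleven keys have values.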

Usage

To load the songs you have listened to into the S3 bucket, load the song data into the MySQL database, and load the recently played data into the MySQL database, run:

python src/etl/songs_pipeline.py
python src/etl/recently_played_pipeline.py
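
To give a sense of the data the recently played pipeline works with, here is a minimal sketch that fetches recent plays with spotipy and flattens them into rows. It is an assumption about the approach, not the repository's actual code: the function names and the chosen columns are hypothetical. spotipy reads the `SPOTIPY_*` variables from the process environment, so they need to be exported or loaded from the .env file first.

```python
def flatten_recent_plays(response):
    """Turn a recently-played API response into flat row dictionaries."""
    rows = []
    for item in response.get("items", []):
        track = item["track"]
        rows.append(
            {
                "track_id": track["id"],
                "track_name": track["name"],
                # A track can have several artists; join their names.
                "artists": ", ".join(a["name"] for a in track["artists"]),
                "played_at": item["played_at"],
            }
        )
    return rows


def fetch_recent_plays(limit=50):
    """Fetch the current user's recent plays (needs network and credentials)."""
    # Imported lazily so the flattening helper works without spotipy installed.
    import spotipy
    from spotipy.oauth2 import SpotifyOAuth

    auth = SpotifyOAuth(scope="user-read-recently-played")
    client = spotipy.Spotify(auth_manager=auth)
    return flatten_recent_plays(client.current_user_recently_played(limit=limit))
```

Keeping the flattening step separate from the API call makes it easy to test against a saved response, such as a raw file stored in the S3 bucket.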

Roadmap

Some interesting features I want to implement or analyze in the future:

  • Use a task scheduler to automate ETL tasks
  • Extract lyrics of recently listened songs for sentiment analysis
  • Recommend similar songs based on listening history
  • Link to merch store for top bands
  • Analyze the genres of music listened to
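
For the first roadmap item, one possible starting point is a small runner script that a scheduler such as cron could invoke. This is only a sketch: the `run_pipelines` helper is hypothetical, and the script paths are copied from the Usage section.

```python
import subprocess
import sys

# Script paths as given in the Usage section.
PIPELINES = [
    "src/etl/songs_pipeline.py",
    "src/etl/recently_played_pipeline.py",
]


def run_pipelines(scripts=PIPELINES, run=subprocess.run):
    """Run each script in order and stop at the first failure.

    Returns the list of scripts that completed successfully.
    """
    completed = []
    for script in scripts:
        # Use the current interpreter so the same environment is reused.
        result = run([sys.executable, script])
        if result.returncode != 0:
            break
        completed.append(script)
    return completed
```

Stopping at the first failure avoids running a later pipeline after an earlier one has failed. The `run` parameter exists so the function can be exercised without launching real processes.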

License

MIT
