dhuy237/sparkify

Sparkify

This repository contains my capstone project for the Udacity Data Scientist Nanodegree Program. In this project, I analyze data from Sparkify to predict customer churn.

Sparkify is a simulated dataset from a subscription-based company that provides a music service similar to Spotify or Apple Music. Predicting customer churn is a common and challenging task for data scientists and analysts, and it directly supports a company's business. Processing and analyzing large amounts of data with Spark is also a valuable skill in data-related fields.

🚀 Table of contents

  1. Prerequisites
  2. Project Motivation
  3. Instructions
  4. Results
  5. Acknowledgements

Prerequisites

This project uses the following library:

  • PySpark

Instructions

  1. Install PySpark
  2. Run the notebook Sparkify.ipynb
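Before opening the notebook, it can help to confirm that PySpark is importable. The snippet below is a minimal sketch, not part of the project: the helper names `pyspark_available` and `start_local_session` and the app name `sparkify-check` are made up for illustration. It assumes PySpark was installed with `pip install pyspark` and that a Java runtime is available, which Spark needs in order to start.

```python
import importlib.util


def pyspark_available() -> bool:
    """Return True if the pyspark package can be found in this environment."""
    return importlib.util.find_spec("pyspark") is not None


def start_local_session(app_name: str = "sparkify-check"):
    """Start a local Spark session (hypothetical helper; needs PySpark and Java)."""
    # Imported lazily so this module still loads when PySpark is missing.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master("local[*]")  # run Spark locally using all available cores
        .appName(app_name)
        .getOrCreate()
    )


if __name__ == "__main__":
    if pyspark_available():
        print("PySpark found; Sparkify.ipynb should be able to import it.")
    else:
        print("PySpark not found; install it with: pip install pyspark")
```

If the check passes, calling `start_local_session()` in a notebook cell and then printing `spark.version` on the returned session is one way to confirm that Spark itself starts.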

Results

The findings of this project have been published here.

Acknowledgements

This project uses data from Sparkify.

The code is inspired by the Udacity Data Scientist Nanodegree Program.

🔨 Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project.
  2. Create your Feature Branch (git checkout -b feature/Feature).
  3. Commit your Changes (git commit -m 'Add some feature').
  4. Push to the Branch (git push origin feature/Feature).
  5. Open a Pull Request.

📫 Contact
