Customer Churn Prediction

Maintaining customer retention is crucial as it fosters brand loyalty, reducing the need for continuous marketing efforts and generating a consistent revenue stream. Satisfied and loyal customers not only contribute to sustained profitability but also serve as advocates who can positively influence new customer acquisition through word-of-mouth recommendations.

Background

In this project, we are given dataset of customer profile. We are asked to analyze behaviour of churned customer and build machine learning model to predict whether a customer will churn or not based on the data available.

Data

The dataset can be accessed through Kaggle. The dataset contains 12 features: CustomerID, Age, Gender, Tenure, Usage Frequency, Support Calls, Payment Delay, Subscription Type, Contract Length, Total Spend, Last Interaction and Churn. This dataset have 505.256 rows of data, separated into two file (training data with 440.882 rows of data and testing data with 64.374 rows of data).

Process

The notebook is structured this way:

Business Understanding: Explain the importance of customer churn prediction
Data Understanding: Provide an explanation of the dataset used, data sources, and features.
Data Pre-Processing: Perform initial data inspection as an overview of the data to be used, perform initial data cleaning, divide the data into train-test data and train-validation-test data for modeling purposes, check for missing values and duplicate data.
Exploratory Data Analysis: Perform visualization and analysis of the dataset.
Feature Analysis: Looking at the correlation between features
Modeling: Build several Machine Learning baseline models (Logistic Regression, Random Forest, and XGBoost) for customer churn prediction & measure their performance.
Modeling with Oversampling: Building Machine Learning models using oversampling method (SMOTE).
Key Insight & Recommendation: Key insights and results of Machine Learning modeling for customer churn prediction and recommendations for actions that can be taken to reduce and prevent customer churn.

Highlighted Key Findings

A feature which is the strongest predictor of customer churn: Support Calls. This can refer to bad customer service (customer with more Support Calls tend to churn) or bad service. The company have to investigate to understand why there are many customers with very high Support Calls that leads to churn and address the issue immediately.

Boxplot of Customers' Support Calls, categorized by Churn

The distribution plot of Customers' Support Calls. It can be seen that there are many churned customers with very high Support Calls.

3 Machine Learning methods used (Logistic Regression, Random Forest, and XGBoost) show great performance on predicting churned customer.
- Logistic Regression have 98% recall (standard) and 97% recall (oversampling using SMOTE)
- Random Forest have 100% recall (standard) and 100% recall (oversampling using SMOTE)
- Logistic Regression have 100% recall (standard) and 100% recall (oversampling using SMOTE)
It means all three models predict over 95 churned customer out of 100 predicted churned customer. The company can use one of the models to predict churned customers.

For comprehensive process & results, please check the script provided.

Interested to see presentation version of this project? You can view the presentation version here in my Linkedin!

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
Customer_Churn_Prediction_Script.ipynb		Customer_Churn_Prediction_Script.ipynb
README.md		README.md
customer_churn_dataset-testing-master.csv		customer_churn_dataset-testing-master.csv
customer_churn_dataset-training-master.csv		customer_churn_dataset-training-master.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Customer Churn Prediction

Background

Data

Process

Highlighted Key Findings

About

Releases

Packages

Languages

faisalghifari17/customer-churn-prediction

Folders and files

Latest commit

History

Repository files navigation

Customer Churn Prediction

Background

Data

Process

Highlighted Key Findings

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages