
How to do a simple end-to-end machine learning classification project using the telco churn dataset

Azie88/Machine-Learning-Classification-Review

Machine-Learning-Classification-Review 🤷‍♂️

How to do a simple end-to-end machine learning classification project using the telco customer churn dataset.

In machine learning, classification is a supervised method that assigns data points to labels, or classes. Unlike regression, the target variable in a classification problem is discrete. Each data point used to train a classification model must have a corresponding label so that the model can learn the characteristics and patterns of each class. Classification can be binary (for example, deciding whether a given email is spam or not) or multi-class (for example, classifying a fruit as an orange, a mango or a banana).

This project is a binary classification problem.
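As a minimal sketch of the binary versus multi-class distinction described above, the snippet below counts the distinct labels in a target column. The function name `classification_type` is hypothetical and is not part of this repository.

```python
def classification_type(labels):
    """Return "binary" or "multi-class" based on the number of distinct labels.

    `classification_type` is a hypothetical helper used only for illustration.
    """
    n_classes = len(set(labels))
    if n_classes < 2:
        # A classifier needs at least two classes to tell apart.
        raise ValueError("a classification target needs at least two classes")
    return "binary" if n_classes == 2 else "multi-class"


# The examples from the text: spam detection and fruit classification.
email_kind = classification_type(["spam", "not spam", "spam"])
fruit_kind = classification_type(["orange", "mango", "banana", "mango"])
```

A churn target with only two outcomes (the customer left or stayed) falls into the binary case.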

Dataset 💾

Kaggle Link

Each row represents a customer, and each column contains one of the customer's attributes, as described in the column metadata.

The data set includes information about:

  • Customers who left within the last month – the column is called Churn
  • Services that each customer has signed up for – phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies
  • Customer account information – how long they’ve been a customer, contract, payment method, paperless billing, monthly charges, and total charges
  • Demographic info about customers – gender, age range, and if they have partners and dependents
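To illustrate the one-row-per-customer layout, here is a small sketch that reads CSV data with the standard library and counts the values in the Churn column. The rows are made up for illustration, and every column name other than Churn, as well as the Yes/No encoding, is an assumption; check the actual file for the real schema.

```python
import csv
import io
from collections import Counter

# Made-up rows for illustration only; they are NOT taken from the real dataset.
# Column names other than "Churn" and the Yes/No values are assumptions.
SAMPLE_CSV = """customerID,Contract,MonthlyCharges,Churn
0001,Month-to-month,70.5,Yes
0002,Two year,20.0,No
0003,One year,55.25,No
0004,Month-to-month,89.9,Yes
"""


def churn_counts(csv_file):
    """Count how many customers fall into each Churn class."""
    reader = csv.DictReader(csv_file)
    return Counter(row["Churn"] for row in reader)


counts = churn_counts(io.StringIO(SAMPLE_CSV))
# Share of customers in the sample who churned.
churn_rate = counts["Yes"] / sum(counts.values())
```

To run the same count on the downloaded file, pass an opened file object instead of the `io.StringIO` wrapper. Checking the class balance this way is a common first step before training a classifier.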

How to Use The Repository

You need Python 3 installed on your system. Clone this repo, then run the remaining commands from the repo's root (repository_name> ...).

  1. Clone this repository: git clone https://github.com/Azie88/Machine-Learning-Classification-Review
  2. In your IDE, create a virtual environment and install the required packages for the project:
  • Windows:

      python -m venv venv; 
      venv\Scripts\activate; 
      python -m pip install -q --upgrade pip; 
      python -m pip install -qr requirements.txt  
    
  • Linux & macOS:

      python3 -m venv venv; 
      source venv/bin/activate; 
      python -m pip install -q --upgrade pip; 
      python -m pip install -qr requirements.txt  
    

The two long command lines have the same structure. They chain several commands with the ; separator, which runs them one after the other (on Windows this works in PowerShell but not in cmd.exe), and you can also run each command manually in turn. In order, they:

  • Create a Python virtual environment that isolates the project's required libraries to avoid conflicts;
  • Activate the virtual environment so that the Python kernel and libraries are those of the isolated environment;
  • Upgrade pip, the package manager, to an up-to-date version;
  • Install the libraries/packages listed in the requirements.txt file so that they can be imported into the Python script and notebook without issues.

NB: macOS users, please install Xcode if you run into an issue.
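If you are unsure whether the activation step worked, one way to check is to compare `sys.prefix` with `sys.base_prefix` from inside Python. This is a small sketch, not part of the repository; the helper name `in_virtualenv` is hypothetical.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv.

    In an environment created with `python -m venv`, sys.prefix points at the
    environment directory while sys.base_prefix points at the base Python
    installation, so the two differ.
    """
    return sys.prefix != sys.base_prefix


active = in_virtualenv()
print("Virtual environment active:", active)
print("Interpreter prefix:", sys.prefix)
```

If the check prints `False`, rerun the activate command for your platform before installing the requirements.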

  3. Explore the Jupyter notebook for detailed steps and code execution.

Author ✍️

Andrew Obando

Andrew Obando | LinkedIn Medium


Feel free to star ⭐ this repository if you find it helpful!
