This project aims to predict customer behavior in order to retain valuable customers for Orange Telecom. The goal is to analyze relevant customer data and develop focused customer retention programs. The predictive models are built using Orange Telecom's Churn Dataset, which consists of cleaned customer activity data and a churn label specifying whether a customer canceled the subscription.
Two datasets are provided for this project:
- churn-bigml-80.csv
  - Contains 2666 rows (customers) and 20 columns (19 features plus the churn label).
  - Intended for training and cross-validation purposes.
  - Used for developing machine learning models.
- churn-bigml-20.csv
  - Contains 667 rows (customers) and 20 columns (19 features plus the churn label).
  - Intended for final testing and model performance evaluation.
Both datasets share the same 20 columns: state, account length, area code, international plan, voice mail plan, number of voicemail messages, total day minutes, total day calls, total day charge, total evening minutes, total evening calls, total evening charge, total night minutes, total night calls, total night charge, total international minutes, total international calls, total international charge, customer service calls, and the churn label.
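As a quick sanity check, the row and column counts listed above can be verified before any modeling. The following is a minimal sketch using only the Python standard library. It assumes both CSV files sit in the current working directory and have a single header row; the helper name `csv_shape` is illustrative and not part of the project.

```python
import csv
from pathlib import Path

# Expected (rows, columns) per file, taken from the dataset description.
EXPECTED_SHAPES = {
    "churn-bigml-80.csv": (2666, 20),
    "churn-bigml-20.csv": (667, 20),
}


def csv_shape(path):
    """Return (data_rows, column_count, header) for a CSV with one header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        # Count non-empty data rows; blank lines are skipped.
        n_rows = sum(1 for row in reader if row)
    return n_rows, len(header), header


if __name__ == "__main__":
    for name, expected in EXPECTED_SHAPES.items():
        path = Path(name)  # assumption: files are in the working directory
        if not path.exists():
            print(f"{name}: not found, skipping")
            continue
        rows, cols, _ = csv_shape(path)
        status = "ok" if (rows, cols) == expected else f"expected {expected}"
        print(f"{name}: {rows} rows x {cols} columns ({status})")
```

If a file reports a different shape, it is worth checking that the download is complete before training anything on it.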
This project serves as an opportunity to explore predictive models for customer churn and gain insights into developing effective retention strategies. The analysis and models created here can contribute to the broader understanding of customer behavior in the telecom industry.
- Download Datasets:
  - Download the datasets from the Kaggle source: Telecom Churn Datasets.
- Explore the Notebooks:
  - Check out the provided Jupyter notebooks to understand the data exploration, preprocessing, and model development steps.
- Train and Evaluate Models:
  - Use the "churn-bigml-80" dataset for training and cross-validation.
  - Employ various machine learning models to predict customer churn.
  - Evaluate model performance on the "churn-bigml-20" dataset.
- Contribute and Learn:
  - Feel free to contribute to the project by enhancing existing models, exploring additional features, or suggesting improvements.
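The "Train and Evaluate Models" step above could look roughly like the sketch below. It assumes pandas and scikit-learn are installed, that both CSV files are in the working directory, and that the "Churn" column holds True/False values. The random forest, the one-hot encoding of non-numeric columns, and the F1 metric are illustrative choices and not necessarily what the notebooks use; the function names are hypothetical.

```python
from pathlib import Path

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

TARGET = "Churn"  # target column named in the dataset description


def split_xy(frame):
    """Separate features from the label; the label becomes 1 for churn, else 0."""
    features = frame.drop(columns=[TARGET])
    # Assumption: the label is stored as True/False (boolean or text).
    labels = frame[TARGET].astype(str).str.strip().str.lower().eq("true").astype(int)
    return features, labels


def build_pipeline(features):
    """One-hot encode non-numeric columns, then fit a random forest."""
    categorical = [
        col for col in features.columns
        if not pd.api.types.is_numeric_dtype(features[col])
    ]
    encode = ColumnTransformer(
        transformers=[("cat", OneHotEncoder(handle_unknown="ignore"), categorical)],
        remainder="passthrough",  # numeric columns are used as-is
    )
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    return Pipeline([("encode", encode), ("model", model)])


def train_and_evaluate(train, test, cv=5):
    """Cross-validate on the training frame, then score once on the test frame."""
    x_train, y_train = split_xy(train)
    x_test, y_test = split_xy(test)
    pipeline = build_pipeline(x_train)
    cv_f1 = cross_val_score(pipeline, x_train, y_train, cv=cv, scoring="f1")
    pipeline.fit(x_train, y_train)
    predictions = pipeline.predict(x_test)
    return {
        "cv_f1_mean": float(cv_f1.mean()),
        "test_accuracy": float(accuracy_score(y_test, predictions)),
        "test_f1": float(f1_score(y_test, predictions)),
    }


if __name__ == "__main__":
    train_path = Path("churn-bigml-80.csv")  # assumption: local copies exist
    test_path = Path("churn-bigml-20.csv")
    if train_path.exists() and test_path.exists():
        scores = train_and_evaluate(pd.read_csv(train_path), pd.read_csv(test_path))
        for name, value in scores.items():
            print(f"{name}: {value:.3f}")
    else:
        print("Dataset files not found; download them first.")
```

Keeping the encoder inside the pipeline means each cross-validation fold fits its own encoding, so the "churn-bigml-20" file is touched only once, for the final score.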
The datasets include various customer attributes, such as state, account length, and service usage details. The target variable is the "Churn" column, indicating whether a customer has canceled their subscription.
The datasets are sourced from Kaggle (Telecom Churn Datasets).
Feel free to explore, analyze, and contribute to the project! Your insights could play a crucial role in enhancing customer retention strategies for Orange Telecom.