Machine Learning Project: A classification problem using K Nearest Neighbor (KNN), Decision Tree, Support Vector Machine, and Logistic Regression algorithms

zekaouinoureddine/ML-Project-Classification-With-Python

ML-Project-Classification-With-Python

Overview

In this mini project, we practice the classification algorithms covered in the Machine Learning With Python course. We load a dataset with the Pandas library, apply each of the algorithms listed below, and compare them using several evaluation metrics to find the best one for this specific dataset.


About Dataset

This dataset is about past loans. The Loan_train.csv data set includes details of 346 customers whose loans are already paid off or defaulted. It includes the following fields:

| Field | Description |
| --- | --- |
| Loan_status | Whether a loan is paid off or in collection |
| Principal | Basic principal loan amount at the |
| Terms | Origination terms, which can be a weekly (7 days), biweekly, or monthly payoff schedule |
| Effective_date | When the loan was originated and took effect |
| Due_date | Since it’s a one-time payoff schedule, each loan has a single due date |
| Age | Age of applicant |
| Education | Education of applicant |
| Gender | Gender of applicant |

You can download the dataset Loan_train.csv by clicking here
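
As a rough illustration of the loading step, the sketch below reads the CSV with Pandas and parses the two date fields. The helper name `load_loans` is hypothetical, and the column headers are assumed to match the field names in the table above; the actual file may use different capitalization or contain extra columns.

```python
import pandas as pd

# Assumed header names, taken from the field table above.
DATE_COLUMNS = ["Effective_date", "Due_date"]


def load_loans(source):
    """Read the loan CSV from a path or file-like object.

    Date columns are converted to datetime when they are present;
    all other columns are left as Pandas inferred them.
    """
    df = pd.read_csv(source)
    for col in DATE_COLUMNS:
        if col in df.columns:
            df[col] = pd.to_datetime(df[col])
    return df


# Typical use, assuming Loan_train.csv sits in the working directory:
# df = load_loans("Loan_train.csv")
# print(df["Loan_status"].value_counts())
```

Parsing the dates up front makes it easy to derive features such as the day of the week from `Effective_date` later on.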


Algorithms and Technologies

We used:

  • Python (Pandas, seaborn, matplotlib, numpy) and the amazing ML library Scikit-learn
  • IBM Cloud (Watson Studio, Jupyter Notebook)

To build our models, we use the following algorithms:

  • K Nearest Neighbor (KNN)
  • Decision Tree
  • Support Vector Machine
  • Logistic Regression
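
The four algorithms above can be sketched with Scikit-learn roughly as follows. This is a minimal sketch, not the notebook's actual code: the hyperparameters (`n_neighbors=5`, `max_depth=4`, the RBF kernel, `C=1.0`) are assumptions, the helper names `build_models` and `fit_all` are hypothetical, and the data is synthetic stand-in data from `make_classification` rather than the loan dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_models():
    """Return one unfitted estimator per algorithm (hyperparameters are assumptions)."""
    return {
        # Distance- and margin-based models get standardized inputs.
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        "Decision Tree": DecisionTreeClassifier(max_depth=4, random_state=0),
        "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(C=1.0)),
    }


def fit_all(X_train, y_train):
    """Fit every model on the same training split and return them by name."""
    models = build_models()
    for model in models.values():
        model.fit(X_train, y_train)
    return models


# Synthetic stand-in for the preprocessed loan features and labels.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = fit_all(X_train, y_train)
predictions = {name: model.predict(X_test) for name, model in models.items()}
```

Keeping the models in one dictionary means the same evaluation code can loop over all of them.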

Evaluation

The report below shows the accuracies of all built models using different evaluation metrics:

| Algorithm | Jaccard | F1-score | LogLoss |
| --- | --- | --- | --- |
| KNN | 0.67 | 0.63 | NA |
| Decision Tree | 0.76 | 0.77 | NA |
| SVM | 0.80 | 0.76 | NA |
| Logistic Regression | 0.74 | 0.70 | 0.67 |
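
The three metrics in the table could be computed with `sklearn.metrics` along these lines. This is a sketch under assumptions: the helper name `evaluate` is hypothetical, labels are assumed to be encoded as 0/1, and the weighted F1 average is a guess at the setting used. Log loss needs predicted class probabilities rather than hard labels, so it is only filled in when probabilities are passed.

```python
from sklearn.metrics import f1_score, jaccard_score, log_loss


def evaluate(y_true, y_pred, y_proba=None, pos_label=1):
    """Return Jaccard, weighted F1 and (optionally) log loss for one model.

    y_proba is the array of predicted class probabilities, for example the
    output of predict_proba; leave it as None for models without one.
    """
    scores = {
        "Jaccard": jaccard_score(y_true, y_pred, pos_label=pos_label),
        "F1-score": f1_score(y_true, y_pred, average="weighted"),
        "LogLoss": None,
    }
    if y_proba is not None:
        scores["LogLoss"] = log_loss(y_true, y_proba)
    return scores


# Tiny hand-made example: one of the two positives is missed.
example = evaluate([1, 1, 0, 0], [1, 0, 0, 0])
print(example)
```

Older Scikit-learn releases exposed a different Jaccard helper, so scores computed with this sketch may not match the table exactly.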
