This project focuses on developing Machine Learning (ML) models to predict loan eligibility, which is vital in accelerating the decision-making process and determining if an applicant gets a loan or not.

emmanguyen0602/Bank-Loan-Eligibility-Prediction

💰 Bank Loan Eligibility Prediction

Introduction

The loan approval process is a challenging task for any financial institution. Before giving credit loans to borrowers, the bank decides whether the borrower is bad (defaulter) or good (non-defaulter). This project focuses on developing Machine Learning (ML) models to predict loan eligibility, which is vital in accelerating the decision-making process and determining if an applicant gets a loan or not.

Dream Housing Finance company deals in all home loans. They have a presence across all urban, semi-urban, and rural areas. A customer first applies for a home loan, after which the company validates the customer's eligibility for it.

The company wants to automate the loan eligibility process in real time, based on the customer details provided in the online application form. These details include Gender, Marital Status, Education, Number of Dependents, Income, Loan Amount, Credit History, and others. To automate this process, the company has posed the problem of identifying the customer segments that are eligible for a loan amount, so that it can specifically target these customers.

Objectives

  • Analyze the customer data provided in the dataset (exploratory data analysis, EDA)

  • Build various ML models that can predict loan approval

Tools

| Task | Technique | Tools/Packages Used |
| --- | --- | --- |
| Data Collection | Using a dataset available on Kaggle | |
| Data Cleaning | Drop unwanted columns, add new columns, handle missing values | pandas |
| Data Visualization | Multi-attribute plots | matplotlib, seaborn |
| Data Preprocessing | Feature encoding, feature engineering (handling outliers and imbalanced data), feature scaling (normalization) | sklearn (LabelEncoder, MinMaxScaler), imblearn (SMOTE), pandas (get_dummies), numpy (log) |
| Data Modeling | Supervised machine learning models: Logistic Regression and Random Forest | sklearn |
| Environments & Platforms | | Jupyter Notebook, Kaggle |
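The preprocessing steps listed above (missing-value handling, log transform, encoding, and min-max scaling) could be chained roughly as follows. This is a minimal sketch, not the project's actual notebook code: the toy rows, the column names, and the `preprocess` helper are assumptions made for illustration, and the median fill is just one common choice. SMOTE resampling is left out here to keep the example small.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder, MinMaxScaler

# Toy rows with hypothetical column names; this is NOT the project's data.
toy = pd.DataFrame({
    "Education": ["Graduate", "Not Graduate", "Graduate", "Graduate"],
    "Property_Area": ["Urban", "Rural", "Semiurban", "Urban"],
    "ApplicantIncome": [5000.0, 3000.0, 81000.0, 2500.0],
    "LoanAmount": [130.0, np.nan, 360.0, 95.0],
    "Loan_Status": ["Y", "N", "Y", "N"],
})


def preprocess(df):
    """Illustrative pipeline: impute, log-transform, encode, scale."""
    out = df.copy()
    numeric_cols = ["ApplicantIncome", "LoanAmount"]

    # Fill missing loan amounts with the column median (an assumed strategy).
    out["LoanAmount"] = out["LoanAmount"].fillna(out["LoanAmount"].median())

    # Log-transform skewed monetary columns to dampen the effect of outliers.
    out[numeric_cols] = np.log(out[numeric_cols])

    # Encode the binary target as integers (classes are sorted: N -> 0, Y -> 1).
    out["Loan_Status"] = LabelEncoder().fit_transform(out["Loan_Status"])

    # One-hot encode the categorical features.
    out = pd.get_dummies(out, columns=["Education", "Property_Area"])

    # Rescale the numeric columns to the [0, 1] range.
    out[numeric_cols] = MinMaxScaler().fit_transform(out[numeric_cols])
    return out


processed = preprocess(toy)
```

In a real workflow the scaler would be fitted on the training split only and then applied to the test split, so that no information leaks from the test data.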

Results

Below are some key insights that were generated as a result of exploratory data analysis (EDA).

  • Applicants with a higher salary have a greater chance of loan approval.
  • Graduates have a better chance of loan approval.
  • Married applicants have an advantage over unmarried applicants for loan approval.
  • Applicants with fewer dependents have a higher probability of loan approval.
  • The smaller the loan amount, the higher the chance of getting the loan.
  • A better credit history gives a higher chance of loan approval.
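Insights like the ones above typically come from comparing approval rates across groups. The sketch below shows one way to compute such a rate with pandas; the toy rows, the column names, and the `approval_rate_by` helper are hypothetical and are not taken from the project's notebook.

```python
import pandas as pd

# Toy rows with hypothetical column names; this is NOT the project's data.
toy = pd.DataFrame({
    "Credit_History": [1, 1, 0, 1, 0, 1],
    "Loan_Status": ["Y", "Y", "N", "N", "N", "Y"],
})


def approval_rate_by(df, column, target="Loan_Status", positive="Y"):
    """Return the share of approved applications for each value of `column`."""
    approved = df[target].eq(positive)          # boolean Series: True if approved
    return approved.groupby(df[column]).mean()  # mean of booleans = approval rate


rates = approval_rate_by(toy, "Credit_History")
```

The same helper could be pointed at any categorical column (for example education or marital status) to compare groups side by side.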

Below are the machine learning models used for predicting whether a bank loan is approved or not.

| Machine Learning Model | Accuracy | Precision | Recall | AUC Score |
| --- | --- | --- | --- | --- |
| 1. Logistic Regression | 0.83 | 0.81 | 0.98 | 0.72 |
| 2. Random Forest Classifier | 0.79 | 0.81 | 0.92 | 0.75 |
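As a rough illustration of how the four metrics in this table can be obtained, the sketch below trains both model types on synthetic data and scores them with scikit-learn. Everything specific here is an assumption: the data comes from `make_classification` rather than the loan dataset, the hyperparameters are arbitrary, the `evaluate` helper is hypothetical, and AUC is computed from predicted probabilities, which may differ from how the project computed it. The scores it produces are unrelated to the results reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data standing in for the loan dataset.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The two model families named in the table; hyperparameters are assumptions.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest Classifier": RandomForestClassifier(
        n_estimators=100, random_state=0
    ),
}


def evaluate(model):
    """Fit on the training split and score on the held-out test split."""
    model.fit(X_train, y_train)
    predicted = model.predict(X_test)
    # Probability of the positive class, used for the AUC score.
    positive_proba = model.predict_proba(X_test)[:, 1]
    return {
        "accuracy": accuracy_score(y_test, predicted),
        "precision": precision_score(y_test, predicted),
        "recall": recall_score(y_test, predicted),
        "auc": roc_auc_score(y_test, positive_proba),
    }


results = {name: evaluate(model) for name, model in models.items()}
for name, scores in results.items():
    print(name, {metric: round(value, 2) for metric, value in scores.items()})
```

Reporting precision, recall, and AUC alongside accuracy matters when the classes are imbalanced, since accuracy alone can look good for a model that mostly predicts the majority class.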
