Transaction Data for Fraud Analysis

This repository contains code and data for fraud analysis using several machine learning algorithms. The dataset is available on Kaggle. The analysis applies the following models:

  • Logistic Regression
  • Naive Bayes
  • Decision Tree
  • Random Forest
  • XGBoost
  • Support Vector Machine

Data Description

The dataset consists of transactional data whose features are used to detect fraudulent activity. The dataset can be found here.

Dependencies

Make sure you have the following dependencies installed:

  • Python 3
  • Jupyter Notebook
  • Pandas
  • NumPy
  • Matplotlib
  • Seaborn
  • Scikit-learn
  • XGBoost

Analysis

Models Used

  1. Logistic Regression: Utilized for binary classification of fraudulent and non-fraudulent transactions.
  2. Naive Bayes: Employed for probabilistic classification, assuming independence between features.
  3. Decision Tree: Classifies each transaction by applying a learned sequence of splits on the features in the dataset.
  4. Random Forest: Ensemble learning method based on constructing a multitude of decision trees.
  5. XGBoost: Gradient boosting framework that focuses on computational speed and model performance.
  6. Support Vector Machine: Margin-based method that supports both classification and regression; used here to classify transactions.
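
As a rough illustration of how the models listed above could be trained side by side, here is a minimal sketch. It is not the notebook code: the data is a synthetic stand-in generated with scikit-learn (an assumption, since the Kaggle file layout is not described here), and all hyperparameters are illustrative defaults. XGBoost is added only when the separate `xgboost` package is installed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the transaction data (assumption): 2,000 rows,
# roughly 5% of them labeled as the positive ("fraud") class.
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.95, 0.05], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Scale-sensitive models (logistic regression, SVM) get a StandardScaler step.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
}

# XGBoost lives in its own package; include it only if it is available.
try:
    from xgboost import XGBClassifier
    models["XGBoost"] = XGBClassifier(n_estimators=100, random_state=42)
except ImportError:
    pass

# Fit every model on the same split and record its test-set accuracy.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = model.score(X_test, y_test)

for name, acc in accuracies.items():
    print(f"{name}: {acc:.3f}")
```

Using one shared train/test split keeps the comparison between models fair; the stratified split preserves the class ratio in both parts.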

Visualization

  • Correlation Matrix: Visual representation of the correlation between different features in the dataset.
  • Accuracy Graph: Chart comparing the accuracy achieved by each machine learning model in the analysis.
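
The two plots described above could be produced along these lines. This is a sketch under stated assumptions: the column names (`amount`, `old_balance`, `new_balance`, `is_fraud`) are hypothetical, the data is randomly generated, and the accuracy values are placeholders rather than results from this analysis.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs outside Jupyter; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)

# Randomly generated stand-in data with hypothetical column names.
df = pd.DataFrame({
    "amount": rng.exponential(100.0, size=500),
    "old_balance": rng.normal(5000.0, 1500.0, size=500),
})
df["new_balance"] = df["old_balance"] - df["amount"]
df["is_fraud"] = (rng.random(500) < 0.05).astype(int)

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr()

fig, (ax_corr, ax_acc) = plt.subplots(1, 2, figsize=(12, 4))

# Correlation matrix as an annotated heatmap.
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm",
            vmin=-1, vmax=1, ax=ax_corr)
ax_corr.set_title("Correlation Matrix")

# Placeholder accuracies (NOT results from this repository).
accuracies = {"Model A": 0.90, "Model B": 0.95, "Model C": 0.93}
ax_acc.bar(list(accuracies.keys()), list(accuracies.values()))
ax_acc.set_ylim(0, 1)
ax_acc.set_ylabel("Accuracy")
ax_acc.set_title("Accuracy by Model")

fig.tight_layout()
```

In a notebook, the figure renders at the end of the cell; elsewhere, `fig.savefig(...)` writes it to a file.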

Results

The results of the analysis, including model performance metrics and insights derived from the visualization techniques, can be found in the accompanying Jupyter Notebooks.
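
For reference, performance metrics beyond plain accuracy can be computed with scikit-learn as sketched below. The example assumes, as is common for fraud data but not confirmed for this dataset, that fraudulent rows are rare; in that case precision and recall on the fraud class are usually more informative than accuracy alone. The data is synthetic and the single model is chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score, confusion_matrix, f1_score, precision_score, recall_score,
)
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in data (assumption): about 5% positives.
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.95, 0.05], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)

# zero_division=0 avoids warnings if the model predicts no positives.
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred, zero_division=0),
    "f1": f1_score(y_test, y_pred, zero_division=0),
}

# Rows are true classes, columns are predicted classes (0 = legitimate, 1 = fraud).
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(cm)
```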

Usage

You can use the provided Jupyter Notebooks to replicate the analysis. The notebooks include step-by-step instructions along with explanations for each phase of the analysis.

You can also view my work on Kaggle, where you can upvote and comment on it.