ADS-503 Applied Predictive Analytics Project

Financial Datasets For Fraud Detection

Leonard Littleton
Lina Nguyen
Emanuel Lucban


Downloads

The dataset for the project can be downloaded from Kaggle:
Kaggle - Synthetic Financial Datasets for Fraud Detection (https://www.kaggle.com/ealaxi/paysim1)

RDS files for all trained models can be downloaded as a single archive:
rds_files.zip
Note: the zip file is 30 GB.


Models Tested

  • Quadratic Discriminant Analysis
  • Support Vector Machines
  • Logistic Regression
  • Gradient Boosting Machine
  • Linear Discriminant Analysis
  • Nearest Shrunken Centroids
  • Neural Networks
  • Partial Least Squares Discriminant Analysis

Model Performance

ROC Curves

Metrics

| Model | AUC | Sensitivity | Specificity | Pos.Pred.Value | Neg.Pred.Value | Precision | Recall | F1 | Prevalence | Detection.Rate | Detection.Prevalence | Balanced.Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Quadratic Discriminant Analysis | 0.9927164 | 0.8594569 | 0.9937371 | 0.1534927 | 0.9998132 | 0.1534927 | 0.8594569 | 0.2604678 | 0.0013196 | 0.0011341 | 0.0073888 | 0.9265970 |
| Support Vector Machines | 0.9978020 | 0.7617913 | 0.9999610 | 0.9626731 | 0.9996853 | 0.9626731 | 0.7617913 | 0.8505319 | 0.0013196 | 0.0010052 | 0.0010442 | 0.8808762 |
| Logistic Regression | 0.9964422 | 0.7308242 | 0.9999553 | 0.9557632 | 0.9996444 | 0.9557632 | 0.7308242 | 0.8282937 | 0.0013196 | 0.0009644 | 0.0010090 | 0.8653898 |
| Gradient Boosting Machine | 0.9777450 | 0.6279181 | 0.9999465 | 0.9394155 | 0.9995086 | 0.9394155 | 0.6279181 | 0.7527127 | 0.0013196 | 0.0008286 | 0.0008820 | 0.8139323 |
| Linear Discriminant Analysis | 0.9621525 | 0.4030491 | 0.9999037 | 0.8468468 | 0.9992118 | 0.8468468 | 0.4030491 | 0.5461588 | 0.0013196 | 0.0005319 | 0.0006280 | 0.7014764 |
| Nearest Shrunken Centroids | 0.9362662 | 0.0724154 | 0.9987977 | 0.0737148 | 0.9987744 | 0.0737148 | 0.0724154 | 0.0730594 | 0.0013196 | 0.0000956 | 0.0012963 | 0.5356065 |
| Neural Network | 0.5000000 | 0.0000000 | 1.0000000 | NaN | 0.9986804 | NA | 0.0000000 | NA | 0.0013196 | 0.0000000 | 0.0000000 | 0.5000000 |
| PLS Discriminant Analysis | 0.9771492 | 0.0000000 | 1.0000000 | NaN | 0.9986804 | NA | 0.0000000 | NA | 0.0013196 | 0.0000000 | 0.0000000 | 0.5000000 |
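
Several columns in the metrics table can be recomputed from just Sensitivity, Specificity, and Prevalence, which makes for a quick consistency check and shows why the last two rows report NaN/NA. The sketch below is a minimal illustration in Python, used here only for the arithmetic and not taken from the project's code. The function name `derived_metrics` is hypothetical, and it assumes the columns follow the usual confusion-matrix definitions, with Precision equal to Pos.Pred.Value and Recall equal to Sensitivity, as the table values suggest.

```python
import math


def derived_metrics(sensitivity, specificity, prevalence):
    """Recompute summary metrics from three base rates (hypothetical helper)."""
    # Shares of all cases, assuming the standard confusion-matrix definitions.
    detection_rate = sensitivity * prevalence              # true positives / all cases
    false_pos_share = (1 - specificity) * (1 - prevalence)  # false positives / all cases
    detection_prevalence = detection_rate + false_pos_share  # cases flagged as fraud / all cases

    # Precision is undefined when the model never predicts the positive class.
    if detection_prevalence > 0:
        precision = detection_rate / detection_prevalence
    else:
        precision = math.nan

    # F1 is the harmonic mean of precision and recall (recall == sensitivity).
    if math.isnan(precision) or precision + sensitivity == 0:
        f1 = math.nan
    else:
        f1 = 2 * precision * sensitivity / (precision + sensitivity)

    return {
        "detection_rate": detection_rate,
        "detection_prevalence": detection_prevalence,
        "precision": precision,
        "f1": f1,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Quadratic Discriminant Analysis row from the table.
qda = derived_metrics(0.8594569, 0.9937371, 0.0013196)
print(qda)

# Neural Network row: sensitivity 0 and specificity 1 mean no positive predictions.
nn = derived_metrics(0.0, 1.0, 0.0013196)
print(nn)
```

For the Quadratic Discriminant Analysis row, the recomputed values should agree with the table to about four decimal places, with the small differences coming from the rounded inputs. For the Neural Network and PLS Discriminant Analysis rows, a sensitivity of 0 with a specificity of 1 means no transaction was ever flagged as fraud, so precision is 0/0 and F1 is undefined, while balanced accuracy falls to 0.5.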

References

Lopez-Rojas, Edgar. (2017). Synthetic Financial Datasets For Fraud Detection. Kaggle. https://www.kaggle.com/ealaxi/paysim1.