Leonard Littleton
Lina Nguyen
Emanuel Lucban
The dataset for the project can be downloaded here:
Kaggle - Synthetic Financial Datasets for Fraud Detection (https://www.kaggle.com/ealaxi/paysim1)
The RDS files for all trained models can be downloaded here:
rds_files.zip
Note: the zip file is 30 GB.
- Quadratic Discriminant Analysis
- Support Vector Machines
- Logistic Regression
- Gradient Boosting Machine
- Linear Discriminant Analysis
- Nearest Shrunken Centroids
- Neural Networks
- Partial Least Squares Discriminant Analysis
Metrics
Models | AUC | Sensitivity | Specificity | Pos.Pred.Value | Neg.Pred.Value | Precision | Recall | F1 | Prevalence | Detection.Rate | Detection.Prevalence | Balanced.Accuracy |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Quadratic Discriminant Analysis | 0.9927164 | 0.8594569 | 0.9937371 | 0.1534927 | 0.9998132 | 0.1534927 | 0.8594569 | 0.2604678 | 0.0013196 | 0.0011341 | 0.0073888 | 0.9265970 |
Support Vector Machines | 0.9978020 | 0.7617913 | 0.9999610 | 0.9626731 | 0.9996853 | 0.9626731 | 0.7617913 | 0.8505319 | 0.0013196 | 0.0010052 | 0.0010442 | 0.8808762 |
Logistic Regression | 0.9964422 | 0.7308242 | 0.9999553 | 0.9557632 | 0.9996444 | 0.9557632 | 0.7308242 | 0.8282937 | 0.0013196 | 0.0009644 | 0.0010090 | 0.8653898 |
Gradient Boosting Machine | 0.9777450 | 0.6279181 | 0.9999465 | 0.9394155 | 0.9995086 | 0.9394155 | 0.6279181 | 0.7527127 | 0.0013196 | 0.0008286 | 0.0008820 | 0.8139323 |
Linear Discriminant Analysis | 0.9621525 | 0.4030491 | 0.9999037 | 0.8468468 | 0.9992118 | 0.8468468 | 0.4030491 | 0.5461588 | 0.0013196 | 0.0005319 | 0.0006280 | 0.7014764 |
Nearest Shrunken Centroids | 0.9362662 | 0.0724154 | 0.9987977 | 0.0737148 | 0.9987744 | 0.0737148 | 0.0724154 | 0.0730594 | 0.0013196 | 0.0000956 | 0.0012963 | 0.5356065 |
Neural Networks | 0.5000000 | 0.0000000 | 1.0000000 | NaN | 0.9986804 | NA | 0.0000000 | NA | 0.0013196 | 0.0000000 | 0.0000000 | 0.5000000 |
Partial Least Squares Discriminant Analysis | 0.9771492 | 0.0000000 | 1.0000000 | NaN | 0.9986804 | NA | 0.0000000 | NA | 0.0013196 | 0.0000000 | 0.0000000 | 0.5000000 |
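
Several columns in the Metrics table are derived from others: Precision repeats Pos.Pred.Value, Recall repeats Sensitivity, F1 is the harmonic mean of Precision and Recall, Balanced.Accuracy is the mean of Sensitivity and Specificity, and Detection.Rate equals Sensitivity times Prevalence. The sketch below recomputes those derived values for two rows of the table as a consistency check. It also shows why the last two rows report NaN/NA: a model that never predicts the fraud class has no positive predictions, so precision is 0/0 and F1 is undefined. The project itself appears to be written in R (the models are stored as RDS files); Python is used here only for illustration, and all function and variable names are hypothetical, not taken from the project.

```python
import math

# Values copied from the Metrics table above (rounded to 7 decimals there).
ROWS = {
    "Quadratic Discriminant Analysis": {
        "sensitivity": 0.8594569, "specificity": 0.9937371,
        "precision": 0.1534927, "prevalence": 0.0013196,
        "f1": 0.2604678, "detection_rate": 0.0011341,
        "balanced_accuracy": 0.9265970,
    },
    "Support Vector Machines": {
        "sensitivity": 0.7617913, "specificity": 0.9999610,
        "precision": 0.9626731, "prevalence": 0.0013196,
        "f1": 0.8505319, "detection_rate": 0.0010052,
        "balanced_accuracy": 0.8808762,
    },
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; NaN when it is undefined."""
    if math.isnan(precision) or precision + recall == 0:
        return math.nan
    return 2 * precision * recall / (precision + recall)


def balanced_accuracy(sensitivity, specificity):
    """Mean of the true positive rate and the true negative rate."""
    return (sensitivity + specificity) / 2


def detection_rate(sensitivity, prevalence):
    """Fraction of all cases that are true positives: TP / N."""
    return sensitivity * prevalence


def precision_from_counts(true_pos, false_pos):
    """Precision from raw counts; NaN when nothing is predicted positive."""
    predicted_pos = true_pos + false_pos
    return true_pos / predicted_pos if predicted_pos > 0 else math.nan


def derived(row):
    """Recompute the derived columns for one table row."""
    return {
        "f1": f1_score(row["precision"], row["sensitivity"]),
        "balanced_accuracy": balanced_accuracy(
            row["sensitivity"], row["specificity"]
        ),
        "detection_rate": detection_rate(
            row["sensitivity"], row["prevalence"]
        ),
    }


for name, row in ROWS.items():
    print(name, derived(row))

# A model that never predicts fraud: zero true and zero false positives.
print("no positive predictions:", f1_score(precision_from_counts(0, 0), 0.0))
```

Read this way, the table has fewer independent columns than it appears to. With a fraud prevalence of about 0.13%, Balanced.Accuracy and F1 say more than Specificity or Neg.Pred.Value, which are close to 1 for every model, including the two that detect no fraud at all.
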
Lopez-Rojas, Edgar. (2017). Synthetic Financial Datasets For Fraud Detection. Kaggle. https://www.kaggle.com/ealaxi/paysim1.