# Handling Imbalanced Datasets with SMOTE Variants

## Overview

This project explores advanced variants of SMOTE (Synthetic Minority Over-sampling Technique), namely Borderline-SMOTE, Borderline-SMOTE2, and CURE SMOTE, to address classification challenges posed by imbalanced datasets. Using machine learning models such as Random Forest and K-Nearest Neighbors, the study evaluates the performance improvements achieved through these oversampling techniques across three diverse datasets (a minimal usage sketch follows the list):

- Mammography Dataset
- Credit Card Fraud Dataset
- ParkourMaker Dataset
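
As a minimal sketch of the general workflow (assuming the `imbalanced-learn` implementation of Borderline-SMOTE and a synthetic placeholder dataset; the repository's actual scripts, dataset loaders, and hyperparameters may differ), oversampling is applied only to the training split before fitting the classifier:

```python
# Sketch: Borderline-SMOTE oversampling + Random Forest (imbalanced-learn / scikit-learn).
from imblearn.over_sampling import BorderlineSMOTE
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Placeholder imbalanced data standing in for one of the project datasets.
X, y = make_classification(n_samples=5000, weights=[0.97, 0.03], random_state=42)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.3, random_state=42
)

# kind="borderline-1" gives Borderline-SMOTE; kind="borderline-2" gives Borderline-SMOTE2.
smote = BorderlineSMOTE(kind="borderline-1", random_state=42)
X_res, y_res = smote.fit_resample(X_train, y_train)

# Train only on the resampled training data; the test split stays untouched.
clf = RandomForestClassifier(n_estimators=200, random_state=42)
clf.fit(X_res, y_res)
```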

## Results

This project received the highest grade (1.0) for its comprehensive approach to handling imbalanced datasets and its effective implementation of SMOTE variants. CURE SMOTE in particular yielded notable improvements in minority-class prediction performance, especially on the credit card fraud dataset when combined with the Random Forest algorithm.
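
Because overall accuracy is misleading on imbalanced data, minority-class performance is better judged with per-class precision, recall, and F1. A short evaluation sketch, continuing the example above (assumes `clf`, `X_test`, and `y_test` from that sketch; the project's actual evaluation code may differ):

```python
# Evaluate on the untouched test split with per-class metrics.
from sklearn.metrics import classification_report

y_pred = clf.predict(X_test)

# Reports precision, recall, and F1 for each class, including the rare one.
print(classification_report(y_test, y_pred, digits=3))
```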