
Demonstrates generative models such as Variational Autoencoders (VAEs) and GANs on tabular, text, and image datasets, generating synthetic data to compare how effective each model is on each kind of dataset.


SophiaVei/systematic-study-on-class-imbalance

 
 


A Systematic Study of the Class Imbalance Problem in the Era of Deep Learning

The study covers tabular, text, and image datasets.

Tabular datasets:

  • Arrhythmia
  • Mammography

Models:

  • VAE
  • SMOTE
  • Borderline-SMOTE
  • Random Oversampling
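
SMOTE creates synthetic minority samples by interpolating between a minority point and one of its nearest minority-class neighbours. The block below is a minimal pure-Python sketch of that interpolation step, not the code used in this repository; the function name `smote_sample`, its parameters, and the toy data are assumptions made for illustration. Borderline-SMOTE applies the same interpolation but only picks base points that lie near the class boundary, which is not shown here.

```python
import math
import random


def smote_sample(minority, k=5, n_new=10, seed=0):
    """Generate n_new synthetic points by SMOTE-style interpolation.

    minority: list of equal-length feature lists from the minority class.
    (Hypothetical helper for illustration only.)
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two features (made-up values).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_sample(minority, k=2, n_new=6)
```

Because every synthetic point is a convex combination of two real minority points, the new samples always stay inside the region already spanned by the minority class.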

Classifiers:

  • Gaussian NB
  • Logistic Regression
  • SVM

Metrics:

  • G-mean
  • F1
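
Both metrics are chosen because plain accuracy is misleading on imbalanced data. As a hedged sketch, assuming the common binary definitions (G-mean as the geometric mean of sensitivity and specificity, F1 as the harmonic mean of precision and recall), they can be computed from confusion-matrix counts as follows. The helper names and the toy labels are illustrative and are not taken from the repository.

```python
import math


def binary_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for the given positive label."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (minority recall) and specificity."""
    tp, fp, tn, fn = binary_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


def f1(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp, fp, tn, fn = binary_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: 2 minority (1) and 8 majority (0) samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_majority = [0] * 10                       # 80% accurate, finds no minority
some_minority = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]   # also 80% accurate

scores_majority = (g_mean(y_true, always_majority), f1(y_true, always_majority))
scores_some = (g_mean(y_true, some_minority), f1(y_true, some_minority))
```

In this toy case both predictors reach the same accuracy, but the one that never predicts the minority class scores zero on both G-mean and F1.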

Text datasets:

  • A dataset of Greek-language tweets about public transport and cycling, retrieved from Twitter using a list of relevant keywords.
    Keywords: λεωφορείο (bus), λεωφορειόδρομος (bus lane), μετρό (metro), τραμ (tram), ΟΑΣΘ, ΟΑΣΑ, Μέσα Μαζικής Μεταφοράς (mass transit), ΜΜΜ, Δημόσιες συγκοινωνίες (public transport), ποδήλατο (bicycle), ποδηλατόδρομος (cycle lane), ποδηλάτης (cyclist), πεζός (pedestrian), πεζοδρόμιο (sidewalk), κυκλοφοριακή συμφόρηση (traffic congestion), μποτιλιάρισμα (traffic jam), Βιώσιμη Αστική Κινητικότητα (Sustainable Urban Mobility), μεγάλος περίπατος (Great Walk), μέσα μεταφοράς (means of transport)

Models:

  • VAE
  • SMOTE
  • Borderline-SMOTE
  • Random Oversampling
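
Random oversampling is the simplest of these baselines: it duplicates existing minority samples at random until the classes are balanced, so it works on raw text as well as on feature vectors. The following is a minimal sketch under that assumption; `random_oversample`, the placeholder tweets, and the labels `"majority"`/`"minority"` are hypothetical and do not reflect the repository's actual data or label scheme.

```python
import random
from collections import Counter


def random_oversample(items, labels, seed=0):
    """Duplicate random samples of each smaller class until all classes
    match the size of the largest one. (Hypothetical helper.)"""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_items, out_labels = list(items), list(labels)
    for label, count in counts.items():
        pool = [x for x, y in zip(items, labels) if y == label]
        # Sample with replacement; k is 0 for the largest class.
        extra = rng.choices(pool, k=target - count)
        out_items.extend(extra)
        out_labels.extend([label] * len(extra))
    return out_items, out_labels


# Placeholder documents standing in for tweets.
tweets = ["tweet A", "tweet B", "tweet C", "tweet D", "tweet E"]
labels = ["majority", "majority", "majority", "majority", "minority"]

balanced_tweets, balanced_labels = random_oversample(tweets, labels)
class_sizes = Counter(balanced_labels)
```

Unlike SMOTE or a VAE, this adds no new information, only repeated copies, which is why it is a useful reference point for the generative approaches.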

Classifiers:

  • Gaussian NB
  • Logistic Regression
  • SVM

Metrics:

  • G-mean
  • F1

Image datasets:

  • MNIST (Modified National Institute of Standards and Technology) dataset

Models:

  • VAE
  • GAN

Classifiers:

  • Random Forest

Metrics:

  • Precision
  • Recall
  • F1
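
MNIST has ten classes, so precision, recall, and F1 are usually reported per class by treating each digit in turn as the positive class (one-vs-rest). The sketch below assumes that convention; the README does not state how the scores are aggregated, and the function name and toy labels are illustrative only.

```python
def per_class_report(y_true, y_pred):
    """Return {class: (precision, recall, f1)} using one-vs-rest counts."""
    report = {}
    for c in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        report[c] = (precision, recall, f1)
    return report


# Toy digit labels: class 7 is rare and is missed by the classifier.
y_true = [0, 0, 0, 0, 1, 1, 7]
y_pred = [0, 0, 0, 1, 1, 1, 0]
report = per_class_report(y_true, y_pred)
```

Reading the scores per class makes a poorly recognised rare digit visible, whereas a single overall accuracy figure would hide it.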
