
sabeesh90/Multimodal_Deep_Learning_DLDC_2021


MULTIMODAL DEEP LEARNING - TENSORFLOW / KERAS

DLDC presentation.pptx

This repository contains code for multimodal deep learning classification on the SDSS-4 DR-16 dataset, with a SOTA accuracy of 99%. The Sloan Digital Sky Survey captures spectroscopic and photometric information about astronomical bodies such as galaxies, stars and quasars. The dataset consists of both tabular and image data, downloaded from the SDSS website (http://skyserver.sdss.org/dr16/en/tools/search/sql.aspx). The tabular data has 6 features and 1 target variable, which has 3 classes. There are a total of 1000 datapoints. The distribution of the images, along with images of the astronomical bodies (galaxies, stars and quasars, in that order), is shown below.

CONVENTIONAL MACHINE LEARNING

Machine learning models were built using the scikit-learn library and PyCaret. Tree-based classifiers and gradient boosting algorithms showed higher evaluation metrics than the other algorithms, but their performance under class imbalance was not satisfactory. ANNs handled class imbalance better and also achieved higher model metrics. On this dataset, these deep networks outperformed the other ML models on tabular data.

UNIMODAL DEEP LEARNING

The following were observed for unimodal deep learning algorithms:
1) Newer architectures perform better than older architectures with 'minimal fine tuning'
2) Overfitting / underfitting on older architectures
3) Steeper curve
4) No significant train-test gap, higher evaluation metrics
5) Higher accuracies in minimal epochs
The class-wise scores, however, showed a decline in the Quasar class.
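
Class-wise scores of this kind can be computed directly from the true and predicted labels. The following is a minimal sketch in plain Python, not the notebook's own evaluation code: the helper name `per_class_report` and the toy labels are made up for illustration and are not results from this repository. It shows how a high overall accuracy can hide weak recall on a minority class.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return precision, recall, F1 and support for every class label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": actual,
        }
    return report


# Toy, hand-written labels (illustrative only): quasars are the minority class.
y_true = ["GALAXY"] * 5 + ["STAR"] * 3 + ["QSO"] * 2
y_pred = ["GALAXY"] * 5 + ["STAR"] * 3 + ["QSO", "GALAXY"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
report = per_class_report(y_true, y_pred)

print(f"accuracy: {accuracy:.2f}")
for label, scores in report.items():
    print(label, {k: round(v, 2) for k, v in scores.items()})
```

In this toy case a single misclassified quasar barely moves the overall accuracy but halves the recall of the quasar class, which is why the class-wise scores are reported separately.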

MULTIMODAL DEEP LEARNING

A multimodal deep network was built by combining the tabular data and the image data using the functional API of Keras. The following was inferred.

The class-wise metrics were also superior in multimodal deep learning, with no effect of class imbalance on model performance.
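
A two-input network of this kind can be sketched with the Keras functional API as follows. This is a minimal sketch under stated assumptions, not the architecture used in the notebook: the image size, layer widths, input names and the function name `build_multimodal_model` are illustrative choices, and a small CNN stands in for the pretrained backbones named in this README. Only the 6 tabular features and 3 classes come from the dataset description above.

```python
import numpy as np
from tensorflow.keras import Model, layers

NUM_FEATURES = 6         # tabular features, as stated in the dataset description
NUM_CLASSES = 3          # galaxy, star, quasar
IMG_SHAPE = (64, 64, 3)  # assumed image size, not taken from the repository


def build_multimodal_model():
    # Tabular branch: a small dense encoder.
    tab_in = layers.Input(shape=(NUM_FEATURES,), name="tabular")
    tab = layers.Dense(32, activation="relu")(tab_in)

    # Image branch: a small CNN standing in for a pretrained backbone.
    img_in = layers.Input(shape=IMG_SHAPE, name="image")
    img = layers.Conv2D(16, 3, activation="relu")(img_in)
    img = layers.MaxPooling2D()(img)
    img = layers.Conv2D(32, 3, activation="relu")(img)
    img = layers.GlobalAveragePooling2D()(img)

    # Fusion: concatenate both embeddings and classify.
    merged = layers.Concatenate()([tab, img])
    hidden = layers.Dense(64, activation="relu")(merged)
    out = layers.Dense(NUM_CLASSES, activation="softmax")(hidden)

    model = Model(inputs=[tab_in, img_in], outputs=out)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer class labels
        metrics=["accuracy"],
    )
    return model


if __name__ == "__main__":
    model = build_multimodal_model()
    # Random placeholder batch, only to check that shapes line up.
    tab_batch = np.random.rand(4, NUM_FEATURES).astype("float32")
    img_batch = np.random.rand(4, *IMG_SHAPE).astype("float32")
    print(model.predict([tab_batch, img_batch], verbose=0).shape)
```

Training would pass both inputs in the same order, for example `model.fit([X_tab, X_img], y, ...)`, where `X_tab`, `X_img` and `y` are hypothetical arrays holding the tabular features, the images and the integer class labels.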

The following are the findings for the architectures:
1) The curves of even the older architectures improve with multimodality
2) EfficientNetB2 and Xception have the steepest curves (better than unimodal deep learning)
3) Highest accuracies at a minimal number of epochs (better than unimodal deep learning)
4) Best-fitting models, with the smallest train-test gap

TRAINING CURVES

The training curves below show that the multimodal models fit better than the unimodal models.

[Figures: unimodal DL training curves; multimodal DL training curves]

SOTA ACCURACY

We achieved a SOTA accuracy of 99% on this dataset using multimodal deep learning. Similar SOTA results were also obtained using a customized CNN. The details are in the Colab notebook linked in this repository. The findings were published in the DLDC Summit 2021 as part of the publication in the Lattice journal.
