This paper covers advanced techniques for making deep neural networks more efficient and robust through architectural efficiency, optimization, label manipulation, and learning-rate techniques.
Baseline models were built by progressively reducing the parameter count with depthwise (DW) convolutions and global average pooling (GAP) on both MNIST and CIFAR-10 (a minimal block sketch follows the list below).

- MNIST architectures
- CIFAR-10 architectures
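To illustrate the parameter-reduction idea, here is a minimal sketch of a depthwise-separable convolution block followed by a GAP classification head. The layer widths, kernel sizes, and class count are placeholder choices for illustration, not the exact architectures reported above.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """3x3 depthwise conv followed by a 1x1 pointwise conv (illustrative block)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1,
                                   groups=in_ch, bias=False)   # one filter per channel
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

class TinyDWNet(nn.Module):
    """Hypothetical low-parameter classifier: DW blocks + GAP instead of large FC layers."""
    def __init__(self, in_ch=1, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, 8, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(8), nn.ReLU(inplace=True),
            DepthwiseSeparableConv(8, 16),
            nn.MaxPool2d(2),
            DepthwiseSeparableConv(16, 16),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),   # GAP: replaces the parameter-heavy dense layer
            nn.Flatten(),
            nn.Linear(16, num_classes),
        )

    def forward(self, x):
        return self.head(self.features(x))

if __name__ == "__main__":
    model = TinyDWNet(in_ch=1, num_classes=10)   # MNIST-style 28x28 grayscale input
    n_params = sum(p.numel() for p in model.parameters())
    print(model(torch.randn(2, 1, 28, 28)).shape, n_params)
```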
## Efficiency of DW convolutions and GAP: accuracy and inference time
- MNIST: SOTA accuracy of 98.35% with 1.5K params.
- CIFAR-10: 79.9% accuracy with 140K params.
- DW convolutions have no direct effect on latency: DW models with the same number of parameters run SLOWER than models built from standard 3x3 convolutions.
- Inference time scales with parameter count: DW models with fewer parameters show lower inference times than higher-parameter models (a rough timing sketch follows the figures below).
*Figures: DW convolutions vs. accuracy; DW convolutions vs. inference time.*
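A rough way to reproduce this kind of latency comparison is to time forward passes of a DW-based block against a standard 3x3 convolution block. The sketch below is a simplified benchmark: the channel width, batch size, and iteration counts are arbitrary choices, not the paper's exact measurement setup.

```python
import time
import torch
import torch.nn as nn

def dw_block(ch):
    """Depthwise 3x3 + pointwise 1x1 (depthwise-separable)."""
    return nn.Sequential(
        nn.Conv2d(ch, ch, 3, padding=1, groups=ch, bias=False),
        nn.Conv2d(ch, ch, 1, bias=False),
    )

def std_block(ch):
    """Standard 3x3 convolution."""
    return nn.Conv2d(ch, ch, 3, padding=1, bias=False)

@torch.no_grad()
def time_model(model, x, warmup=10, iters=100):
    """Average forward-pass latency; warm-up runs are excluded from timing."""
    model.eval()
    for _ in range(warmup):
        model(x)
    start = time.perf_counter()
    for _ in range(iters):
        model(x)
    return (time.perf_counter() - start) / iters

if __name__ == "__main__":
    # To mirror the same-parameter comparison in the text, pick channel widths
    # that roughly equalize the two blocks' parameter counts.
    x = torch.randn(32, 64, 32, 32)   # arbitrary CIFAR-like batch
    for name, block in [("depthwise-separable", dw_block(64)),
                        ("standard 3x3", std_block(64))]:
        params = sum(p.numel() for p in block.parameters())
        print(f"{name:22s} params={params:6d} "
              f"latency={time_model(block, x) * 1e3:.2f} ms/batch")
```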
## Efficiency of MosaicML techniques
- MNIST: BlurPool is the most effective single technique, reaching a SOTA 99.21% with 1.5K params; combining techniques gave no significant further increase in accuracy.
- CIFAR-10: the combination of BP + CO + M + LS + SWA + SAM reached a SOTA accuracy of 86.76% with 140K params (a 6.865% increase over the baseline); in isolation, Mixup performed best with a 3.62% increase, to 83.52%. A minimal training sketch with these techniques follows the figures below.
*Figures: isolated techniques vs. accuracy; combined techniques vs. accuracy.*
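For reference, a training run combining these techniques could look roughly like the sketch below, using MosaicML's Composer library. This is a minimal sketch under several assumptions: the ResNet-18 stand-in model, the SGD settings, the 20-epoch duration, and the algorithms' default hyperparameters are placeholders rather than the configuration behind the reported numbers, and CO is interpreted here as Composer's ColOut algorithm.

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

from composer import Trainer
from composer.models import ComposerClassifier
from composer.algorithms import BlurPool, ColOut, MixUp, LabelSmoothing, SWA, SAM

# Plain CIFAR-10 loaders (only ToTensor, for brevity).
train_ds = datasets.CIFAR10("./data", train=True, download=True,
                            transform=transforms.ToTensor())
eval_ds = datasets.CIFAR10("./data", train=False, download=True,
                           transform=transforms.ToTensor())
train_dl = DataLoader(train_ds, batch_size=128, shuffle=True)
eval_dl = DataLoader(eval_ds, batch_size=128)

# Any torch.nn.Module classifier works here; resnet18 is a stand-in,
# not the 140K-parameter model described above.
module = models.resnet18(num_classes=10)
model = ComposerClassifier(module=module, num_classes=10)

trainer = Trainer(
    model=model,
    train_dataloader=train_dl,
    eval_dataloader=eval_dl,
    optimizers=torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.9),
    max_duration="20ep",                 # placeholder schedule
    algorithms=[                         # BP + CO + M + LS + SWA + SAM, library defaults
        BlurPool(),
        ColOut(),
        MixUp(),
        LabelSmoothing(),
        SWA(),
        SAM(),
    ],
)
trainer.fit()
```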
## Take Home!
These techniques may be applied to other standard and custom datasets to establish the superiority of these model-enhancement techniques.