Deep Learning - The Straight Dope

Abstract

This repo contains an incremental sequence of notebooks designed to teach deep learning, MXNet, and the gluon interface. Our goal is to leverage the strengths of Jupyter notebooks to present prose, graphics, equations, and code together in one place. If we're successful, the result will be a resource that could be simultaneously a book, course material, a prop for live tutorials, and a resource for plagiarising (with our blessing) useful code. To our knowledge, there's no source out there that either (1) teaches the full breadth of concepts in modern deep learning or (2) interleaves an engaging textbook with runnable code. We'll find out by the end of this venture whether or not that void exists for a good reason.

Another unique aspect of this book is its authorship process. We are developing this resource fully in the public view and are making it available for free in its entirety. While the book has a few primary authors to set the tone and shape the content, we welcome contributions from the community and hope to coauthor chapters and entire sections with experts and community members. Already we've received contributions ranging from typo corrections to full working examples.

Implementation with Apache MXNet

Throughout this book, we rely upon MXNet to teach core concepts, advanced topics, and a full complement of applications. MXNet is widely used in production environments owing to its strong reputation for speed. Now with gluon, MXNet's new imperative interface (alpha), doing research in MXNet is easy.

Dependencies

To run these notebooks, you'll want to build MXNet from source. Fortunately, this is easy (especially on Linux) if you follow these instructions. You'll also want to install Jupyter and use Python 3 (because it's 2017).
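As a quick sanity check before opening the notebooks, something like the following sketch can report whether the interpreter is Python 3 and whether the two packages named above can be located. The helper name `check_environment` is hypothetical, and the import names `mxnet` and `jupyter` are assumptions about how the packages are exposed on your system; adjust them if your install differs.

```python
import importlib.util
import sys


def check_environment(required=("mxnet", "jupyter")):
    """Return a dict mapping each requirement to True (found) or False."""
    # The notebooks assume Python 3.
    report = {"python3": sys.version_info.major >= 3}
    for name in required:
        # find_spec returns None when a top-level package cannot be located,
        # so nothing is actually imported here.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print("{}: {}".format(name, "found" if ok else "MISSING"))
```

The check only looks packages up; it does not verify that MXNet was built with GPU support or that a particular version is installed.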

Slides

The authors (& others) are increasingly giving talks that are based on the content in this book. Some of these slide decks (like the 6-hour KDD 2017 tutorial) are gigantic, so we're collecting them separately in this repo. Contribute there if you'd like to share tutorials or course material based on this book.

Table of contents

Part 1: Crashcourse

Part 2: Introduction to Supervised Learning

1 - Linear Regression (from scratch)
2 - Linear Regression (with gluon)
2.5 - Perceptron and SGD primer
3 - Multiclass Logistic Regression (from scratch)
4 - Multiclass Logistic Regression (with gluon)
5 - Overfitting and regularization (from scratch)
Roadmap L1 and L2 Regularization (in gluon)

Part 3: Deep neural networks (DNNs)

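Part 3.5: gluon Plumbing

Part 4: Convolutional neural networks (CNNs)

Part 5: Recurrent neural networks (RNNs)

Part 6: Computer vision (CV)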

Roadmap Network of networks (inception & co)
Roadmap Residual networks
Object detection
Roadmap Fully-convolutional networks
Roadmap Siamese (conjoined?) networks
Roadmap Embeddings (pairwise and triplet losses)
Roadmap Inceptionism / visualizing feature detectors
Roadmap Style transfer
Fine-tuning

Part 7: Natural language processing (NLP)

Roadmap Word embeddings (Word2Vec)
Roadmap Sentence embeddings (SkipThought)
Roadmap Sentiment analysis
Roadmap Sequence-to-sequence learning (machine translation)
Roadmap Sequence transduction with attention (machine translation)
Roadmap Named entity recognition
Roadmap Image captioning
Tree-LSTM for semantic relatedness

Part 8: Unsupervised Learning

Roadmap Introduction to autoencoders
Roadmap Convolutional autoencoders (introduce upconvolution)
Roadmap Denoising autoencoders
Roadmap Variational autoencoders
Roadmap Clustering

Part 9: Adversarial learning

Roadmap Two Sample Tests
Roadmap Finding adversarial examples
Roadmap Adversarial training

Part 10: Generative adversarial networks (GANs)

1 - Introduction to GANs
Roadmap DCGAN
Roadmap Wasserstein-GANs
Roadmap Energy-based GANS
Roadmap Conditional GANs
Roadmap Image transduction GANs (Pix2Pix)
Roadmap Learning from Synthetic and Unsupervised Images

Part 11: Deep reinforcement learning (DRL)

Roadmap Introduction to reinforcement learning
Roadmap Deep contextual bandits
Roadmap Deep Q-networks
Roadmap Policy gradient
Roadmap Actor-critic gradient

Part 12: Variational methods and uncertainty

Roadmap Dropout-based uncertainty estimation (BALD)
Roadmap Weight uncertainty (Bayes-by-backprop)
Roadmap Variational autoencoders

Part 13: Optimization

1 - Introduction
2 - Gradient descent and stochastic gradient descent
Roadmap Momentum
Roadmap AdaGrad
Roadmap RMSProp
Roadmap Adam
Roadmap AdaDelta
Roadmap SGLD / SGNHT

Part 14: Optimization, Distributed and high-performance learning

Roadmap Distributed optimization (Asynchronous SGD, ...)
Training with Multiple GPUs
Fast & flexible: combining imperative & symbolic nets with HybridBlocks
Roadmap Training with Multiple Machines
Roadmap Combining imperative deep learning with symbolic graphs

Part 15: Hacking MXNet

Custom Operators
...

Part 16: Audio Processing

Roadmap Intro to automatic speech recognition
Roadmap Connectionist temporal classification (CTC) for unaligned sequences
Roadmap Combining static and sequential data

Part 17: Recommender systems

Roadmap Latent factor models
Roadmap Deep latent factor models
Roadmap Bilinear models
Roadmap Learning from implicit feedback

Part 18: Time series

Roadmap Forecasting
Roadmap Modeling missing data
Roadmap Combining static and sequential data

Part 19: Tensor Methods

Roadmap Introduction to tensor algebra
Roadmap Tensor decomposition
Roadmap Tensorized neural networks

Appendix 1: Cheatsheets

Roadmap gluon
Roadmap PyTorch to MXNet
Roadmap Tensorflow to MXNet
Roadmap Keras to MXNet
Roadmap Math to MXNet

Choose your own adventure

We've designed these tutorials so that you can traverse the curriculum in more than one way.

Anarchist - Choose whatever you want to read, whenever you want to read it.
Imperialist - Proceed through all tutorials in order. This way you will first see each model built from scratch, with all the code written by hand except for the basic linear algebra primitives and automatic differentiation.
Capitalist - If you don't care how things work (or already know) and just want to see working code in gluon, you can skip the (from scratch!) tutorials and go straight to the production-like code using the high-level gluon front end.
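The three paths above can be sketched as a small filter over notebook file names. This is only an illustration: it assumes the `-scratch` / `-gluon` suffixes seen in the repository's notebook file names mark the two kinds of tutorial, the `NOTEBOOKS` list is a small sample rather than the full set, and the function name `reading_list` is hypothetical.

```python
# A small sample of notebook file names from the repository.
NOTEBOOKS = [
    "P03-C02-mlp-gluon.ipynb",
    "P02-C01-linear-regression-scratch.ipynb",
    "P02-C02-linear-regression-gluon.ipynb",
    "P02-C03-softmax-regression-scratch.ipynb",
    "P02-C04-softmax-regression-gluon.ipynb",
    "P03-C01-mlp-scratch.ipynb",
]


def reading_list(notebooks, path):
    """Return the notebooks to read for a given path through the curriculum."""
    if path == "anarchist":
        # No order is imposed: read whatever you like, whenever you like.
        return list(notebooks)
    # The P<part>-C<chapter> prefix makes lexicographic order match book order.
    ordered = sorted(notebooks)
    if path == "imperialist":
        return ordered
    if path == "capitalist":
        # Skip the from-scratch tutorials (assumed "-scratch" suffix).
        return [n for n in ordered if not n.endswith("-scratch.ipynb")]
    raise ValueError("unknown path: {!r}".format(path))


if __name__ == "__main__":
    for name in reading_list(NOTEBOOKS, "capitalist"):
        print(name)
```

With the sample list, the capitalist path keeps only the three gluon notebooks, while the imperialist path returns all six in chapter order.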

Authors

This evolving creature is a collaborative effort (see contributors tab). The lead writers, assimilators, and coders include:

Zachary C. Lipton (@zackchase)
Mu Li (@mli)
Alex Smola (@smolix)
Sheng Zha (@szha)
Aston Zhang (@astonzhang)
Joshua Z. Zhang (@zhreshold)
Eric Junyuan Xie (@piiswrong)

Inspiration

In creating these tutorials, we have drawn inspiration from some of the resources that allowed us to learn deep / machine learning with other libraries in the past. These include:

Contribute

Already, in the short time this project has been off the ground, we've gotten some helpful PRs from the community with pedagogical suggestions, typo corrections, and other useful fixes. If you're inclined, please contribute!


stgapr/mxnet-the-straight-dope