- The First Law of Complexodynamics
- The Unreasonable Effectiveness of Recurrent Neural Networks
- Understanding LSTM Networks
- Recurrent Neural Network Regularization
- Keeping Neural Networks Simple by Minimizing the Description Length of the Weights
- Pointer Networks
- ImageNet Classification with Deep Convolutional Neural Networks
- Order Matters: Sequence to Sequence for Sets
- GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism
- Deep Residual Learning for Image Recognition
- Multi-Scale Context Aggregation by Dilated Convolutions
- Neural Message Passing for Quantum Chemistry
- Attention Is All You Need
- Neural Machine Translation by Jointly Learning to Align and Translate
- Identity Mappings in Deep Residual Networks
- A Simple Neural Network Module for Relational Reasoning
- Variational Lossy Autoencoder
- Relational Recurrent Neural Networks
- Quantifying the Rise and Fall of Complexity in Closed Systems: The Coffee Automaton
- Neural Turing Machines
- Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
- Scaling Laws for Neural Language Models
- A Tutorial Introduction to the Minimum Description Length Principle
- Machine Super Intelligence
- Kolmogorov Complexity and Algorithmic Randomness
- Stanford’s CS231n Convolutional Neural Networks for Visual Recognition
- Better & Faster Large Language Models via Multi-token Prediction
- Dense Passage Retrieval for Open-Domain Question Answering
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
- Zephyr: Direct Distillation of LM Alignment
- Lost in the Middle: How Language Models Use Long Contexts