Bayesian Neural Networks Reading List

This is a reading list on Bayesian neural networks. The list is quite opinionated: it focuses mainly on the case where priors and posteriors are defined over the neural network weights and the predictive task is classification. In this setting, Bayesian neural networks might be preferable because they improve test accuracy, test calibration, or both. I first include a list of essential papers and then organize papers by subject. The aim is to create a guide for new researchers in Bayesian deep learning that speeds up their entry to the field.

Interested in a more detailed discussion? Check out our recent review paper 🚨"A Primer on Bayesian Neural Networks: Review and Debates"🚨, joint work with Julyan Arbel (Inria Grenoble Rhône-Alpes), Mariia Vladimirova (Criteo), and Vincent Fortuin (Helmholtz AI).

⚠️ Essential reads

  • [Weight Uncertainty in Neural Networks]: A main challenge in Bayesian neural networks is how to obtain gradients of an objective that involves an expectation over a parametric distribution such as a Gaussian. This is one of the first papers to apply Variational Inference with the reparametrization trick to realistic neural networks. The reparametrization trick writes a posterior sample as a deterministic function of the variational parameters and an independent noise variable, so that gradients can be estimated with Monte Carlo samples from the posterior (see the sketch after this list).

  • [Laplace Redux -- Effortless Bayesian Deep Learning]: The Laplace approximation is one of the few realistic options to perform approximate inference for Bayesian Neural Networks. Not only does it result in good uncertainty estimates, but it can also be used for model selection and invariance learning.

  • [How Good is the Bayes Posterior in Deep Neural Networks Really?]: This paper describes a major criticism of Bayesian deep learning: that in terms of accuracy and negative log-likelihood, a deterministic network is often better than a Bayesian one. At the same time, it describes two common tricks for efficient MCMC approximate inference, Preconditioned Stochastic Gradient Langevin Dynamics and cyclical step sizes.

  • [What Are Bayesian Neural Network Posteriors Really Like?]: This paper implements Hamiltonian Monte Carlo (HMC) for approximate inference in Bayesian deep neural networks. HMC is considered the gold standard in approximate inference; however, it is very computationally intensive.

  • [Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles]: This paper introduces deep neural network ensembles, a Frequentist alternative to Bayesian neural networks. Deep ensembles are one of the most common baselines for Bayesian neural networks, and frequently outperform them.

  • [The Bayesian Learning Rule]: Many machine-learning algorithms are specific instances of a single algorithm called the Bayesian learning rule. The rule, derived from Bayesian principles, yields a wide range of algorithms from fields such as optimization, deep learning, and graphical models.
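
As a pointer for newcomers, here is a minimal PyTorch sketch of the reparametrization trick for a mean-field Gaussian posterior over a single weight matrix; the layer sizes, variable names, and initialization are illustrative assumptions rather than the recipe of any specific paper above.

```python
import torch

# Mean-field Gaussian variational posterior q(W) = N(mu, softplus(rho)^2)
# over one weight matrix. Sizes and initial values are illustrative only.
in_features, out_features = 784, 10
mu = torch.zeros(out_features, in_features, requires_grad=True)
rho = torch.full((out_features, in_features), -3.0, requires_grad=True)

def sample_weight():
    # W = mu + sigma * eps with eps ~ N(0, I): the randomness is pushed into
    # eps, so gradients flow to mu and rho through a deterministic function.
    sigma = torch.nn.functional.softplus(rho)
    eps = torch.randn_like(mu)
    return mu + sigma * eps

x = torch.randn(32, in_features)           # dummy mini-batch
y = torch.randint(0, out_features, (32,))  # dummy labels
W = sample_weight()
logits = x @ W.t()
nll = torch.nn.functional.cross_entropy(logits, y)
nll.backward()  # mu.grad and rho.grad are now Monte Carlo gradient estimates
```

In the full variational objective (e.g. Bayes by Backprop), the KL divergence between the variational posterior and the prior would be added to this loss; the sketch only illustrates how gradients reach the variational parameters through a sampled weight.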

🔥 Interesting recent reads!

Approximate Inference

The Bayesian paradigm consists of choosing a prior over the model parameters, evaluating the data likelihood, and then estimating the posterior over the model parameters. This can be done analytically only in simple cases (Gaussian likelihood, prior, and posterior). For more complex and interesting cases, we have to resort to approximate inference.
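
In symbols, writing $w$ for the network weights and $\mathcal{D}$ for the training data, the posterior follows from Bayes' rule:

$$
p(w \mid \mathcal{D}) \;=\; \frac{p(\mathcal{D} \mid w)\, p(w)}{p(\mathcal{D})},
$$

where the numerator is cheap to evaluate for any single $w$, but the normalizing constant $p(\mathcal{D})$ involves an integral over all weight configurations and is intractable for neural networks. This is what the methods below approximate, each with its own trade-offs.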

Variational Inference

++ Computationally efficient -- Explores a single mode of the loss

Laplace approximation

++ Computationally efficient -- Explores a single mode of the loss
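
As a reminder of the construction (standard material rather than a result of any single paper above), the Laplace approximation places a Gaussian around a MAP estimate using the local curvature of the log posterior:

$$
p(w \mid \mathcal{D}) \;\approx\; \mathcal{N}\!\left(w;\; w_{\mathrm{MAP}},\; H^{-1}\right),
\qquad
H \;=\; -\left.\nabla^2_w \log p(w \mid \mathcal{D})\right|_{w = w_{\mathrm{MAP}}}.
$$

In practice $H$ is replaced by a tractable surrogate such as a diagonal, Kronecker-factored, or generalized Gauss-Newton approximation, often computed only for a subset of the weights (e.g. the last layer).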

Sampling methods

++ Explore multiple modes -- Computationally expensive
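
For concreteness, the basic (unpreconditioned) stochastic gradient Langevin dynamics update on a mini-batch of size $n$ drawn from a dataset of size $N$ is

$$
w_{t+1} \;=\; w_t + \frac{\epsilon_t}{2}\left(\nabla_w \log p(w_t) + \frac{N}{n}\sum_{i=1}^{n} \nabla_w \log p(y_i \mid x_i, w_t)\right) + \eta_t,
\qquad \eta_t \sim \mathcal{N}(0, \epsilon_t I).
$$

Preconditioning rescales both the gradient and the injected noise (e.g. with an RMSProp-style diagonal estimate), and cyclical step sizes repeatedly raise and anneal $\epsilon_t$ so that successive cycles can settle into different modes.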

Deep Ensembles

++ Explore multiple modes, computationally competitive -- Memory expensive
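
Below is a minimal sketch of how a deep ensemble forms its predictive distribution; `make_model()` and `train()` in the usage comment are hypothetical helpers, not functions from any specific library.

```python
import torch

def ensemble_predict(models, x):
    # Average the softmax outputs of independently trained ensemble members.
    # `models` is a list of trained networks mapping inputs to logits.
    with torch.no_grad():
        probs = torch.stack([torch.softmax(m(x), dim=-1) for m in models])
    return probs.mean(dim=0)  # shape: (batch, num_classes)

# Illustrative usage, assuming hypothetical helpers make_model() and train():
# models = [train(make_model(seed=s)) for s in range(5)]
# mean_probs = ensemble_predict(models, x_test)
```

The only "Bayesian-flavored" ingredient is the averaging over members trained from different random initializations, which is what lets the ensemble cover multiple modes of the loss.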

Performance Certificates

Bayesian neural networks are compatible with two approaches to model selection that eschew validation and test sets and can (in principle) give guarantees on out-of-sample performance using only the training set.

Marginal Likelihood

The marginal likelihood is a purely Bayesian approach to model selection.
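
Concretely, for a model $\mathcal{M}$ with weights $w$, the marginal likelihood (or evidence) integrates the likelihood against the prior,

$$
p(\mathcal{D} \mid \mathcal{M}) \;=\; \int p(\mathcal{D} \mid w, \mathcal{M})\, p(w \mid \mathcal{M})\, \mathrm{d}w,
$$

so that comparing models by their evidence automatically penalizes models that spread prior mass over weight configurations that explain the data poorly (an Occam's razor effect). For neural networks the integral itself must be approximated, for example with the Laplace approximation discussed above.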

PAC-Bayes

PAC-Bayes bounds give high-probability Frequentist guarantees on out-of-sample performance.
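
As an example of their flavor, a McAllester-style bound states that for any prior $P$ chosen before seeing the $n$ training points, with probability at least $1-\delta$ over the training sample, simultaneously for all posteriors $Q$:

$$
\mathbb{E}_{w \sim Q}\!\left[L(w)\right] \;\le\; \mathbb{E}_{w \sim Q}\!\left[\hat{L}(w)\right] + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln \frac{2\sqrt{n}}{\delta}}{2n}},
$$

where $L$ and $\hat{L}$ are the expected and empirical losses (bounded in $[0,1]$). The KL term is the same quantity that variational inference penalizes, which makes these bounds natural companions to Bayesian neural networks.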

Benchmarking

Bayesian neural networks aim to improve test accuracy and, more commonly, test calibration. While they are often evaluated on standard computer vision datasets such as CIFAR-10/100 with test accuracy as the metric, dedicated datasets and metrics can also be used.
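
One of the most common calibration metrics is the expected calibration error (ECE). Here is a minimal NumPy sketch; the number of bins is an illustrative choice, and `probs` is assumed to be an array of per-class predicted probabilities.

```python
import numpy as np

def expected_calibration_error(probs, labels, num_bins=15):
    """ECE: bin predictions by confidence, then average |accuracy - confidence|
    over the bins, weighted by the fraction of samples in each bin."""
    confidences = probs.max(axis=1)
    predictions = probs.argmax(axis=1)
    accuracies = (predictions == labels).astype(float)
    bin_edges = np.linspace(0.0, 1.0, num_bins + 1)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(accuracies[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece
```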

Datasets

Metrics

Review papers

📝 Citation

Did you find this reading list helpful? Consider citing our review paper in your scientific publications using the following BibTeX entry:

```bibtex
@article{arbel2023primer,
  title={A Primer on Bayesian Neural Networks: Review and Debates},
  author={Arbel, Julyan and Pitas, Konstantinos and Vladimirova, Mariia and Fortuin, Vincent},
  journal={arXiv preprint arXiv:2309.16314},
  year={2023}
}
```

When citing this repository on any other medium, please use the following citation:

A Primer on Bayesian Neural Networks: Review and Debates by Julyan Arbel, Konstantinos Pitas, Mariia Vladimirova and Vincent Fortuin