This repo is home to the code that accompanies the *Probability and Statistics for Machine Learning* curriculum, which provides a comprehensive overview of the probability and statistics subjects that underlie contemporary machine learning approaches, including deep learning and other artificial intelligence techniques.
*Introduction to Probability*
- What Probability Theory Is
- A Brief History: Frequentists vs Bayesians
- Applications of Probability to Machine Learning
- Random Variables
- Discrete vs Continuous Variables
- Probability Mass and Probability Density Functions
- Expected Value
- Measures of Central Tendency: Mean, Median, and Mode
- Quantiles: Quartiles, Deciles, and Percentiles
- The Box-and-Whisker Plot
- Measures of Dispersion: Variance, Standard Deviation, and Standard Error
- Measures of Relatedness: Covariance and Correlation
- Marginal and Conditional Probabilities
- Independence and Conditional Independence
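For a quick taste of the summary statistics listed above, here is a minimal sketch of my own using NumPy; the library choice, the simulated data, and all parameter values are illustrative assumptions rather than material from the curriculum notebooks:

```python
import numpy as np

rng = np.random.default_rng(42)                          # reproducible random draws
x = rng.normal(loc=5.0, scale=2.0, size=1000)            # a continuous random variable
y = 0.5 * x + rng.normal(scale=1.0, size=1000)           # a second variable related to x

print("mean:", x.mean())                                 # central tendency
print("median:", np.median(x))
print("quartiles:", np.percentile(x, [25, 50, 75]))      # quantiles
print("sample variance:", x.var(ddof=1))                 # dispersion
print("standard deviation:", x.std(ddof=1))
print("standard error:", x.std(ddof=1) / np.sqrt(x.size))
print("covariance matrix:\n", np.cov(x, y))              # relatedness
print("correlation matrix:\n", np.corrcoef(x, y))
```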
*Distributions in Machine Learning*
- Uniform
- Gaussian: Normal and Standard Normal
- The Central Limit Theorem
- Log-Normal
- Exponential and Laplace
- Binomial and Multinomial
- Poisson
- Mixture Distributions
- Preprocessing Data for Model Input
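As an illustration of several of the distributions above, the following sketch (my own, with NumPy's random `Generator` and arbitrary parameter choices assumed for demonstration) draws samples from each, hints at the Central Limit Theorem, and ends with the kind of standardization step used when preprocessing data for model input:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

samples = {
    "uniform":     rng.uniform(low=0.0, high=1.0, size=n),
    "normal":      rng.normal(loc=0.0, scale=1.0, size=n),
    "log-normal":  rng.lognormal(mean=0.0, sigma=1.0, size=n),
    "exponential": rng.exponential(scale=1.0, size=n),
    "laplace":     rng.laplace(loc=0.0, scale=1.0, size=n),
    "binomial":    rng.binomial(n=10, p=0.3, size=n),
    "poisson":     rng.poisson(lam=4.0, size=n),
}

for name, s in samples.items():
    print(f"{name:>12}: mean={s.mean():.2f}, std={s.std():.2f}")

# Central Limit Theorem: means of many exponential samples are roughly Gaussian.
sample_means = rng.exponential(scale=1.0, size=(1000, 50)).mean(axis=1)
print("mean and std of the sample means:", sample_means.mean(), sample_means.std())

# Preprocessing for model input: standardize a feature to zero mean, unit variance.
feature = samples["log-normal"]
standardized = (feature - feature.mean()) / feature.std()
print("standardized mean, std:", standardized.mean(), standardized.std())
```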
*Information Theory*
- What Information Theory Is
- Self-Information
- Nats, Bits, and Shannons
- Shannon and Differential Entropy
- Kullback-Leibler Divergence
- Cross-Entropy
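The information-theoretic quantities above can each be computed in a line or two; this is a small sketch of my own, where the example distributions p and q are arbitrary assumptions:

```python
import numpy as np

p = np.array([0.7, 0.2, 0.1])           # a "true" distribution
q = np.array([0.5, 0.3, 0.2])           # a model's approximating distribution

self_information = -np.log(p)           # in nats; use np.log2 for bits (shannons)
shannon_entropy = -(p * np.log(p)).sum()
kl_divergence = (p * np.log(p / q)).sum()      # D_KL(p || q)
cross_entropy = -(p * np.log(q)).sum()

print("self-information (nats):", self_information)
print("entropy H(p):", shannon_entropy)
print("KL divergence D_KL(p||q):", kl_divergence)
print("cross-entropy H(p, q):", cross_entropy)

# Cross-entropy decomposes into entropy plus KL divergence.
assert np.isclose(cross_entropy, shannon_entropy + kl_divergence)
```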
*Frequentist Statistics*
- Frequentist vs Bayesian Statistics
- Review of Relevant Probability Theory
- z-scores and Outliers
- p-values
- Comparing Means with t-tests
- Confidence Intervals
- ANOVA: Analysis of Variance
- Pearson Correlation Coefficient
- R-Squared Coefficient of Determination
- Correlation vs Causation
- Correcting for Multiple Comparisons
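Here is a compact sketch of several of the frequentist tools above; SciPy, the simulated groups, and the parameter values are all assumptions made purely for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a = rng.normal(loc=100.0, scale=15.0, size=40)   # scores for group A
b = rng.normal(loc=108.0, scale=15.0, size=40)   # group B, drawn with a higher mean
c = rng.normal(loc=104.0, scale=15.0, size=40)   # group C

# z-scores: how many standard deviations each observation sits from the group mean.
z = (a - a.mean()) / a.std(ddof=1)
print("largest |z| in group A:", np.abs(z).max())

# Two-sample t-test and its p-value.
t_stat, p_value = stats.ttest_ind(a, b)
print("t-test: t =", t_stat, ", p =", p_value)

# 95% confidence interval for the mean of group A.
ci = stats.t.interval(0.95, a.size - 1, loc=a.mean(), scale=stats.sem(a))
print("95% CI for mean of A:", ci)

# One-way ANOVA across the three groups.
f_stat, anova_p = stats.f_oneway(a, b, c)
print("ANOVA: F =", f_stat, ", p =", anova_p)

# Pearson correlation and R-squared between two related variables.
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + rng.normal(scale=5.0, size=100)
r, r_p = stats.pearsonr(x, y)
print("Pearson r =", r, ", R-squared =", r ** 2)

# Bonferroni correction for multiple comparisons: divide alpha by the number of tests.
alpha, n_tests = 0.05, 3
print("Bonferroni-corrected threshold:", alpha / n_tests)
```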
*Regression*
- Features: Independent vs Dependent Variables
- Linear Regression to Predict Continuous Values
- Fitting a Line to Points on a Cartesian Plane
- Ordinary Least Squares
- Logistic Regression to Predict Categories
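A minimal regression sketch covering the topics above: NumPy handles the ordinary-least-squares fit, and scikit-learn (an assumption on my part, since the curriculum does not name a library here) handles the logistic-regression classifier; the data are simulated for illustration only:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression   # assumed available; not named above

rng = np.random.default_rng(0)

# Linear regression via ordinary least squares: fit y = b0 + b1 * x in closed form.
x = rng.uniform(0, 10, size=100)                       # independent variable (feature)
y = 2.0 + 3.0 * x + rng.normal(scale=2.0, size=100)    # dependent variable; true b0=2, b1=3
X = np.column_stack([np.ones_like(x), x])              # design matrix with a bias column
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("estimated intercept and slope:", beta)

# Logistic regression: predict a binary category from the same feature.
labels = (y > y.mean()).astype(int)                    # an arbitrary binary target
clf = LogisticRegression().fit(x.reshape(-1, 1), labels)
print("P(class=1) for x = 2 and x = 8:", clf.predict_proba([[2.0], [8.0]])[:, 1])
```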
*Bayesian Statistics*
- (Deep) ML vs Frequentist Statistics
- When to Use Bayesian Statistics
- Prior Probabilities
- Bayes’ Theorem
- PyMC3 Notebook
- Resources for Further Study of Probability and Statistics
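To close, a tiny self-contained illustration of Bayes' theorem; this is not the PyMC3 notebook, just my own sketch of a conjugate Beta-binomial update with SciPy, where the coin-flip data and prior are assumptions chosen for demonstration:

```python
from scipy import stats

# Prior belief about a coin's probability of heads: Beta(2, 2), gently centered on 0.5.
prior_alpha, prior_beta = 2, 2

# Observed data: 7 heads in 10 flips.
heads, flips = 7, 10

# Bayes' theorem with a conjugate Beta prior and binomial likelihood yields a Beta posterior.
posterior = stats.beta(prior_alpha + heads, prior_beta + (flips - heads))

print("posterior mean for P(heads):", posterior.mean())
print("95% credible interval:", posterior.interval(0.95))
```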