anilesh-prajapati/Probability-and-Statistics-for-Machine-Learning

Probability and Statistics for Machine Learning

This repo is home to the code that accompanies the Probability and Statistics for Machine Learning curriculum, which provides a comprehensive overview of the subjects across probability and statistics that underlie contemporary machine learning approaches, including deep learning and other artificial intelligence techniques.

Probability Curriculum

  • Segment 1: Introduction to Probability

    • What Probability Theory Is
    • A Brief History: Frequentists vs Bayesians
    • Applications of Probability to Machine Learning
    • Random Variables
    • Discrete vs Continuous Variables
    • Probability Mass and Probability Density Functions
    • Expected Value
    • Measures of Central Tendency: Mean, Median, and Mode
    • Quantiles: Quartiles, Deciles, and Percentiles
    • The Box-and-Whisker Plot
    • Measures of Dispersion: Variance, Standard Deviation, and Standard Error
    • Measures of Relatedness: Covariance and Correlation
    • Marginal and Conditional Probabilities
    • Independence and Conditional Independence
  • Segment 2: Distributions in Machine Learning

    • Uniform
    • Gaussian: Normal and Standard Normal
    • The Central Limit Theorem
    • Log-Normal
    • Exponential and Laplace
    • Binomial and Multinomial
    • Poisson
    • Mixture Distributions
    • Preprocessing Data for Model Input
  • Segment 3: Information Theory

    • What Information Theory Is
    • Self-Information
    • Nats, Bits and Shannons
    • Shannon and Differential Entropy
    • Kullback-Leibler Divergence
    • Cross-Entropy
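
As a rough illustration of how the Segment 3 topics relate to one another, here is a minimal sketch in plain Python. It is not taken from the repo's notebooks: the function names (`self_information`, `entropy`, `cross_entropy`, `kl_divergence`) and the example distributions `p` and `q` are hypothetical and chosen only for this example. It assumes discrete distributions given as lists of probabilities, with `q` nonzero wherever `p` is nonzero.

```python
import math

# Hypothetical helper names for illustration only; not the repo's own code.

def self_information(prob, base=2.0):
    """Self-information of an event with probability `prob`.

    base=2 gives bits (also called shannons); base=math.e gives nats.
    """
    return -math.log(prob) / math.log(base)

def entropy(p, base=2.0):
    """Shannon entropy: the expected self-information under p."""
    return sum(pi * self_information(pi, base) for pi in p if pi > 0)

def cross_entropy(p, q, base=2.0):
    """Cross-entropy: expected self-information of q's events under p."""
    return sum(pi * self_information(qi, base) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q, base=2.0):
    """Kullback-Leibler divergence from q to p, i.e. KL(p || q)."""
    total = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return total / math.log(base)

# Assumed example: a fair coin p and a heavily biased coin q.
p = [0.5, 0.5]
q = [0.9, 0.1]

print("entropy of p (bits):", entropy(p))
print("cross-entropy of p and q (bits):", cross_entropy(p, q))
print("KL(p || q) (bits):", kl_divergence(p, q))
```

Under these assumptions, the cross-entropy equals the entropy of `p` plus the KL divergence from `q` to `p`, which is why minimizing cross-entropy against a fixed target distribution also minimizes the KL divergence.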

Statistics Curriculum

  • Segment 1: Frequentist Statistics

    • Frequentist vs Bayesian Statistics
    • Review of Relevant Probability Theory
    • z-scores and Outliers
    • p-values
    • Comparing Means with t-tests
    • Confidence Intervals
    • ANOVA: Analysis of Variance
    • Pearson Correlation Coefficient
    • R-Squared Coefficient of Determination
    • Correlation vs Causation
    • Correcting for Multiple Comparisons
  • Segment 2: Regression

    • Features: Independent vs Dependent Variables
    • Linear Regression to Predict Continuous Values
    • Fitting a Line to Points on a Cartesian Plane
    • Ordinary Least Squares
    • Logistic Regression to Predict Categories
  • Segment 3: Bayesian Statistics

    • (Deep) ML vs Frequentist Statistics
    • When to use Bayesian Statistics
    • Prior Probabilities
    • Bayes’ Theorem
    • PyMC3 Notebook
    • Resources for Further Study of Probability and Statistics