Notes and labs from the CT0429 - Predictive Analytics course at Ca' Foscari University of Venice for the academic year 2022/2023, taught by Professor Prosdocimi Ilaria.
All credit goes to Professor Prosdocimi Ilaria and to the creators and owners of the resources used in this course.
I tried to collect some formulas and their translation to R. There are, of course, many ways to achieve the same results, depending on the given input.
Formulas Handbook - Not finished
For this exam we are allowed to use a .R script with some comments and formulas.
Some of the notes are incomplete because I recently switched from taking notes on my portable devices to my laptop. For notes about SLR and MLR, check out this folder (students of Ca' Foscari University of Venice only).
- Basics of Probability and Statistics
- Simple Linear Regression
- Multiple Linear Regression
- Model Selection
- Quality Criterion
- Variable Selection
- Categorical Predictors and Interactions
- Model Checking
- Transformations
- Collinearity
- Influence
- GLM
- Classification
Where possible, the labs are organized in the following hierarchy:
- Lab X, the professor's solution with my additional notes (I suggest following this version)
- The professor's solution for the Lab X
- The class notes from the professor's Lab X
To view the labs without having to download and run the Rmd files each time, click on the links:
- Lab 00 - R quick revision
- Lab 01 - Intro to simple linear models in R
- Lab 02 - Intro to multiple linear models in R
- Lab 03 - Anova lab
- Lab 04 - Model selection lab
- Lab 05 - Intro to categorical predictors
- Lab 06 - Multicollinearity and influential points
- Lab 07 - Assignment with Solution - Transformations and model selection
- Lab 08 - Intro to GLM
- Lab 09 - More about GLM
- Lab 10 - Spam Detection
- Lab 00 - R revision and optimal predictors
- Lab 01 - Professor's Solution - R and Simple Linear Regression
- Lab 02 - Professor's Solution - Linear regression in R
- Lab 03 - Professor's Solution - More on Linear regression and simulation
- SLR Exercise - Exercise with Solution
- Lab 04 - Professor's Solution - Multiple Linear Regression
- Lab 05 - Professor's Solution - Multiple Linear Regression, model assessment
- Lab 06 - Professor's Solution - Model selection
- Lab 07 - Professor's Solution - Categorical variables
- Lab 08 - Professor's Solution - Things that can go wrong
- Lab 09 - Professor's Solution - Transformation, model selection and estimation
- Lab 10 - Professor's Solution - GLMs in R
- Lab 11 - Professor's Solution - More about GLMs
- Lab 12 - Professor's Solution - Classification for binary classifiers
The course page and the slides already list the material used, but I will try to cite it wherever possible.
- Julian J. Faraway, 2014. Linear Models with R, Second Edition, Chapman and Hall/CRC
- Julian J. Faraway, 2016. Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models, Second Edition, Chapman and Hall/CRC
- James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning. Springer
- David Dalpiaz - Applied Statistics with R!
- Julian J. Faraway - Practical Regression and Anova using R
- Cosma Shalizi - A draft textbook on data analysis methods
- Rafael A. Irizarry - Introduction to Data Science
- Hastie, Tibshirani and Friedman - The Elements of Statistical Learning
Maintained by PayThePizzo
Shoutout to 2 crazy helpful LaTeX-related resources:
I take no credit for any of the material included, nor can I guarantee the correctness of my notes, since I am not a statistics professor.
All the material cited is intended for personal use and I highly recommend purchasing the textbooks cited.
Please contact me as soon as possible if you feel that:
- Your work has not been cited correctly, or you want me to remove it
- There are errors I should correct, or sections that are unclear.
PS: Sorry, but some parts are in Italian, as I was trying to jot down as much as I could.