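Modelling bacteria using exponential regression. This is done by transforming the exponential data into linear data using a log transform and fitting a linear regression. The residuals of the linear fit are then checked against the usual assumptions: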
- Residuals have constant variance
- Residuals are independent
- Residuals are normally distributed
The residual analysis was performed with graphical methods, although more sophisticated statistical tests for normality and correlation could be useful.
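As a minimal sketch of the log-transform approach, assuming a model of the form y = a * exp(b * t) with multiplicative noise: the data below is synthetic, and the name `fit_exponential` is hypothetical rather than taken from the notebook.

```python
import numpy as np

def fit_exponential(t, y):
    """Fit y = a * exp(b * t) by ordinary least squares on log(y)."""
    # log(y) = log(a) + b * t is linear in t, so a degree-1 polyfit suffices.
    slope, intercept = np.polyfit(t, np.log(y), 1)
    return np.exp(intercept), slope

# Synthetic growth data (made up for illustration, not the bacteria dataset).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 30)
y = 2.0 * np.exp(0.8 * t) * np.exp(rng.normal(0.0, 0.05, t.size))

a, b = fit_exponential(t, y)

# Residuals on the log scale; these are what the assumptions above refer to.
residuals = np.log(y) - (np.log(a) + b * t)
```

Plotting `residuals` against `t`, or in a Q-Q plot, gives the graphical checks mentioned above.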
Next, confidence and prediction bands were calculated for the regression model, as shown below. The confidence band gives a range for the fitted curve itself, reflecting how the fitted model would change from one sample of the population to another. The prediction band gives a range of possible values for a new observation, so it is wider because it also includes the noise of that observation.
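One way to compute both bands for a simple linear fit is sketched here, using the standard t-based formulas. It assumes a single predictor and normally distributed errors. The function name `linear_bands` and the data are illustrative only.

```python
import numpy as np
from scipy import stats

def linear_bands(x, y, x_new, alpha=0.05):
    """Return the fit and the half-widths of the confidence and prediction bands."""
    n = x.size
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    s = np.sqrt(resid @ resid / (n - 2))          # residual standard error
    sxx = np.sum((x - x.mean()) ** 2)
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    leverage = 1.0 / n + (x_new - x.mean()) ** 2 / sxx
    y_hat = intercept + slope * x_new
    ci = t_crit * s * np.sqrt(leverage)           # uncertainty of the mean response
    pi = t_crit * s * np.sqrt(1.0 + leverage)     # adds the noise of a new observation
    return y_hat, ci, pi

# Synthetic data (made up for illustration).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 25)
y = 1.5 + 0.7 * x + rng.normal(0.0, 0.5, x.size)
x_new = np.linspace(0.0, 10.0, 50)

y_hat, ci, pi = linear_bands(x, y, x_new)
# Bands are y_hat +/- ci and y_hat +/- pi.
```

For the exponential model, the same bands would be computed on the log scale and then mapped back with `np.exp`.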
Modelling the welding strength based on the current used in the welding process can be done with polynomial regression. In this notebook, it is done using polynomials of varying degree. Using higher-degree polynomials leads to less bias but more variance. To counteract this, Tikhonov regularization is applied, which reduces the overfitting of the polynomial to the data.
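A minimal sketch of polynomial regression with Tikhonov (ridge) regularization in closed form follows. It assumes the penalty is the identity matrix scaled by `lam` with the intercept left unpenalized, which may differ from the choice made in the notebook. The data is synthetic and `ridge_polyfit` is a hypothetical name.

```python
import numpy as np

def ridge_polyfit(x, y, degree, lam):
    """Solve (X^T X + lam * I) w = X^T y for polynomial features of x."""
    X = np.vander(x, degree + 1, increasing=True)  # columns: 1, x, x^2, ...
    penalty = lam * np.eye(degree + 1)
    penalty[0, 0] = 0.0                            # assumption: intercept is not penalized
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

# Synthetic data on a rescaled input in [-1, 1] (not the welding dataset).
rng = np.random.default_rng(2)
x = np.linspace(-1.0, 1.0, 20)
y = np.sin(np.pi * x) + rng.normal(0.0, 0.1, x.size)

w_ols = ridge_polyfit(x, y, degree=8, lam=0.0)     # unregularized fit
w_ridge = ridge_polyfit(x, y, degree=8, lam=1e-2)  # regularized fit
```

The regularized coefficients are shrunk towards zero, which smooths the fitted curve at the cost of some added bias.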
Also, confidence and prediction bands were found as shown below.
Classification of Titanic survival is done with a self-implemented logistic regression model. Based on the features sex, class and age, the model has an accuracy of ~ 75 %. A comparison with sklearn's logistic regression and kNN classifier is made. The logistic regression results are nearly identical (sklearn regularizes too), and the results are better than those of kNN.
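A self-implemented logistic regression could look roughly like the sketch below. It assumes plain gradient descent on the mean log-loss with an optional L2 penalty. The function names, hyperparameters and synthetic features are illustrative and not the notebook's actual code or the Titanic data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_bias(X):
    """Prepend a column of ones so the first weight acts as the intercept."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

def fit_logistic(X, y, lr=0.1, n_iter=2000, l2=0.0):
    """Gradient descent on the mean log-loss, with an optional L2 penalty."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        grad = Xb.T @ (sigmoid(Xb @ w) - y) / y.size
        grad[1:] += l2 * w[1:]  # assumption: the intercept is not penalized
        w -= lr * grad
    return w

def predict(X, w):
    return (sigmoid(add_bias(X) @ w) >= 0.5).astype(int)

# Synthetic two-feature data with a noisy linear boundary (made up for illustration).
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 2))
y = (X[:, 0] - X[:, 1] + 0.3 * rng.normal(size=200) > 0).astype(int)

w = fit_logistic(X, y, l2=0.01)
accuracy = np.mean(predict(X, w) == y)
```

With categorical inputs such as sex and class, the features would first need to be encoded numerically before being passed to `fit_logistic`.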
Classification of handwritten digits (MNIST dataset). Using full gradient descent without batching, the model yields an accuracy of ~ 90 %. The confusion matrix is displayed below.
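The text does not say which model is used, so the sketch below assumes a softmax (multinomial logistic) regression trained with full-batch gradient descent, followed by a confusion matrix. It runs on a small synthetic three-class problem instead of MNIST, and all names and hyperparameters are illustrative.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_softmax(X, y, n_classes, lr=0.1, n_iter=1000):
    """Full-batch gradient descent: every step uses the whole dataset."""
    W = np.zeros((X.shape[1], n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]              # one-hot targets
    for _ in range(n_iter):
        P = softmax(X @ W + b)
        W -= lr * X.T @ (P - Y) / X.shape[0]
        b -= lr * (P - Y).mean(axis=0)
    return W, b

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

# Three synthetic Gaussian blobs (made up for illustration, not MNIST).
rng = np.random.default_rng(4)
centers = np.array([[-3.0, -2.0], [3.0, -2.0], [0.0, 3.0]])
y = np.repeat(np.arange(3), 100)
X = centers[y] + rng.normal(0.0, 0.7, size=(300, 2))

W, b = fit_softmax(X, y, n_classes=3)
y_pred = np.argmax(X @ W + b, axis=1)
cm = confusion_matrix(y, y_pred, n_classes=3)
accuracy = np.trace(cm) / cm.sum()
```

For MNIST the same code would apply with flattened, scaled pixel vectors as `X` and `n_classes=10`, although full-batch updates on the whole training set are slow compared to mini-batches.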
Analysing a sea-level dataset from NASA. AR, MA and ARMA models are used, in addition to a nonlinear AR model based on a feedforward neural network. The results of the network approach are, however, quite disappointing.
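To illustrate the linear AR part, here is a minimal sketch that fits an AR(p) model by ordinary least squares on lagged values. It assumes a stationary series and uses a simulated AR(2) process in place of the sea-level data. `fit_ar` and `forecast_one_step` are hypothetical names.

```python
import numpy as np

def fit_ar(series, p):
    """Least-squares fit of x_t = c + sum_k phi_k * x_{t-k}, k = 1..p."""
    n = len(series)
    # Column k-1 holds the lag-k values aligned with the targets series[p:].
    lags = np.column_stack([series[p - k : n - k] for k in range(1, p + 1)])
    design = np.hstack([np.ones((n - p, 1)), lags])
    coef, *_ = np.linalg.lstsq(design, series[p:], rcond=None)
    return coef[0], coef[1:]

def forecast_one_step(series, c, phi):
    """Predict the next value from the last p observations, most recent first."""
    p = len(phi)
    return c + phi @ series[-1 : -p - 1 : -1]

# Simulated AR(2) process (made up for illustration, not the NASA dataset).
rng = np.random.default_rng(5)
n = 2000
eps = rng.normal(size=n)
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + eps[t]

c, phi = fit_ar(x, p=2)
next_value = forecast_one_step(x, c, phi)
```

The nonlinear variant replaces the linear map from the lagged values to the next value with a neural network, while keeping the same lagged design matrix as input.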