In this R project, we use variable selection, regularization (Lasso & Ridge), PCR, and PLS to find the best model for this dataset.

sshreyas999/Model-Selection-on-Auto-Data


Model Selection on Auto Data

The code is written in R as an RStudio notebook. Open the R Markdown file in this repository for the code and commentary.

Objective

Our goal is to build a model that predicts a vehicle's fuel efficiency (mpg) from its other attributes.

Dataset

The Auto dataset is available in the ISLR package. The dataset contains 392 observations with 9 attributes for each observation. The attributes are briefly described below:

  1. mpg - miles per gallon
  2. cylinders - Number of cylinders between 4 and 8
  3. displacement - Engine displacement (cu. inches)
  4. horsepower - Engine horsepower
  5. weight - Vehicle weight (lbs.)
  6. acceleration - Time to accelerate from 0 to 60 mph (sec.)
  7. year - Model year (modulo 100)
  8. origin - Origin of car (1 = American, 2 = European, 3 = Japanese)
  9. name - Vehicle name

We ignore the name attribute because it takes too many distinct values to be a useful predictor. We fit each model on the full dataset and estimate test error through cross-validation.
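
The setup described above could be sketched as follows. This is a minimal illustration, not the notebook's actual code: the helper `cv_mse_lm`, the choice of 10 folds, the seed, and the decision to treat origin as a factor are all assumptions made for the example.

```r
library(ISLR)  # provides the Auto data frame

# Drop the free-text name column and keep the remaining eight attributes.
auto <- Auto[, setdiff(names(Auto), "name")]

# Assumption: treat origin as a categorical factor rather than a number.
auto$origin <- factor(auto$origin,
                      labels = c("American", "European", "Japanese"))

# Hypothetical helper: k-fold cross-validated MSE for a least-squares fit
# that predicts mpg.
cv_mse_lm <- function(data, formula, k = 10, seed = 1) {
  set.seed(seed)
  # Assign each row to one of k folds at random.
  folds <- sample(rep(seq_len(k), length.out = nrow(data)))
  fold_mse <- vapply(seq_len(k), function(i) {
    fit  <- lm(formula, data = data[folds != i, ])
    pred <- predict(fit, newdata = data[folds == i, ])
    mean((data$mpg[folds == i] - pred)^2)
  }, numeric(1))
  mean(fold_mse)
}

# Cross-validated test-error estimate for the full least-squares model.
cv_mse_lm(auto, mpg ~ .)
```

Every observation is used for both training and validation across the folds, which is how the full dataset can be used for fitting while still producing an estimate of test error.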

Outline

Exploratory Analysis of Dataset

Identify which variables are useful in predicting mpg, and apply transformations as required.

Model Fitting

Fit the model using:

  1. Standard Least Squares
  2. Best-subset selection
  3. Ridge regression
  4. Lasso regularization
  5. Principal Component Regression (PCR)
  6. Partial Least Squares (PLS)
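
One way the six fits listed above might look in R is sketched below. The package choices (`leaps`, `glmnet`, `pls`) are assumptions, since the README does not name them, and the sketch uses the untransformed predictors with origin left numeric, so it may differ from the notebook. Picking the subset size by BIC is likewise only one possible criterion.

```r
library(ISLR)    # Auto data
library(leaps)   # regsubsets() for best-subset selection
library(glmnet)  # ridge and lasso
library(pls)     # pcr() and plsr()

auto <- Auto[, setdiff(names(Auto), "name")]

# 1. Standard least squares on all predictors.
ols_fit <- lm(mpg ~ ., data = auto)

# 2. Best-subset selection over every model size.
subset_fit <- regsubsets(mpg ~ ., data = auto, nvmax = ncol(auto) - 1)
best_size  <- which.min(summary(subset_fit)$bic)  # assumption: choose by BIC

# glmnet needs a numeric predictor matrix and a response vector.
x <- model.matrix(mpg ~ ., data = auto)[, -1]  # drop the intercept column
y <- auto$mpg

# 3. Ridge regression (alpha = 0) and 4. lasso (alpha = 1),
#    each fitted over glmnet's default lambda path.
ridge_fit <- glmnet(x, y, alpha = 0)
lasso_fit <- glmnet(x, y, alpha = 1)

# 5. PCR and 6. PLS on standardized predictors.
pcr_fit <- pcr(mpg ~ ., data = auto, scale = TRUE)
pls_fit <- plsr(mpg ~ ., data = auto, scale = TRUE)

# Coefficients of the subset model with the lowest BIC.
coef(subset_fit, best_size)
```

Ridge, lasso, PCR, and PLS each have a tuning parameter (lambda or the number of components), so these fits are only a starting point until that parameter is chosen by cross-validation.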

Comparison & Conclusions

Compare the coefficients and cross-validated MSE of the fitted models to find the best one.
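
A comparison of cross-validated MSE for the tuned methods could be sketched as below. The helper `best_comp_mse`, the seed, and the default fold settings are assumptions; the numbers it prints are not expected to match the notebook, which may use transformed variables and different folds. Each function also draws its own folds, so the estimates are only roughly comparable.

```r
library(ISLR)
library(glmnet)
library(pls)

auto <- Auto[, setdiff(names(Auto), "name")]
x <- model.matrix(mpg ~ ., data = auto)[, -1]
y <- auto$mpg

set.seed(1)
# Cross-validate lambda for ridge (alpha = 0) and lasso (alpha = 1).
ridge_cv <- cv.glmnet(x, y, alpha = 0)
lasso_cv <- cv.glmnet(x, y, alpha = 1)
# Cross-validate the number of components for PCR and PLS.
pcr_fit <- pcr(mpg ~ ., data = auto, scale = TRUE, validation = "CV")
pls_fit <- plsr(mpg ~ ., data = auto, scale = TRUE, validation = "CV")

# Hypothetical helper: lowest CV MSE over component counts,
# skipping the intercept-only model in the first position.
best_comp_mse <- function(fit) {
  cv <- MSEP(fit, estimate = "CV")$val[1, 1, ]
  min(cv[-1])
}

cv_mse <- c(
  ridge = min(ridge_cv$cvm),
  lasso = min(lasso_cv$cvm),
  pcr   = best_comp_mse(pcr_fit),
  pls   = best_comp_mse(pls_fit)
)

# Methods ordered from lowest to highest estimated test error.
sort(cv_mse)

# Lasso coefficients at the CV-chosen lambda, for comparison across models.
coef(lasso_cv, s = "lambda.min")
```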

Conclusion

The best model was achieved through Partial Least Squares (PLS), which gave the lowest MSE: 8.677.
