Kaggle: https://www.kaggle.com/competitions/house-prices-advanced-regression-techniques
With 79 explanatory variables describing (almost) every aspect of residential homes in Ames, Iowa, this competition challenges you to predict the final price of each home.
- Data Import and Exploration
- Data Processing
  - Missing Data
  - Outliers
  - Data Type Correction
  - Label Encoding
  - Skewness
- Feature Engineering: New Features
- Modelling
  - Lasso Regression
  - Ridge Regression
  - Elastic Net Regression
- Performance Validation
  - Train/Validation Split
  - Cross-Validation
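
As a rough illustration of three of the processing steps listed above (missing data, label encoding, skewness), here is a minimal sketch. It is not the project's actual code: the tiny DataFrame, the `preprocess` helper, the median/"None" fill strategy, and the 0.75 skewness threshold are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Tiny made-up frame standing in for the Ames data (values are illustrative).
df = pd.DataFrame({
    "LotArea": [8450.0, 9600.0, 11250.0, 9550.0, 215245.0, np.nan],
    "GarageType": ["Attchd", "Detchd", None, "Attchd", "BuiltIn", "Attchd"],
    "OverallQual": [7, 6, 7, 7, 5, 8],
})

def preprocess(frame, skew_threshold=0.75):
    """Hypothetical helper: fill gaps, label-encode, and log-transform skewed columns."""
    out = frame.copy()
    num_cols = out.select_dtypes(include="number").columns
    cat_cols = out.columns.difference(num_cols)

    # Missing data: median for numeric columns, an explicit "None" level for categoricals.
    out[num_cols] = out[num_cols].fillna(out[num_cols].median())
    out[cat_cols] = out[cat_cols].fillna("None")

    # Label encoding: map each category to an integer code.
    for col in cat_cols:
        out[col] = LabelEncoder().fit_transform(out[col].astype(str))

    # Skewness: apply log1p to numeric columns that are strongly right-skewed.
    skewness = out[num_cols].skew()
    skewed = skewness[skewness > skew_threshold].index
    out[skewed] = np.log1p(out[skewed])
    return out, list(skewed)

clean, skewed_cols = preprocess(df)
print(clean)
print("log-transformed columns:", skewed_cols)
```

Only right-skewed columns are transformed here, since `log1p` compresses a long upper tail; a different rule (or a Box-Cox transform) may suit the real data better.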
A detailed explanation can be found in the attached report.
After EDA and data pre-processing, we trained on 90% of the training data and held out the remaining 10% for validation. We then fit Lasso and Ridge regularized regressions, tuned them with cross-validation, and selected the final model based on predictive performance. In our analysis, Lasso was the most effective for this problem, achieving the lowest mean squared error.
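
The validation procedure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the report: it uses synthetic data from `make_regression` in place of the processed Ames features, and the alpha grid, 5-fold cross-validation, and `StandardScaler` step are choices made for the example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV, RidgeCV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the processed features and target (assumption).
X, y = make_regression(n_samples=300, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

# 90% for training, 10% held out for validation.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.1, random_state=0)

# Candidate regularization strengths; each model picks its alpha by 5-fold CV.
alphas = np.logspace(-3, 2, 30)
models = {
    "lasso": make_pipeline(StandardScaler(),
                           LassoCV(alphas=alphas, cv=5, max_iter=10000)),
    "ridge": make_pipeline(StandardScaler(),
                           RidgeCV(alphas=alphas, cv=5)),
}

# Fit on the training split and score each model on the held-out 10%.
val_mse = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    val_mse[name] = mean_squared_error(y_val, model.predict(X_val))

best_model = min(val_mse, key=val_mse.get)
print(val_mse)
print("lowest validation MSE:", best_model)
```

Which model wins depends on the data; on the synthetic set used here the ranking says nothing about the house-price results reported above.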