- Performed time series analysis
- Created a multiple linear regression model for predicting weekly sales
- Performed residual analysis
- Performed model diagnostics
- Tested the model with new data
The dataset used in this project is 'Walmart.csv'.
- `caTools`: library for splitting the data into training and testing sets.
- Import the Dataset: The dataset is imported using the `read.csv()` function.
- Data Summary: A summary of the dataset is produced.
- Time Series Analysis: The 'Date' column is converted to Date type and the data is sorted by date. The time series of weekly sales is then plotted.
- Correlation Calculation: The correlation between 'Fuel_Price' and 'Weekly_Sales' is calculated.
- Regression Analysis: The data is split into a training set (70% of the data) and a testing set (30% of the data) using the `sample.split()` function from the `caTools` library. A multiple linear regression model is then fitted to predict 'Weekly_Sales' from 'Temperature', 'Fuel_Price', 'CPI', and 'Unemployment'. The model is summarized using the `summary()` function.
- Residual Analysis: The residuals (the differences between the observed and predicted values) are calculated, and a histogram and a Q-Q plot of the residuals are plotted.
- Model Diagnostics: Diagnostic plots of the model are generated using the `plot()` function.
- Prediction Using New Data: 'Weekly_Sales' is predicted for new data by creating a data frame and passing it to the `predict()` function.
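The steps above can be sketched end to end as follows. This is a minimal illustration, not the project's actual script: it generates a small synthetic stand-in for 'Walmart.csv' (all values are made up), assumes the 'Date' column is stored as `dd-mm-yyyy` text, and uses illustrative variable names such as `walmart_data` and `in_train`. If `caTools` is not installed, the sketch falls back to a base-R split so that it still runs.

```r
set.seed(42)

# Synthetic stand-in for 'Walmart.csv'; every value here is made up.
n <- 120
fake <- data.frame(
  Date         = format(seq(as.Date("2010-02-05"), by = "week", length.out = n),
                        "%d-%m-%Y"),  # assumed date format
  Temperature  = runif(n, 20, 95),
  Fuel_Price   = runif(n, 2.5, 4.5),
  CPI          = runif(n, 125, 230),
  Unemployment = runif(n, 4, 12)
)
fake$Weekly_Sales <- 1.5e6 - 2e3 * fake$Temperature -
  4e4 * fake$Unemployment + rnorm(n, sd = 1e5)
fake <- fake[sample(n), ]  # shuffle rows so the sort step has work to do
csv_path <- file.path(tempdir(), "Walmart.csv")
write.csv(fake, csv_path, row.names = FALSE)

# Import the dataset and summarize it.
walmart_data <- read.csv(csv_path)
print(summary(walmart_data))

# Time series analysis: convert 'Date', sort by date, plot weekly sales.
walmart_data$Date <- as.Date(walmart_data$Date, format = "%d-%m-%Y")
walmart_data <- walmart_data[order(walmart_data$Date), ]
plot(walmart_data$Date, walmart_data$Weekly_Sales, type = "l",
     xlab = "Date", ylab = "Weekly sales")

# Correlation between fuel price and weekly sales.
fuel_sales_cor <- cor(walmart_data$Fuel_Price, walmart_data$Weekly_Sales)

# 70/30 train/test split.
if (requireNamespace("caTools", quietly = TRUE)) {
  in_train <- caTools::sample.split(walmart_data$Weekly_Sales, SplitRatio = 0.7)
} else {
  # Base-R fallback, used only when caTools is unavailable.
  in_train <- seq_len(n) %in% sample(n, round(0.7 * n))
}
train <- walmart_data[in_train, ]
test  <- walmart_data[!in_train, ]

# Multiple linear regression on the training set.
model <- lm(Weekly_Sales ~ Temperature + Fuel_Price + CPI + Unemployment,
            data = train)
print(summary(model))

# Residual analysis: histogram and Q-Q plot.
res <- residuals(model)
hist(res, main = "Histogram of residuals", xlab = "Residual")
qqnorm(res)
qqline(res)

# Model diagnostics: the four standard lm diagnostic plots on one page.
par(mfrow = c(2, 2))
plot(model)
par(mfrow = c(1, 1))

# Prediction for new data (the input values are illustrative only).
new_data <- data.frame(Temperature = 60, Fuel_Price = 3.2,
                       CPI = 210, Unemployment = 7.5)
predicted_sales <- predict(model, newdata = new_data)
print(predicted_sales)
```

The columns of the new data frame must carry the same names as the predictors in the model formula, otherwise `predict()` cannot match them.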
The output of the program includes the summary of the regression model, the residuals, the diagnostic plots, and the predicted 'Weekly_Sales' for the new data.
Please ensure that the 'Walmart.csv' file is in the correct directory before running the program, or use the `file.choose()` function to select the file. Also, make sure to install the `caTools` library in R using `install.packages("caTools")` if it is not already installed.
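One way to automate both checks is sketched below. The helper names `ensure_package()` and `resolve_csv_path()` are hypothetical and not part of the project. The sketch assumes the file is looked up relative to the current working directory and that `file.choose()` is only offered in an interactive session.

```r
# Hypothetical helper: install a package only if it is missing.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Hypothetical helper: use the given path if the file exists,
# otherwise let the user pick the file interactively.
resolve_csv_path <- function(path = "Walmart.csv") {
  if (file.exists(path)) {
    return(path)
  }
  if (interactive()) {
    return(file.choose())
  }
  stop("File not found: ", path)
}

# Example usage (commented out so the sketch has no side effects):
# ensure_package("caTools")
# walmart_data <- read.csv(resolve_csv_path())
```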