Geely Auto, a Chinese automobile company, aspires to enter the US market by setting up a manufacturing unit there and producing cars locally to compete with its US and European counterparts.
They have contracted an automobile consulting company to understand the factors on which the pricing of cars depends. Specifically, they want to understand the factors affecting car prices in the American market, since those may be very different from the Chinese market. The company wants to know:
- Which variables are significant in predicting the price of a car
- How well those variables describe the price of a car

Based on various market surveys, the consulting firm has gathered a large dataset of different types of cars across the American market.
You are required to model the price of cars with the available independent variables. The model will be used by management to understand exactly how prices vary with the independent variables. They can accordingly adjust the design of the cars, the business strategy, etc. to meet certain price levels. Further, the model will be a good way for management to understand the pricing dynamics of a new market.
- `Toyota` seems to be the most favored car company.
- `gas`-fueled cars outnumber `diesel`-fueled ones.
- `sedan` is the most preferred car type.
- Symboling values `0` and `1` have the highest number of rows (i.e. they are the most sold).
- Cars with `-1` symboling seem to be high priced (which makes sense, since an insurance risk rating of -1 is quite good). However, symboling `3` has a price range similar to that of `-2`. There is a dip in price at symboling `1`.
- `ohc` seems to be the most favored engine type.
- `ohcv` has the highest price range (while `dohcv` has only one row); `ohc` and `ohcf` have the lowest price ranges.
- `Jaguar` and `Buick` seem to have the highest average price.
- `diesel` has a higher average price than `gas`.
- `hardtop` and `convertible` have higher average prices.
- The `doornumber` variable does not affect the price much; there is no significant difference between its categories.
- Cars with `turbo` aspiration seem to have a higher price range than `std` ones (though the latter has some high values outside the whiskers).
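
The category-level comparisons above can be checked with a simple group-by. The following is a minimal sketch, assuming the data sits in a pandas DataFrame with a `price` column and one column per categorical variable. The column name `fueltype`, the helper `price_summary` and all the rows are hypothetical and only illustrate the idea; they are not values from the real dataset.

```python
import pandas as pd

# Toy rows for illustration only; these are NOT values from the real dataset.
cars = pd.DataFrame({
    "fueltype":   ["gas", "gas", "diesel", "diesel", "gas", "gas"],
    "aspiration": ["std", "turbo", "turbo", "std", "std", "turbo"],
    "price":      [7000.0, 12000.0, 18000.0, 14000.0, 9000.0, 15000.0],
})

def price_summary(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Row count, mean and median price for each category of `column`."""
    return (
        df.groupby(column)["price"]
          .agg(["count", "mean", "median"])
          .sort_values("mean", ascending=False)
    )

fuel_summary = price_summary(cars, "fueltype")
aspiration_summary = price_summary(cars, "aspiration")
print(fuel_summary)
print(aspiration_summary)
```

Reading `count` next to `mean` helps flag categories that have too few rows to support an inference.
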
![image](https://user-images.githubusercontent.com/98708792/233800354-072904b3-75a8-4644-8c34-6f3a71c96e39.png)
#### Inference :
- There are too few data points in the `enginelocation` categories to make an inference.
- The most common numbers of cylinders are `four`, `six` and `five`, though `eight`-cylinder cars have the highest price range.
- `mpfi` and `2bbl` are the most common fuel systems. `mpfi` and `idi` have the highest price range, but there are too few data points in the other categories to derive any meaningful inference.
- There is a very significant difference across the `drivewheel` categories. Most high-range cars seem to prefer `rwd` drivewheel.
![image](https://user-images.githubusercontent.com/98708792/233800392-ee7ddc04-3bae-43e8-aaa4-5033691fb3df.png)
#### Inference :
- `carwidth`, `carlength` and `curbweight` seem to have a positive correlation with `price`.
- `carheight` doesn't show any significant trend with price.
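
The visual trends above can be backed by a correlation check. The sketch below is a minimal illustration using pandas `DataFrame.corr` (Pearson by default). The column names follow the ones mentioned above, but every number is made up for the example and does not come from the real dataset.

```python
import pandas as pd

# Toy rows for illustration only; these are NOT values from the real dataset.
cars = pd.DataFrame({
    "carwidth":   [63.8, 64.0, 65.5, 66.5, 68.4, 70.3],
    "curbweight": [1900, 2050, 2300, 2700, 3000, 3500],
    "carheight":  [54.3, 50.8, 55.7, 52.0, 56.1, 53.0],
    "price":      [6500.0, 7800.0, 9900.0, 13500.0, 17000.0, 25000.0],
})

# Pearson correlation of every numeric column with price, strongest first.
price_corr = cars.corr()["price"].drop("price").sort_values(ascending=False)
print(price_corr)
```

In this toy data `carwidth` and `curbweight` are built to rise with `price`, while `carheight` is not, mirroring the pattern described in the inference.
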
- R-squared and Adjusted R-squared (extent of fit): 0.899 and 0.896, i.e. about `90%` of the variance is explained.
- F-statistic and Prob(F-statistic) (overall model fit): 308.0 and 1.04e-67 (approx. 0.0). The model fit is significant, and the `90%` explained variance is not just by chance.
- p-values: the p-values for all the coefficients are less than the significance level of 0.05, meaning that all the predictors are statistically significant.
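
For reference, both the adjusted R-squared and the overall F-statistic follow from R-squared, the number of observations `n` and the number of predictors `p`. The sketch below uses the standard formulas for an OLS model with an intercept. The values `n = 143` and `p = 4` are assumptions chosen purely for illustration, since the summary above does not state the sample size or the number of predictors; only `0.899` is taken from it.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R-squared for an OLS model with intercept, n rows, p predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def f_statistic(r2: float, n: int, p: int) -> float:
    """Overall F-statistic for an OLS model with intercept."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))

# ASSUMPTION: n and p are hypothetical; only r2 = 0.899 comes from the summary.
n, p = 143, 4
adj = adjusted_r2(0.899, n, p)
f = f_statistic(0.899, n, p)
print(round(adj, 3), round(f, 1))
```

Because adjusted R-squared penalizes extra predictors, a small gap between it and R-squared suggests the model is not padded with uninformative variables.
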