
Sushil-Deore/Insurance-Premium-Prediction

Insurance Premium Prediction:

This project gives people an estimate of the insurance premium they need based on their individual health situation. With that estimate, customers can work with any health insurance carrier and its plans and perks while keeping the projected cost from this study in mind. I consider the variables age, sex, BMI, number of children, smoking habits, and region of residence to predict the premium. This can help a person concentrate on the health side of an insurance policy rather than on its less useful parts.

_Data source_: https://www.kaggle.com/noordeen/insurance-premium-prediction

Approach:

  1. Loaded the dataset using Pandas and performed basic checks, such as the data type of each column and the presence of missing values.
  2. Performed exploratory data analysis:
  • Visualized each predictor (independent feature) against the target feature to see how the two relate.
  • For further insight, plotted both Pearson and Spearman correlations, which agreed with the visualizations.
  • Plotted the distribution of the target feature, expenses, which was approximately normal with a slight right skew.
  • Checked for the presence of outliers in all the columns.
  3. Experimented with various ML algorithms:
  • First, tried linear regression models, including the ridge and lasso approaches. Performance metrics were calculated for all the approaches. The test RMSE was slightly lower than with the other approaches. Then, performed a residual analysis, and the model satisfied all the assumptions of linear regression.
  • Next, tried various tree-based models and performed hyperparameter tuning with GridSearchCV to find the best hyperparameters for each model. Then, picked the top features according to each model's feature importance. The models were evaluated on both the training and test data, and the performance metrics were recorded.
  • Based on the performance metrics of both the linear and the tree-based models, the XGBoost regressor performed best, followed by the random forest regressor.
  4. Deployment: Deployed the XGBoost regressor model using Flask for the backend, with an HTML5 web page for the frontend UI.
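
The data checks, correlation analysis, and grid search described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data frame is synthetic (made-up values standing in for the Kaggle CSV), the column names other than `expenses` are assumptions, the parameter grid is hypothetical, and a random forest is used in place of XGBoost to keep the dependencies to pandas, NumPy, and scikit-learn.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the Kaggle CSV: all values are made up for illustration.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(18, 65, n),
    "sex": rng.choice(["female", "male"], n),
    "bmi": rng.normal(30, 6, n).round(1),
    "children": rng.integers(0, 5, n),
    "smoker": rng.choice(["yes", "no"], n, p=[0.2, 0.8]),
    "region": rng.choice(["northeast", "northwest", "southeast", "southwest"], n),
})
df["expenses"] = (
    250 * df["age"] + 300 * df["bmi"]
    + 20000 * (df["smoker"] == "yes") + rng.normal(0, 2000, n)
)

# Basic checks: column data types and missing values per column.
print(df.dtypes)
missing_per_column = df.isna().sum()

# Pearson and Spearman correlations on the numeric columns only.
numeric = df.select_dtypes(include="number")
pearson = numeric.corr(method="pearson")
spearman = numeric.corr(method="spearman")

# One-hot encode the categorical columns and split into train and test sets.
X = pd.get_dummies(df.drop(columns="expenses"), drop_first=True)
y = df["expenses"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Grid search over a small, hypothetical hyperparameter grid.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, 6]},
    scoring="neg_root_mean_squared_error",
    cv=3,
)
search.fit(X_train, y_train)

# Test RMSE and per-feature importance of the best model found.
test_rmse = float(np.sqrt(mean_squared_error(y_test, search.predict(X_test))))
importances = pd.Series(
    search.best_estimator_.feature_importances_, index=X.columns
).sort_values(ascending=False)
print(search.best_params_, round(test_rmse, 2))
print(importances)
```

Comparing the training and test RMSE of each tuned model in the same way gives the kind of side-by-side metrics the model comparison relies on.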

At each step in both the development and deployment parts, logging is performed; the logs are stored in the Jupyter_Notebook_logs.log and app_deployment_logs.log files, respectively.
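
One way to set up this kind of per-stage file logging with Python's standard `logging` module is sketched below. The helper name `build_logger`, the logger names, and the message format are assumptions made for illustration; only the two log file names come from this project.

```python
import logging


def build_logger(name: str, log_file: str) -> logging.Logger:
    """Return a logger that appends INFO-level (and above) records to log_file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Attach the file handler only once, so repeated calls (for example when
    # re-running a notebook cell) do not produce duplicate log lines.
    if not logger.handlers:
        handler = logging.FileHandler(log_file)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

For example, `build_logger("notebook", "Jupyter_Notebook_logs.log").info("Dataset loaded")` would append one timestamped line to the development log, and a second logger created with `app_deployment_logs.log` would do the same for the deployed app.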

So, the insurance premium can now be found quickly by passing the details mentioned above as input to the web application 😊
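
Before the submitted details reach the model, the backend has to turn them into a clean feature row. The sketch below shows one possible validation step. The function name `parse_premium_form`, the field names, and the allowed category values are all assumptions for illustration; the deployed app may handle its inputs differently.

```python
# Allowed values for the categorical inputs (assumed, not taken from the app).
ALLOWED = {
    "sex": {"female", "male"},
    "smoker": {"yes", "no"},
    "region": {"northeast", "northwest", "southeast", "southwest"},
}


def parse_premium_form(form):
    """Convert a mapping of submitted strings into a typed, validated feature row."""
    # int() and float() raise ValueError themselves on non-numeric input.
    row = {
        "age": int(form["age"]),
        "bmi": float(form["bmi"]),
        "children": int(form["children"]),
    }
    if row["age"] < 0 or row["bmi"] <= 0 or row["children"] < 0:
        raise ValueError("age and children must be >= 0 and bmi must be > 0")
    for field, allowed in ALLOWED.items():
        value = str(form[field]).strip().lower()
        if value not in allowed:
            raise ValueError(f"{field} must be one of {sorted(allowed)}, got {value!r}")
        row[field] = value
    return row
```

In a Flask view, a function like this could be called on a mapping such as `request.form`, with the returned row then encoded the same way as the training data before prediction.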

Web Deployment

Deployed on the web using Heroku (PaaS). URL: https://insurancepremiumpred.herokuapp.com/

Screenshots

UI

Tools and Technologies used

Tools

High Level Design:

URL: https://drive.google.com/file/d/1ZAUo4g3TwfyOeyUFykpqMeX9jBt8ncve/view?usp=sharing

Low Level Design:

URL: https://drive.google.com/file/d/1k71y1neg0DjfPJGub7KRl9NJsltYnYO_/view?usp=sharing

Architecture:

URL: https://drive.google.com/file/d/1rEfdDFm_MEq0oBVvuj5JGaI-kkdy6zIG/view?usp=sharing

Detailed Project report:

URL: https://drive.google.com/file/d/1JfNJtK5MxruYYZ_2jfrY8SBKtQAxqIME/view?usp=sharing

Wireframe document:

URL: https://drive.google.com/file/d/1j21TTP3bh5hZZuWBodOAvTIWGWR39aKn/view?usp=sharing

Demo video:

URL: https://drive.google.com/file/d/1c8dNntxZRUIMCpzDu7zjVHIl5Fb43UTz/view?usp=sharing

Reference:

Author:
