The banking institution needs actionable insights into mortgage-backed securities, geographic business investment, and real estate analysis. The mortgage bank would like to estimate potential monthly mortgage expenses for each region based on monthly family income and real estate rental costs. A statistical model is needed to predict potential loan demand, in dollars, for each region in the USA. A dashboard is also needed that refreshes periodically after new data is retrieved from the agencies. The dashboard must show relationships and trends for the key metrics across regions, such as the number of loans, average rental income, monthly mortgage and owner's costs, and family income compared with mortgage cost.
The dataset contains the following variables:
Second Mortgage: Statistics on households with a second mortgage
Home Equity: Statistics on households with a home equity loan
Debt: Statistics on households with any type of debt
Mortgage Costs: Statistics on mortgage payments, home equity loans, utilities, and property taxes
Home Owner Costs: Statistics on the sum of utilities and property taxes
Gross Rent: Contract rent plus the estimated average monthly cost of utilities
High School Graduation: High school graduation statistics
Population Demographics: Population demographic statistics
Age Demographics: Age demographic statistics
Household Income: Total income of people residing in the household
Family Income: Total income of people related to the householder
Objectives:
Create a statistical model to predict potential loan demand, in dollars, for each region in the USA based on monthly family income and real estate rental costs.
Create a dashboard that refreshes periodically after new data is retrieved from the agencies.
Show relationships and trends for the key metrics across regions, such as the number of loans, average rental income, monthly mortgage and owner's costs, and family income compared with mortgage cost.
Analyze the dataset with exploratory data analysis techniques, such as data visualization, to gain insights into population density, age demographics, and debt.
Perform correlation analysis on all relevant variables and present it as a heatmap.
Technologies Used:
Python - for data analysis and machine learning
Excel - for data cleaning, transformation, and visualization
Tableau - for data reporting
In conclusion, this project set out to gain actionable insights into mortgage-backed securities, geographic business investment, and real estate analysis. A statistical model has been created to predict potential loan demand, in dollars, for each region in the USA based on monthly family income and real estate rental costs. A dashboard has been created that refreshes periodically after new data is retrieved from the agencies. It shows relationships and trends for the key metrics across regions, such as the number of loans, average rental income, monthly mortgage and owner's costs, and family income compared with mortgage cost. The dataset has been analyzed with exploratory data analysis techniques, such as data visualization, to gain insights into population density, age demographics, and debt. Finally, correlation analysis has been performed on all relevant variables and presented as a heatmap. The project used Python for data analysis and machine learning, Excel for data cleaning, transformation, and visualization, and Tableau for data reporting.