CitySpire - Data Science

Mission

Be a one-stop resource for users to receive the most accurate city information.

Description

An app that analyzes city data such as population, cost of living, rental rates, crime rates, walkability (walk score), and many other social and economic factors that matter when deciding where to live. The app presents this data in an intuitive, easy-to-understand interface.

Use data to find the right place for you to live.

Data Engineering

The FastAPI app, deployed to AWS, provides three primary routes:

/cityspire is a GET route that provides all of the data in the database in a table format.
/locations is a GET route that provides a list of all cities in the database.
/location/data is a POST route that takes a request of location in the form of "City, State" and returns all of the data about that location.

Type	Endpoint	Required Parameters	Returns
GET	/cityspire	none	"[[0, 0, "Akron, Ohio", 197597.0, 678.0, 1782.0, 27.0, 181.0, 328.0, 1246.0, 6568.0, 1686.0, 4305.0, 577.0, 65.0, 8484.440553247267, 46, 46, 90.8, 7972.779227752733], ...]"
GET	/locations	none	{ "locations": ["Akron, Ohio", "Albany, New York", ...] }
POST	/location/data/	"location": "City, State"	{ "city_name": "El Paso, Texas", "population": 681728, "rent_per_month": 990, "walk_score": 41, "livability_score": 12687 }
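As a rough illustration of how a client might call the POST route in the table above, the sketch below builds the request and summarizes a response using only the Python standard library. It assumes the route accepts a JSON body with a single "location" key, as the table suggests. BASE_URL, build_location_request, and summarize are hypothetical names introduced here, not part of the CitySpire code.

```python
import json
from urllib.request import Request

# Hypothetical base URL (assumption); replace with your own deployment address.
BASE_URL = "http://localhost:8000"


def build_location_request(location, base_url=BASE_URL):
    """Build (but do not send) a POST request for the /location/data/ route."""
    # Assumption: the API expects a JSON body of the form {"location": "City, State"}.
    body = json.dumps({"location": location}).encode("utf-8")
    return Request(
        url=f"{base_url}/location/data/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(response_body):
    """Turn the JSON text returned by /location/data/ into a one-line summary."""
    data = json.loads(response_body)
    return (
        f"{data['city_name']}: population {data['population']}, "
        f"rent ${data['rent_per_month']}/month, walk score {data['walk_score']}"
    )
```

With the app running, passing the result of build_location_request("El Paso, Texas") to urllib.request.urlopen should send the request, and the decoded response body can then be handed to summarize.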

[TODO] More details about the API endpoints can be found at the ReDoc interface or by exploring the interactive SwaggerUI.

Machine Learning

Nearest Neighbors ML Model for CitySpire City Living Recommendations

The data wrangling and merging can be found in the wrangling.ipynb notebook. The tokenizing and TFIDF vectorizing of text, creation of the Nearest Neighbors model, training on the tokenized and vectorized text, and pickling of the Nearest Neighbors model and TFIDF Vectorizer can all be found in the rec_modeling.ipynb notebook in the notebooks directory.

The Nearest Neighbors and TFIDF Vectorizer pickles can be found in the pickles directory.

The pickled Nearest Neighbors model and TFIDF Vectorizer are imported into recommend.py in the app directory. There, a recommend function in the Data Engineering API uses them to recommend cities to users based on desired population, rental rate, crime rate, walkability score, cost of living index, and livability score.
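To make the idea behind the recommender concrete, here is a minimal, dependency-free sketch of TFIDF vectorizing followed by a nearest-neighbors lookup using cosine similarity. It is not the code from rec_modeling.ipynb or recommend.py. The city names and profile text are made up, every function name is hypothetical, and the real notebooks may tokenize and weight the text differently.

```python
import math
from collections import Counter

# Made-up city profiles (assumption): real profiles would come from the merged data.
CITY_PROFILES = {
    "City A": "large population high rent very walkable low crime",
    "City B": "small population low rent car dependent low crime",
    "City C": "large population high rent car dependent high crime",
}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def fit_idf(docs):
    """Compute a smoothed inverse document frequency for every term."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}


def vectorize(text, idf):
    """Map text to an L2-normalized TFIDF vector, ignoring unknown terms."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def recommend(query, profiles=CITY_PROFILES, k=2):
    """Return the k city names whose profiles are most similar to the query."""
    idf = fit_idf(list(profiles.values()))
    city_vectors = {name: vectorize(text, idf) for name, text in profiles.items()}
    q = vectorize(query, idf)

    def cosine(vec):
        # Both vectors are unit length, so the dot product is the cosine similarity.
        return sum(w * vec.get(t, 0.0) for t, w in q.items())

    ranked = sorted(city_vectors, key=lambda name: cosine(city_vectors[name]), reverse=True)
    return ranked[:k]
```

Calling recommend("small population low rent") ranks City B first, since it shares the most heavily weighted terms with the query. A pickled vectorizer and model play the same roles as fit_idf/vectorize and the ranking step here, with the fitting done once ahead of time.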

Deployment

The CitySpire API is backed by a Postgres DB in AWS RDS. The data was uploaded to the DB using the df_to_sql.py script in the notebooks directory.

After you create your own Postgres DB on AWS RDS, add the DB URL to a .env file:

DATABASE_URL=postgresql://DBusername:DBpassword@blah.blah.blah.us-east-1.rds.amazonaws.com/dbname
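One way application code could pick up this setting is sketched below. It assumes the variable has already been loaded into the process environment (pipenv loads a .env file automatically for pipenv shell and pipenv run). The function name read_database_settings is hypothetical and is not taken from the CitySpire app.

```python
import os
from urllib.parse import urlsplit


def read_database_settings(env=os.environ):
    """Read DATABASE_URL from the environment and split it into its parts."""
    url = env.get("DATABASE_URL")
    if not url:
        raise RuntimeError("DATABASE_URL is not set; add it to your .env file")
    parts = urlsplit(url)
    if parts.scheme != "postgresql":
        raise ValueError(f"expected a postgresql:// URL, got {parts.scheme!r}")
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port or 5432,  # 5432 is the default Postgres port
        "database": parts.path.lstrip("/"),
    }
```

Failing early with a clear message when the variable is missing tends to be easier to debug than a connection error raised later by the database driver.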

Commands to deploy locally:

Create virtual environment in root directory of project: pipenv shell

Install project dependencies in virtual environment: pipenv install --dev

Launch app locally: uvicorn app.main:app --reload

Launch app locally on different port: uvicorn app.main:app --reload --port 8080

The API app is deployed to AWS Elastic Beanstalk using a Dockerfile. It is crucial to organize all of the app directories into the app directory because the Dockerfile copies the app structure from the app directory, not the root directory of this repo.

Documentation on how to set up AWS and EB CLI

Commands to deploy to Elastic Beanstalk:

Commit your work:

git add --all
git commit -m "Your commit message"

Then use these EB CLI (Elastic Beanstalk command line interface) commands to deploy. Replace CHOOSE-YOUR-NAME with your own name.

eb init --platform docker --region us-east-1 CHOOSE-YOUR-NAME
eb create --region us-east-1 CHOOSE-YOUR-NAME

Do you have environment variables? Then configure environment variables in the Elastic Beanstalk console.

Now you can open your deployed app! 🎉

eb open

Commands to redeploy to Elastic Beanstalk:

Commit your work:

git add --all
git commit -m "Your commit message"

Then use these EB CLI (Elastic Beanstalk command line interface) commands to redeploy:

eb deploy
eb open

It is also possible to redeploy without committing your work with these commands:

git add .
eb deploy --staged

Data Sources

Population Data - https://www2.census.gov/programs-surveys/popest/tables/2010-2019/cities/totals/SUB-IP-EST2019-ANNRES.xlsx

Rental Rates - https://files.zillowstatic.com/research/public_v2/zori/Zip_ZORI_AllHomesPlusMultifamily_SSA.csv

Crime Rates - https://ucr.fbi.gov/crime-in-the-u.s/2019/crime-in-the-u.s.-2019/tables/table-8/table-8.xls/view

Walk Scores - https://www.walkscore.com/cities-and-neighborhoods/

Cost of Living Index - https://advisorsmith.com/data/coli/

Contributors

John Dailey	Neha Kumari	Theda	Mickey Wells
Data Scientist	Data Scientist	Data Scientist	Data Scientist
