- This project is the result of my final thesis at the Data Science Institute, by Fabian Rappert.
- Scraped almost 300,000 places and over 2.7 million comments from park4night using Python.
- Created a MySQL database.
- Created a Streamlit app for searching and filtering places.
- Performed a data analysis with Tableau on the scraped data.
Python Version: 3.11
Packages: requests, BeautifulSoup, pandas
Every place has a unique place ID and thus its own web address, accessible via the base URL https://park4night.com/en/place/ followed by the place ID. For example: https://park4night.com/en/place/88726. The following picture shows where to find the scraped data.
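The URL scheme above can be sketched as a tiny helper (the function name is illustrative, not part of the project):

```python
# Sketch of the URL scheme described above: base URL plus place ID.
BASE_URL = "https://park4night.com/en/place/"

def place_url(place_id: int) -> str:
    """Return the web address for a given place ID."""
    return f"{BASE_URL}{place_id}"

print(place_url(88726))  # → https://park4night.com/en/place/88726
```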
The web scraping uses requests and BeautifulSoup. The scraped data is saved in Parquet format. You can select how many pages to scrape with the variable 'pages_to_scrape'. By default, the main program resumes at the last scraped place ID if a 'data_base.parquet' file already exists; otherwise, a new pandas DataFrame is created.
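A minimal sketch of this flow, assuming a `place_id` column in the Parquet file; the CSS selector and field names inside `scrape_place` are placeholders, not the real page structure:

```python
import os

import pandas as pd
import requests
from bs4 import BeautifulSoup

PARQUET_FILE = "data_base.parquet"
pages_to_scrape = 100  # how many place pages to fetch in one run

def next_place_id(parquet_file: str = PARQUET_FILE) -> int:
    """Resume one past the last scraped place ID, or start at 1
    if no Parquet file exists yet."""
    if os.path.exists(parquet_file):
        df = pd.read_parquet(parquet_file)
        if not df.empty:
            return int(df["place_id"].max()) + 1
    return 1

def scrape_place(place_id: int) -> dict:
    """Fetch one place page and extract a field or two.
    The <h1> selector is a hypothetical example."""
    resp = requests.get(f"https://park4night.com/en/place/{place_id}", timeout=10)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    title = soup.select_one("h1")
    return {"place_id": place_id,
            "title": title.get_text(strip=True) if title else None}

def main() -> None:
    """One scraping pass: resume, fetch, append, save."""
    start = next_place_id()
    rows = [scrape_place(pid) for pid in range(start, start + pages_to_scrape)]
    new_df = pd.DataFrame(rows)
    if os.path.exists(PARQUET_FILE):
        new_df = pd.concat([pd.read_parquet(PARQUET_FILE), new_df],
                           ignore_index=True)
    new_df.to_parquet(PARQUET_FILE)
```

Calling `main()` runs one scraping pass; re-running it picks up where the previous run stopped, which is what makes the 'data_base.parquet' resume behaviour work.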
The DataFrame looks like this:
After scraping the data, I needed to clean it up so that I could create a MySQL database. For this I created a Jupyter Notebook.
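A cleaning step of this kind might look as follows; the column names and cleaning rules here are illustrative assumptions, not the actual notebook contents:

```python
import pandas as pd

def clean_places(df: pd.DataFrame) -> pd.DataFrame:
    """Minimal cleaning sketch: drop duplicate place IDs, drop rows
    without a title, and normalise the ID column to integers."""
    out = df.drop_duplicates(subset="place_id").dropna(subset=["title"]).copy()
    out["place_id"] = out["place_id"].astype(int)
    return out

# Writing the cleaned frame into MySQL could then use SQLAlchemy, e.g.:
# from sqlalchemy import create_engine
# engine = create_engine("mysql+pymysql://user:password@localhost/park4night")
# clean_places(df).to_sql("places", engine, if_exists="replace", index=False)
```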
...to be continued