BenediktFranck/park4night_app

park4night App (Data Science Project)

Project Overview

  • This project is the result of my final thesis at the Data Science Institute by Fabian Rappert.
  • Scraped almost 300,000 places and over 2.7 million comments from park4night using Python.
  • Created a MySQL database.
  • Created a Streamlit app for searching and filtering places.
  • Performed a data analysis with Tableau on the scraped data.

Code and Resources Used

Python Version: 3.11
Packages: requests, BeautifulSoup, pandas

Web Scraping

Every place has a unique place ID and therefore its own web address, formed from the base URL https://park4night.com/en/place/ followed by the place ID. For example: https://park4night.com/en/place/88726 The following pictures show where to find the scraped data.

[Screenshot 2023-12-19 at 13:34:59] [Screenshot 2023-12-19 at 13:39:16]

The web scraping uses requests and BeautifulSoup, and the scraped data is saved in Parquet format. You can select how many pages to scrape with the variable 'pages_to_scrape'. By default, the main program starts at the last scraped place ID if a 'data_base.parquet' file already exists. If not, a new pandas DataFrame is created.
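The resume behaviour described above might look roughly like the sketch below. It rests on several assumptions that are not confirmed by the project: the column name `place_id`, the helper names `ids_to_scrape` and `load_or_create`, starting from ID 1 when no file exists, and continuing with the ID after the last scraped one.

```python
from pathlib import Path

DATA_FILE = Path("data_base.parquet")


def ids_to_scrape(last_scraped_id: int, pages_to_scrape: int) -> range:
    """Return the next `pages_to_scrape` place IDs after the last scraped one."""
    start = last_scraped_id + 1
    return range(start, start + pages_to_scrape)


def load_or_create(path: Path = DATA_FILE):
    """Load the existing Parquet file, or start with an empty DataFrame.

    Returns the DataFrame and the last scraped place ID (0 if none).
    Assumption: the IDs are stored in a column named 'place_id'.
    """
    # Third-party package (needs a Parquet engine such as pyarrow);
    # imported here so ids_to_scrape works without it.
    import pandas as pd

    if path.exists():
        df = pd.read_parquet(path)
        last_id = int(df["place_id"].max()) if not df.empty else 0
    else:
        df = pd.DataFrame()
        last_id = 0
    return df, last_id
```

With `pages_to_scrape = 3` and a last scraped ID of 100, `ids_to_scrape(100, 3)` covers the IDs 101, 102, and 103. After scraping, the updated DataFrame could be written back with `df.to_parquet(DATA_FILE)`.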

The DataFrame looks like this: [image: df]

Data Cleaning

After scraping the data, I needed to clean it up so that I could create a MySQL database. I created a Jupyter Notebook for this step.

...to be continued
