I do not own the rights to this photo
Date Started: 9/10/2023
- Using the Yelp API, I extracted Yelp data for pizza restaurants in Brooklyn and merged it with the health department's citywide restaurant inspection data.
- Using hypothesis testing, I investigated whether pizza restaurants in Brooklyn with better inspection scores are more likely to receive better reviews on Yelp.
- I used multiple visualization techniques to represent various aspects of Brooklyn pizza restaurants.
https://catalog.data.gov/dataset/dohmh-new-york-city-restaurant-inspection-results
YELP API
- https://www.nyc.gov/site/doh/business/food-operators/letter-grading-for-restaurants.page
- https://www.nyc.gov/assets/doh/downloads/pdf/rii/restaurant-grading-faq.pdf
This dataset includes NYC restaurant inspection data for the three years prior to the most recent inspection. Only restaurants that are active as of the extraction date are included, so restaurants that have gone out of business do not appear. Letter-grade inspections were paused from March 17, 2020, until July 19, 2021, due to the COVID-19 public health emergency; modified restaurant inspections took place during that time. Restaurants are uniquely identified by their CAMIS number. Establishments with an inspection date of 1/1/1900 are new establishments that have not yet been inspected, and they are excluded from parts of the project.
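As a minimal sketch of the 1/1/1900 exclusion described above, the rows can be filtered on the sentinel date before any analysis. The column names `CAMIS` and `INSPECTION DATE`, the `MM/DD/YYYY` date format, and the sample rows are assumptions for illustration, not a record of the code used in this project.

```python
from datetime import date, datetime

# Sentinel date marking establishments that have not yet been inspected.
NOT_YET_INSPECTED = date(1900, 1, 1)


def parse_inspection_date(text):
    """Parse an MM/DD/YYYY string (assumed format) into a date."""
    return datetime.strptime(text, "%m/%d/%Y").date()


def drop_uninspected(rows):
    """Keep only rows whose inspection date is not the 1900 sentinel."""
    return [
        row for row in rows
        if parse_inspection_date(row["INSPECTION DATE"]) != NOT_YET_INSPECTED
    ]


# Hypothetical sample rows, not real records.
sample = [
    {"CAMIS": "1", "INSPECTION DATE": "01/01/1900"},
    {"CAMIS": "2", "INSPECTION DATE": "03/15/2023"},
]
inspected = drop_uninspected(sample)
```

Comparing parsed dates rather than raw strings means both `1/1/1900` and `01/01/1900` are caught.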
I only included YELP data for pizza restaurants in Brooklyn, NY. This kept the dataset small and focused the search for a correlation between inspection scores and ratings.
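A request for this kind of Yelp data might be built roughly as follows. This is a sketch only: it assumes the Yelp Fusion business search endpoint with a bearer-token header, and the search term, location string, page size, and `YOUR_API_KEY` placeholder are illustrative choices rather than the exact query used here. The request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Yelp Fusion business search endpoint (assumed to be the one used).
YELP_SEARCH_URL = "https://api.yelp.com/v3/businesses/search"


def build_search_request(api_key, offset=0, limit=50):
    """Build (but do not send) a search request for Brooklyn pizza places."""
    params = {
        "term": "pizza",             # assumed search term
        "location": "Brooklyn, NY",  # assumed location string
        "limit": limit,              # results per page
        "offset": offset,            # index of the first result on this page
    }
    url = f"{YELP_SEARCH_URL}?{urlencode(params)}"
    return Request(url, headers={"Authorization": f"Bearer {api_key}"})


# Placeholder key; a real key would come from a Yelp developer account.
request = build_search_request("YOUR_API_KEY")
```

Paging through results would mean calling `build_search_request` again with a larger `offset` and sending each request with `urllib.request.urlopen`.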
-
I used feature engineering to create several new columns in the dataset, which gave a more in-depth look at the data.
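One example of the kind of column that could be engineered is a letter-grade band derived from the numeric score. The sketch below uses the health department's published bands (0 to 13 is an A, 14 to 27 is a B, 28 or more is a C). The function name and sample scores are hypothetical, and this is not necessarily one of the columns created in this project.

```python
def score_to_grade_band(score):
    """Map an inspection score to its letter-grade band (lower is better)."""
    if score is None:
        return None  # no score recorded for this inspection
    if score <= 13:
        return "A"
    if score <= 27:
        return "B"
    return "C"


# Hypothetical scores, including the band boundaries and a missing value.
sample_scores = [7, 13, 14, 27, 28, None]
bands = [score_to_grade_band(score) for score in sample_scores]
```

A band derived this way can differ from the grade a restaurant actually posts, since posted grades also depend on re-inspections.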
-
I used hypothesis testing to find out whether better inspection scores are associated with better ratings on Yelp. I used Z-scores to identify outliers and removed them before performing three different hypothesis tests. None of the tests found a significant correlation.
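The outlier step and one possible test can be sketched as below. The three tests used in this project are not named here, so the sketch swaps in a permutation test on Pearson's r as a stand-in. The cutoff of 3 standard deviations, the function names, and the sample data are all assumptions made for illustration.

```python
import random
from statistics import mean, pstdev


def drop_outliers(pairs, threshold=3.0):
    """Drop (score, rating) pairs where either value has |z| > threshold."""
    scores = [s for s, _ in pairs]
    ratings = [r for _, r in pairs]
    mu_s, sd_s = mean(scores), pstdev(scores)
    mu_r, sd_r = mean(ratings), pstdev(ratings)

    def z(x, mu, sd):
        return 0.0 if sd == 0 else (x - mu) / sd

    return [
        (s, r) for s, r in pairs
        if abs(z(s, mu_s, sd_s)) <= threshold and abs(z(r, mu_r, sd_r)) <= threshold
    ]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Two-sided p-value for r under the null of no association."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any pairing between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Made-up (inspection score, Yelp rating) pairs; 150 is a planted outlier.
pairs = [(8, 4.0), (10, 3.5), (12, 4.5), (9, 4.0), (11, 3.0), (13, 4.0),
         (10, 4.5), (12, 3.5), (9, 3.0), (11, 4.0), (10, 3.5), (150, 4.0)]
cleaned = drop_outliers(pairs)
scores = [s for s, _ in cleaned]
ratings = [r for _, r in cleaned]
r = pearson_r(scores, ratings)
p = permutation_p_value(scores, ratings)
```

A p-value above the chosen significance level (commonly 0.05) would be read as no significant correlation.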
-
I built a range of visualizations with matplotlib, seaborn, and pandas. I also included a geospatial visualization that sums up the majority of my findings.
- The majority of the restaurants scored between 5 and 30, which means they're able to remain open after a successful initial inspection.
- A lower number score indicates a better overall performance in the inspection.
- This shows that Brooklyn and Manhattan have the best inspection scores for pizza restaurants compared to the other boroughs.
- This shows that Brooklyn has the second-highest number of restaurants, with Manhattan having the most.
- This contingency table shows how many restaurants each borough had in each grade.
- Brooklyn had the second highest number of A's and B's.
- The lower the number for the score, the better the restaurant performed on their inspection.
- Brooklyn's average is actually the worst overall among all the boroughs.
- As shown above, more reviews do not always mean better ratings on YELP.
- This visualization shows that the majority of pizza restaurants in Brooklyn are rated between 3.5 and 4.5.
- This visualization shows the review counts for Pizza restaurants in Brooklyn with at least 200 reviews.
- The link below contains an interactive geospatial visualization with information about every pizza restaurant in Brooklyn.
This project focused on pizza restaurants in Brooklyn, NY, using health inspection data and YELP data to compare their performance.