Big-Data-Engineering

This repository contains the homework assignments and other material for the Big Data Engineering course (AY 2021/22) at the University of Naples Federico II.

Homeworks

All homework assignments were developed in a team of two.

  • Homework1-MongoDB: design and development of a NoSQL database, using MongoDB Compass, to store the Yelp Dataset collections.
  • Homework2-ApachePig: processing of the Yelp Dataset Reviews collection with Apache Pig, written in Pig Latin.
  • Homework3-ApacheSpark: distributed processing with Spark (PySpark), run on Google Colab, to analyze the Yelp Dataset collections.
  • HomeworkFinale-KPMG: data analysis of the MIUR and ISTAT open datasets on university student enrollment, using Python to pre-process the data, MongoDB to store it, and Apache Spark to analyze it.

KPMG Hackathon

  • KPMG-UniversityTrends: processing of the MIUR and ISTAT open datasets in Python and an analysis of university trends with Pandas DataFrames, plus a set of dashboards built in Microsoft Power BI.