This repository hosts the two assignments for the Data Management exam, held by Professors Domenico Lembo and Riccardo Rosati, as part of the Master’s degree in Data Science at Sapienza University of Rome.
I worked on the assignments together with Laura Thoft Cesario (@laurathoft).
The Data Management exam comprised two homework assignments focused on database construction. Initially, we were required to build a database using a relational database system. Subsequently, the task was to replicate the database using a non-relational system.
In the first assignment, we were tasked with creating 10 queries to extract information from the data and improving the execution time of the slowest queries.
The second assignment involved recreating the database built in the relational DBMS in a non-relational system. We were then asked to rewrite the queries from the first assignment in this new environment and obtain similar results.
Detailed guidelines and rules for the assignments are available in the PDF "Homework assignments and rules (DM)", which is included in this repository.
For both assignments, we used the same dataset, sourced from Kaggle, of historical records from the Olympic Games, covering all events from Athens 1896 to Beijing 2022. The dataset includes details on over 21,000 medals, 162,000 results, 74,000 athletes, 20,000 biographies, and 53 host cities of the Summer and Winter Olympic Games.
For the first assignment, which required the use of a relational database, we chose MySQL. For the second assignment, which involved a non-relational database, we opted for Neo4j, a graph-based database.
In the respective assignment folders, HM1 and HM2, you will find all the necessary data and code to replicate the projects. Each folder also includes detailed commentary on the queries and the database creation processes for both of the DBMSs we used.
To get an idea of what the Olympic database looks like in Neo4j, see the image below:
These assignments received a perfect score of 30 cum laude out of 30 on the final exam. Feel free to use them as a reference if you are planning to take the exam in the coming years.
Please do not hesitate to contact me if you need further explanations or encounter any issues with the materials.