
This repository hosts the final project for the Data Management exam, as part of the Master’s degree in Data Science at Sapienza University of Rome


Livia020799/Data-Management


Data-Management

This repository hosts the two assignments for the Data Management exam, held by Professors Domenico Lembo and Riccardo Rosati, as part of the Master’s degree in Data Science at Sapienza University of Rome.

I worked on the assignments together with Laura Thoft Cesario (@laurathoft).


Exam Structure

The Data Management exam consisted of two homework assignments on database construction: the first required building a database in a relational database system, and the second required replicating it in a non-relational system.
In the first assignment, we wrote 10 queries to extract information from the data and improved the execution time of the slowest ones.
In the second assignment, we recreated the same database in a non-relational system and rewrote the queries from the first assignment to obtain similar results.
Detailed guidelines and rules for the assignments are available in the PDF Homework assignments and rules (DM), which is uploaded in the repository.
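
As a rough illustration of the "improve the slowest queries" step, the sketch below shows one common technique, adding an index, and checks its effect through the query plan. It is not the approach documented in HM1: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the `results` table, its columns, the index name, and the generated rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (athlete_id INTEGER, event TEXT, medal TEXT)")

# Hypothetical sample rows, not taken from the real Olympic dataset.
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [(i % 1000, f"event_{i % 50}", "GOLD" if i % 10 == 0 else None)
     for i in range(20000)],
)

QUERY = "SELECT COUNT(*) FROM results WHERE athlete_id = ?"

def plan(sql, params):
    """Return the plan description SQLite reports for the statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row is (id, parent, notused, detail); keep only the detail text.
    return " ".join(row[3] for row in rows)

plan_before = plan(QUERY, (42,))   # full table scan, no index available
conn.execute("CREATE INDEX idx_results_athlete ON results (athlete_id)")
plan_after = plan(QUERY, (42,))    # lookup through the new index

count = conn.execute(QUERY, (42,)).fetchone()[0]
print(plan_before)
print(plan_after)
print(count)
```

In MySQL the equivalent check would use `EXPLAIN` on the slow query before and after `CREATE INDEX`; whether an index is the right fix depends on the query and the data.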


Assignments Overview

For both assignments, we used the same dataset, sourced from Kaggle, of historical records from the Olympic Games, covering all events from Athens 1896 to Beijing 2022. It includes details on over 21,000 medals, 162,000 results, 74,000 athletes, 20,000 biographies, and 53 host cities of the Summer and Winter Olympic Games.

For the first assignment, which required the use of a relational database, we chose MySQL. For the second assignment, which involved a non-relational database, we opted for Neo4j, a graph-based database.
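
To make the difference between the two models concrete, here is a minimal sketch in plain Python of how the same question can be answered by a key-based join (relational view) or by a one-hop traversal (graph view). The tables, node labels, and rows are hypothetical and do not reflect the actual schema in HM1 or the graph model in HM2.

```python
# Hypothetical relational rows (not taken from the real dataset).
athletes = [(1, "Athlete A"), (2, "Athlete B")]            # (athlete_id, name)
events = [(10, "100m"), (11, "Long jump")]                 # (event_id, name)
results = [(1, 10, "GOLD"), (1, 11, None), (2, 10, "SILVER")]  # (athlete_id, event_id, medal)

# Relational view: "which events did athlete 1 enter?" is a join on keys.
event_names = dict(events)
joined = [event_names[e] for a, e, _ in results if a == 1]

# Graph view: each row becomes a node, each foreign-key pair becomes an edge
# that carries the remaining columns as properties.
nodes = {("Athlete", i): {"name": n} for i, n in athletes}
nodes.update({("Event", i): {"name": n} for i, n in events})
edges = {}
for a, e, medal in results:
    edges.setdefault(("Athlete", a), []).append((("Event", e), {"medal": medal}))

# The same question is now a one-hop traversal from the athlete node.
traversed = [nodes[target]["name"] for target, _ in edges[("Athlete", 1)]]

print(joined)
print(traversed)
```

In a graph database such as Neo4j, the edge list corresponds to relationships stored alongside the nodes, so this kind of lookup does not need a join table.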

In the respective assignment folders, HM1 and HM2, you will find all the necessary data and code to replicate the projects. Each folder also includes detailed commentary on the queries and the database creation processes for both of the DBMSs we used.


Visualization

To get an idea of what the Olympic database looks like in Neo4j, see the image below:

[image]


Exam Score and Project Usage

These assignments received a perfect score of 30 cum laude out of 30 on the final exam. Feel free to use them as a reference if you are planning to take the exam in the upcoming years.
Please do not hesitate to contact me if you need further explanations or encounter any issues with the materials.
