Data Analysis & Manipulation With Pandas

Data analysis with Pandas, Numpy, Matplotlib & Seaborn.

This project analyses a publicly available movie dataset (https://www.kaggle.com/beyjin/movies-1990-to-2017) with Python tools such as Pandas. It first gathers some initial insights about the data, then cleans and transforms it, and finally saves a new version of the dataset in a structure better suited to storing the data in a database.

Index:

Introduction
1. initial_insights.ipynb
2. clean_datasets.ipynb
3. cleaned_datasets_grouped.ipynb
4. Raw & Cleaned Datasets
Information
Maintainer

Introduction

There are 3 notebooks, which are best read in this exact order:

initial_insights.ipynb

We take a first look at the raw datasets and look for insights that help us understand the data we will be processing. This also gives an overview of how the datasets should be structured, as if we were going to store the data in a database.

Note: insights and conclusions can be found in the Jupyter notebook itself.
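
As a rough illustration of what such a first look can involve, here is a minimal sketch. It is not taken from the notebook: the inline CSV and its column names (`title`, `year`, `genres`, `rating`) are hypothetical stand-ins for the real files in original_datasets/.

```python
import io

import pandas as pd

# Hypothetical stand-in for one raw CSV file; the real column names may differ.
raw_csv = io.StringIO(
    "title,year,genres,rating\n"
    "Movie A,1995,Drama|Crime,7.9\n"
    "Movie B,2003,Comedy,\n"
    "Movie C,2010,Action|Sci-Fi,8.1\n"
)
df = pd.read_csv(raw_csv)

# Shape, dtypes and missing values give a quick picture of what needs cleaning.
n_rows, n_cols = df.shape
missing_per_column = df.isna().sum()

print(n_rows, n_cols)
print(df.dtypes)
print(missing_per_column)

# Summary statistics for the numeric columns.
print(df.describe())
```

Multi-valued columns such as the assumed `genres` above are the kind of finding that suggests splitting the data into several tables later on.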

clean_datasets.ipynb

Here we go through the whole cleaning process: standardizing the data types, extracting columns that belong in a separate dataset, and saving the cleaned datasets.

Note: Target Database Schema
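
The cleaning steps described above could look roughly like the following sketch. The table layout and all names (`movie_id`, `title`, `year`, `genres`) are assumptions made for illustration and do not reflect the notebook's actual schema.

```python
import pandas as pd

# Hypothetical raw rows; column names are illustrative only.
raw = pd.DataFrame({
    "movie_id": [1, 2, 3],
    "title": [" Movie A", "Movie B ", "Movie C"],
    "year": ["1995", "2003", "n/a"],
    "genres": ["Drama|Crime", "Comedy", "Action|Sci-Fi"],
})

# Standardize data types: trim text, turn years into nullable integers.
movies = raw[["movie_id", "title", "year"]].copy()
movies["title"] = movies["title"].str.strip()
# Values that cannot be parsed become <NA> instead of raising an error.
movies["year"] = pd.to_numeric(movies["year"], errors="coerce").astype("Int64")

# Extract the multi-valued column into its own dataset,
# with one row per (movie, genre) pair.
genres = (
    raw[["movie_id", "genres"]]
    .assign(genre=lambda d: d["genres"].str.split("|"))
    .explode("genre")
    .drop(columns="genres")
    .reset_index(drop=True)
)

# Passing a path to to_csv would write a file; without one it returns text.
movies_csv = movies.to_csv(index=False)
print(movies_csv)
print(genres)
```

Keeping one value per row in the extracted dataset mirrors how the data would be laid out in a relational database table.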

cleaned_datasets_grouped.ipynb

Here we take the cleaned datasets and join them all together into one single, large dataset.
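
One possible way to do such a join with Pandas is sketched below. The two small tables and the key `movie_id` are hypothetical, and the notebook may combine its datasets differently.

```python
import pandas as pd

# Hypothetical cleaned datasets sharing the key "movie_id".
movies = pd.DataFrame({
    "movie_id": [1, 2],
    "title": ["Movie A", "Movie B"],
    "year": [1995, 2003],
})
genres = pd.DataFrame({
    "movie_id": [1, 1, 2],
    "genre": ["Drama", "Crime", "Comedy"],
})

# Collapse genres to one row per movie so the join does not duplicate movies.
genres_per_movie = (
    genres.groupby("movie_id")["genre"].agg(", ".join).reset_index()
)

# A left join keeps every movie, even one without any genre rows;
# validate raises if the key is unexpectedly duplicated on either side.
grouped = movies.merge(
    genres_per_movie, on="movie_id", how="left", validate="one_to_one"
)
print(grouped)
```

Aggregating before merging is a design choice made for this sketch: it keeps one row per movie, at the cost of packing several genres into a single text column.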

Raw & Cleaned Datasets

- The original (raw) datasets are located in the folder original_datasets/
- The generated (cleaned) output datasets are written to the folder output/

Information:

Technology Stack
Python		Language
Pandas		Data Analysis & Manipulation
Numpy		Data Computing
Matplotlib		Data Visualization
Seaborn		Data Visualization

Maintainer

Get in touch --> fantaso
