This project was completed as part of Udacity's Data Analyst Nanodegree Program certification in October 2020.
I am going to analyze the TMDb Movies Dataset. The dataset, originally from Kaggle, was cleaned and provided by Udacity. It is a collection of more than 5,000 movies, with data on release year, genres, cast, directors, runtimes, budgets, revenues, and production companies. The dataset covers movies released from 1960 to 2015.
- Python, Numpy, Pandas, Matplotlib and Seaborn
- Jupyter Notebook
I am going to answer the following questions:

1. Which movie genres are most popular from year to year?
2. What are the 10 most popular movies between 1960 and 2015?
3. What properties are associated with high-revenue movies?
4. How has the annual profitability of movies changed over time, and which factor contributes most to annual profitability?
5. How much money does a movie make on average?
6. Have movies become shorter or longer over the years?
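
The questions about annual profitability and runtime trends both come down to grouping movies by release year and aggregating. The following is a minimal sketch of that idea in Pandas, not the notebook's actual code. The column names (`release_year`, `runtime`, `budget`, `revenue`), the helper `yearly_summary`, and the toy rows are all assumptions made for illustration; the real dataset may name or format these fields differently.

```python
import pandas as pd

# Toy rows standing in for the real dataset; the values are made up.
# Column names are assumed and may differ from the actual CSV.
movies = pd.DataFrame({
    "release_year": [1960, 1960, 2015, 2015],
    "runtime": [120, 100, 95, 105],
    "budget": [1_000_000, 2_000_000, 50_000_000, 30_000_000],
    "revenue": [3_000_000, 2_500_000, 120_000_000, 20_000_000],
})


def yearly_summary(df):
    """Return mean runtime and total profit per release year (hypothetical helper)."""
    # Profit is taken here as revenue minus budget, one simple definition.
    with_profit = df.assign(profit=df["revenue"] - df["budget"])
    return with_profit.groupby("release_year").agg(
        mean_runtime=("runtime", "mean"),
        total_profit=("profit", "sum"),
    )


summary = yearly_summary(movies)
print(summary)
```

Plotting `summary["mean_runtime"]` or `summary["total_profit"]` against the index with Matplotlib would then show each trend over time.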