subalasingh/Exploratory-Data-Analysis

🌟 Exploratory-Data-Analysis

The purpose of this project is to practice exploratory data analysis (EDA) on different datasets.

🎯 Goals of the Project:

  1. Explore datasets with the Pandas library.
  2. Visualize the datasets with various plot types.

📚 Projects

The dataset comes from Kaggle (IMDB data from 2006 to 2016) and contains information about 1,000 movies collected from the Internet Movie Database (IMDb), including rating, revenue, year, runtime and genres. This analysis explores the IMDB movie dataset by working through a set of questions that are likely to be relevant when analyzing movie data.
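As a minimal sketch of the first inspection and cleaning steps, the snippet below loads a few rows into a DataFrame and checks shape, missing values and duplicates. The rows are made-up placeholders, not real IMDb values, and the column names (`Runtime (Minutes)`, `Revenue (Millions)` and so on) are an assumption about the Kaggle CSV layout; with the real file, `pd.read_csv` would be pointed at the downloaded CSV instead of the inline text.

```python
import io

import pandas as pd

# Hypothetical toy rows standing in for the Kaggle CSV (values are made up).
# Column names are assumed to match the real file.
CSV_TEXT = """Title,Genre,Director,Year,Runtime (Minutes),Rating,Votes,Revenue (Millions)
Movie A,"Action,Adventure",Director X,2014,120,8.0,900,300.0
Movie B,Drama,Director Y,2016,185,7.0,100,
Movie C,"Action,Comedy",Director X,2016,95,5.0,300,50.0
Movie C,"Action,Comedy",Director X,2016,95,5.0,300,50.0
"""

# With the real dataset this would read the downloaded CSV file instead.
data = pd.read_csv(io.StringIO(CSV_TEXT))

# First and last rows (at most 10 each).
print(data.head(10))
print(data.tail(10))

# Number of rows and columns.
n_rows, n_cols = data.shape

# Column dtypes, non-null counts and memory usage.
data.info()

# Missing values per column and number of fully duplicated rows.
missing_per_column = data.isnull().sum()
n_duplicates = int(data.duplicated().sum())

# Drop rows with any missing value, then drop exact duplicates.
clean = data.dropna().drop_duplicates()

# Summary statistics for the numeric columns.
stats = clean.describe()
print(stats)
```

Dropping every row with a missing value is the simplest option; whether it is appropriate depends on how many rows the real dataset would lose.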

🛠️ Questions:

  1. Display Top 10 Rows of The Dataset
  2. Check Last 10 Rows of The Dataset
  3. Find Shape of Our Dataset (Number of Rows And Number of Columns)
  4. Get Information About Our Dataset, Such As Total Number of Rows, Total Number of Columns, Datatypes of Each Column And Memory Requirement
  5. Check Missing Values In The Dataset
  6. Drop All The Missing Values
  7. Check For Duplicate Data
  8. Get Overall Statistics About The DataFrame
  9. Display Titles of The Movies Having Runtime Greater Than or Equal to 180 Minutes
  10. Which Year Had The Highest Average Voting?
  11. Which Year Had The Highest Average Revenue?
  12. Find The Average Rating For Each Director
  13. Display Titles And Runtimes of The Top 10 Longest Movies
  14. Display Number of Movies Per Year
  15. Find Most Popular Movie Title (Highest Revenue)
  16. Display Top 10 Highest Rated Movie Titles And Their Directors
  17. Display Top 10 Highest Revenue Movie Titles
  18. Find Average Rating of Movies Year Wise
  19. Does Rating Affect The Revenue?
  20. Classify Movies Based on Ratings [Excellent, Good, and Average]
  21. Count Number of Action Movies
  22. Find Unique Values From Genre
  23. How Many Films of Each Genre Were Made?
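Several of the questions above reduce to a filter, a `groupby`, or a string operation in Pandas. The sketch below shows one possible approach on a small, made-up DataFrame; the rows are placeholders rather than real IMDb values, the column names are assumed to match the Kaggle CSV, and the rating thresholds used for the Excellent/Good/Average labels are an illustrative choice, not necessarily the ones used in this project.

```python
import pandas as pd

# Hypothetical toy rows (not real IMDb values); column names are assumed.
movies = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "Genre": ["Action,Adventure", "Drama", "Action,Comedy", "Comedy"],
    "Director": ["Director X", "Director Y", "Director X", "Director Z"],
    "Year": [2014, 2016, 2016, 2015],
    "Runtime (Minutes)": [120, 185, 95, 100],
    "Rating": [8.0, 7.0, 5.0, 6.5],
    "Votes": [900, 100, 300, 200],
    "Revenue (Millions)": [300.0, 20.0, 50.0, 10.0],
})

# Q9: titles of movies with runtime >= 180 minutes.
long_titles = movies.loc[movies["Runtime (Minutes)"] >= 180, "Title"].tolist()

# Q10: year with the highest average number of votes.
best_votes_year = movies.groupby("Year")["Votes"].mean().idxmax()

# Q12: average rating for each director, highest first.
rating_by_director = (
    movies.groupby("Director")["Rating"].mean().sort_values(ascending=False)
)

# Q14: number of movies per year, ordered by year.
movies_per_year = movies["Year"].value_counts().sort_index()

# Q15: title of the movie with the highest revenue.
top_revenue_title = movies.loc[movies["Revenue (Millions)"].idxmax(), "Title"]

# Q16: up to 10 highest rated movies with their directors.
top_rated = movies.nlargest(10, "Rating")[["Title", "Director"]]

# Q19: linear correlation between rating and revenue
# (a correlation alone does not show that rating affects revenue).
rating_revenue_corr = movies["Rating"].corr(movies["Revenue (Millions)"])


# Q20: label each movie by rating; the thresholds are assumptions.
def rating_label(rating):
    if rating >= 7.0:
        return "Excellent"
    if rating >= 6.0:
        return "Good"
    return "Average"


movies["Rating Category"] = movies["Rating"].apply(rating_label)

# Q21: number of movies whose genre string mentions "Action".
action_count = int(movies["Genre"].str.contains("Action", case=False).sum())

# Q22 and Q23: split the comma-separated genres, then count each genre.
genre_counts = movies["Genre"].str.split(",").explode().str.strip().value_counts()
unique_genres = sorted(genre_counts.index)
```

Splitting the `Genre` column before counting matters because a movie can belong to several genres; counting the raw strings would treat each genre combination as a separate category.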

📃 Used Libraries:

  1. Pandas
  2. Matplotlib
  3. Seaborn
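To illustrate how these libraries fit together, here is a minimal sketch that aggregates with Pandas and draws a bar chart of the average rating per year with Matplotlib. The rows are made-up placeholders, the column names are assumptions about the dataset, and the `Agg` backend is chosen only so the example runs without a display; Seaborn's `barplot` could draw a similar chart.

```python
import io

import matplotlib

matplotlib.use("Agg")  # headless backend, so no window is needed

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy rows (not real IMDb values); column names are assumed.
movies = pd.DataFrame({
    "Year": [2014, 2015, 2016, 2016],
    "Rating": [8.0, 6.5, 7.0, 5.0],
})

# Average rating for each year, ordered by year.
avg_rating = movies.groupby("Year")["Rating"].mean().sort_index()

# One bar per year.
fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(avg_rating.index.astype(str), avg_rating.values)
ax.set_xlabel("Year")
ax.set_ylabel("Average rating")
ax.set_title("Average rating per year")
fig.tight_layout()

# Render to an in-memory PNG; fig.savefig("some_file.png") would write a file instead.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```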

© 2021 Subala Singh

About

EDA Projects using Python and Pandas Framework
