Merge pull request #527 from Bayyana-kiran/main
The Simpsons Episodes Dataset Analysis
Showing 17 changed files with 12,551 additions and 0 deletions.
Simpsons Episodes Analysis/Dataset/simpsons_characters.csv (6,723 additions, 0 deletions; large diff not rendered)
Simpsons Episodes Analysis/Dataset/simpsons_episodes.csv (601 additions, 0 deletions; large diff not rendered)
Simpsons Episodes Analysis/Dataset/simpsons_locations.csv (4,460 additions, 0 deletions; large diff not rendered)
Binary file not shown.
Simpsons Episodes Analysis/Images/Average IMDb ratings per season.png (binary file added, +41.1 KB)
Simpsons Episodes Analysis/Images/Bottom Episodes based on IMDb rating.png (binary file added, +42 KB)
Simpsons Episodes Analysis/Images/Top Episodes based on IMDb rating.png (binary file added, +44.9 KB)
Simpsons Episodes Analysis/Model/Simpsons_Analysis.ipynb (666 additions, 0 deletions; large diff not rendered)
@@ -0,0 +1,101 @@
# Data Analysis Report: The Simpsons Dataset

## Introduction
This report presents an analysis of The Simpsons dataset, which covers characters, episodes, locations, and script lines. The analysis looks at the most frequent speakers, the gender distribution of dialogue, episode counts and ratings, viewership trends, popular locations, and the sentiment of spoken lines.

## Data Loading and Cleaning
The datasets were loaded into Pandas DataFrames, and basic cleaning steps were performed to handle encoding issues. The following datasets were used:
- simpsons_characters
- simpsons_episodes
- simpsons_locations
- simpsons_script_lines

Datasets: https://www.kaggle.com/datasets/thedevastator/the-simpsons-episodes-dataset

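The loading step could look roughly like the sketch below. It assumes the encoding issues can be handled by falling back from UTF-8 to Latin-1. The helper name `load_csv`, the fallback order, and the file paths in the comments are illustrative assumptions, not taken from the notebook.

```python
import pandas as pd


def load_csv(path, encodings=("utf-8", "latin-1")):
    """Read a CSV, trying each encoding in turn until one decodes cleanly."""
    last_error = None
    for encoding in encodings:
        try:
            return pd.read_csv(path, encoding=encoding)
        except UnicodeDecodeError as error:
            last_error = error  # remember the failure and try the next encoding
    raise last_error


# Hypothetical usage; the paths are assumptions about the local layout.
# characters = load_csv("Dataset/simpsons_characters.csv")
# episodes = load_csv("Dataset/simpsons_episodes.csv")
```

Latin-1 accepts any byte sequence, so placing it last means the fallback always yields a DataFrame, though accented characters may come out wrong if the file actually uses a different encoding.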
## Character Analysis

### Top Speaking Characters
The top characters were identified and visualized by the number of lines they speak.

![Top Speaking Characters](https://github.com/Bayyana-kiran/sdf/assets/99533113/60bd9db2-8bc4-4ffe-b333-75fcd6d8c2f0)

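A minimal sketch of ranking characters by spoken lines. It assumes the script lines DataFrame has a `raw_character_text` column holding the speaker's name; that column name and the function `top_speakers` are assumptions for illustration.

```python
import pandas as pd


def top_speakers(script_lines, n=10, column="raw_character_text"):
    """Return the n most frequent speakers, most frequent first."""
    # value_counts() ignores missing speaker names by default.
    return script_lines[column].value_counts().head(n)


# Tiny made-up sample standing in for simpsons_script_lines.
sample = pd.DataFrame(
    {"raw_character_text": ["Homer Simpson", "Bart Simpson", "Homer Simpson", None]}
)
print(top_speakers(sample, n=2))
```

The resulting Series can be plotted directly, for example as a horizontal bar chart.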
### Gender Distribution of Speaking Characters
The distribution of lines spoken by male and female characters was explored.

![Gender Distribution](https://github.com/Bayyana-kiran/sdf/assets/99533113/3f43f0b7-53c3-4a11-ab76-69044cbfab55)

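Counting lines by gender requires joining the script lines to the characters table. The sketch below assumes a `character_id` column on the script lines and `id` and `gender` columns on the characters table. These names and the `lines_by_gender` helper are assumptions, not confirmed by the report.

```python
import pandas as pd


def lines_by_gender(script_lines, characters):
    """Count spoken lines per character gender."""
    merged = script_lines.merge(
        characters[["id", "gender"]],
        left_on="character_id",
        right_on="id",
        how="left",  # keep lines whose speaker is missing from the table
    )
    # Speakers without a recorded gender get an explicit label.
    return merged["gender"].fillna("unknown").value_counts()


# Made-up sample data for illustration.
lines = pd.DataFrame({"character_id": [1, 1, 2, 3]})
chars = pd.DataFrame({"id": [1, 2, 3], "gender": ["m", "f", None]})
print(lines_by_gender(lines, chars))
```

Keeping an explicit "unknown" bucket shows how much of the dialogue cannot be attributed to either gender, which matters when reading the chart.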
### Character Lines Pie Chart
The percentage of lines spoken by each character was visualized in a pie chart.

![Character Lines Pie Chart](https://github.com/Bayyana-kiran/sdf/assets/99533113/6fa8afb2-0529-40c1-ab50-d177ee9fab87)

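The percentages behind such a pie chart can be computed as in the sketch below. Grouping the long tail of minor characters into an "Other" slice is an assumption made here for readability; the report does not say how the notebook handled minor characters. The column name `raw_character_text` and the helper `line_shares` are also hypothetical.

```python
import pandas as pd


def line_shares(script_lines, top_n=5, column="raw_character_text"):
    """Percentage of lines per character, with the long tail grouped as 'Other'."""
    shares = script_lines[column].value_counts(normalize=True) * 100
    top = shares.head(top_n)
    other = shares.iloc[top_n:].sum()
    if other > 0:
        top = pd.concat([top, pd.Series({"Other": other})])
    return top.round(1)


# Made-up sample: A speaks 5 lines, B speaks 3, C and D one each.
sample_lines = pd.DataFrame(
    {"raw_character_text": ["A"] * 5 + ["B"] * 3 + ["C", "D"]}
)
print(line_shares(sample_lines, top_n=2))
```

The values and index of the returned Series can then serve as the sizes and labels of the pie chart.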
## Episode Analysis

### Season-wise Episode Count
The number of episodes in each season was visualized.

![Episode count per season](https://github.com/Bayyana-kiran/sdf/assets/99533113/632a943e-b57b-4abc-b0ea-1885c3693630)

### Seasonal Viewership Bar Chart
The average viewership per season was visualized as a bar chart.

![Seasonal Viewership](https://github.com/Bayyana-kiran/sdf/assets/99533113/7368c975-fa54-49a4-9ee5-8572ec98726c)

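The per-season episode counts and average viewership could both come from a single grouped aggregation, as in this sketch. The column names `season`, `id`, and `us_viewers_in_millions` are assumptions about the episodes table, and `season_summary` is a hypothetical helper.

```python
import pandas as pd


def season_summary(episodes):
    """Episode count and mean US viewership (millions) for each season."""
    return (
        episodes.groupby("season")
        .agg(
            episode_count=("id", "count"),
            avg_viewers_millions=("us_viewers_in_millions", "mean"),
        )
        .sort_index()
    )


# Made-up sample data for illustration.
sample_episodes = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "season": [1, 1, 2],
        "us_viewers_in_millions": [10.0, 20.0, 9.0],
    }
)
print(season_summary(sample_episodes))
```

The mean skips missing viewership values by default, so seasons with incomplete data are averaged over the episodes that do have figures.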
## Location Analysis

### Top Locations Across Episodes
The most popular locations were identified and visualized by the number of lines spoken at each.

![Top Locations across episodes](https://github.com/Bayyana-kiran/sdf/assets/99533113/d6fb1949-e18d-4fae-b95e-0017bc36faef)

## Text Analysis

### Word Cloud of Spoken Words
A word cloud was created to visualize the most frequently used words in the spoken lines.

![Word Cloud of spoken words](https://github.com/Bayyana-kiran/sdf/assets/99533113/90fbb2aa-7ca3-4a0c-8e14-592581db492a)

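A word cloud is driven by word frequencies. The sketch below shows one way to compute them with the standard library only. The stop-word list is deliberately short and purely illustrative, and `word_frequencies` is a hypothetical helper; the notebook may rely on a word-cloud library's own tokenizer instead.

```python
import re
from collections import Counter

# A deliberately short stop-word list for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "to", "of", "i", "you", "it", "is", "in", "that"}


def word_frequencies(lines, stop_words=STOP_WORDS):
    """Count lowercase word occurrences across an iterable of spoken lines."""
    counts = Counter()
    for line in lines:
        if not isinstance(line, str):
            continue  # skip missing values such as None or NaN
        words = re.findall(r"[a-z']+", line.lower())
        counts.update(w for w in words if w not in stop_words)
    return counts


# Made-up sample lines for illustration.
freqs = word_frequencies(["D'oh! The donut is mine.", "Mmm, donut.", None])
print(freqs.most_common(3))
```

Filtering stop words matters here because otherwise the cloud is dominated by function words that say little about the show.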
### Sentiment Analysis Over Time
The sentiment of the spoken words was analyzed over time.

![Sentiment Analysis over time](https://github.com/Bayyana-kiran/sdf/assets/99533113/c4390f94-6e7a-4871-a230-c477f6ac9e6c)

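The report does not say which sentiment tool was used or how time was bucketed. As a rough illustration of the idea only, the sketch below scores each line with a toy word lexicon and averages the scores per season. The lexicon, the per-season grouping, and both function names are assumptions; a real analysis would substitute a proper sentiment model.

```python
import re
from collections import defaultdict

# Toy lexicon for illustration; not a substitute for a real sentiment model.
POSITIVE = {"good", "great", "love", "happy", "excellent", "woo"}
NEGATIVE = {"bad", "hate", "sad", "terrible", "stupid", "awful"}


def line_sentiment(text):
    """Score in [-1, 1]: (positive - negative) / matched words, 0.0 if none match."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / (pos + neg) if pos + neg else 0.0


def sentiment_by_season(rows):
    """Average line sentiment per season from (season, spoken_words) pairs."""
    scores = defaultdict(list)
    for season, text in rows:
        if isinstance(text, str):  # skip lines with missing text
            scores[season].append(line_sentiment(text))
    return {season: sum(v) / len(v) for season, v in sorted(scores.items())}


# Made-up sample rows for illustration.
rows = [
    (1, "I love donuts"),
    (1, "This is terrible"),
    (2, "Great, great show"),
    (2, None),
]
print(sentiment_by_season(rows))
```

Averaging per season smooths out line-level noise, at the cost of hiding variation within a season.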
# Conclusion

## Top Characters
- Identified the leading characters by number of lines spoken.
- Explored the gender distribution of dialogue.

## Episode Insights
- Visualized episode counts per season.
- Analyzed average viewership per season.

## Location Highlights
- Identified the most popular locations across episodes.

## Text Analysis
- Built a word cloud of the most frequent spoken words.
- Tracked the sentiment of spoken lines over time.

## Summary
- Provided an overview of characters, episodes, locations, and dialogue.
- The dataset lays the groundwork for further exploration.

This analysis of The Simpsons dataset covered character dynamics, episode trends, and textual patterns. Identifying the top-speaking characters, exploring the gender distribution of dialogue, and visualizing episode counts and viewership trends gave an overview of the show's landscape. The analysis of popular locations, together with the word cloud and sentiment analysis of the spoken words, added depth. This report can serve as a starting point for further investigation of the dataset by researchers and enthusiasts alike.

---
## Contributor: Sai Kiran B L S

GitHub: [Sai Kiran B L S](https://github.com/Bayyana-kiran)