Movie data Visualization #282 (#311)
# Pull Request for PyVerse 💡

## Issue Title: Movie Data Visualization

- **Info about the related issue (Aim of the project)**: To understand
movie data by using data visualization.
- **Name**: Ananya Ravikiran Vastare
- **GitHub ID**: Ananya-vastare
- **Email ID**: ananyarvastare@gmail.com
- **Identify yourself**: GSSOC contributor 

Closes: #282 PR

### Describe the add-ons or changes you've made 📃

I created a new project for data visualization, focusing on enhancing
the understanding of movie datasets through various visualization
techniques.

## Type of change ☑️

What sort of change have you made:
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Code style update (formatting, local variables)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] This change requires a documentation update

## How Has This Been Tested? ⚙️

The application has been tested with various movie datasets to ensure
that visualizations are generated correctly. User interactions have been
validated to confirm that the application responds appropriately to
different inputs.

## Checklist: ☑️
- [x] My code follows the guidelines of this project.
- [x] I have performed a self-review of my own code.
- [x] I have commented my code, particularly wherever it was hard to
understand.
- [x] I have made corresponding changes to the documentation.
- [x] My changes generate no new warnings.
- [x] I have added things that prove my fix is effective or that my
feature works.
- [ ] Any dependent changes have been merged and published in downstream
modules.
UTSAVS26 authored Oct 10, 2024
2 parents a2eb8d8 + 7c558a2 commit 0ed4a7f
Showing 3 changed files with 141 additions and 0 deletions.
Binary file not shown.
53 changes: 53 additions & 0 deletions DataVizLearnig/Movie Data Visualization/Readme.md
@@ -0,0 +1,53 @@
# Movie Data Analysis Application

## Goal
To create an interactive application that analyzes movie data, including reviews and meta scores, and presents the information visually through various types of graphs.

## Description
This project implements a movie data analysis application using Streamlit, allowing users to explore movie ratings and trends through visualizations. Users can select from different graph types to visualize the movie dataset, facilitating a better understanding of the underlying data trends.

## What I Have Done
1. Developed an interactive user interface using Streamlit for visualizing movie data.
2. Implemented functions to create various visualizations: bar charts, pie charts, and histograms.
3. Enabled dynamic data loading from an Excel file containing movie ratings and metadata.
4. Organized code into functions for better clarity and maintainability.
5. Incorporated basic error handling for data input and visualization selection.
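The error handling mentioned in item 5 could, for instance, begin with a check that the loaded sheet contains the columns the charts read. Below is a minimal sketch in plain Python, assuming the column names used in `maincode.py` (`IMDB_Ratings`, `Genre`, `Meta_score`). The helper name `missing_columns` and the sample header are hypothetical and not part of this PR.

```python
# Columns read by the three chart functions in maincode.py
REQUIRED_COLUMNS = ("IMDB_Ratings", "Genre", "Meta_score")


def missing_columns(available, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent, in order."""
    present = set(available)
    return [name for name in required if name not in present]


# Example with a made-up header row that lacks the Meta_score column
header = ["Series_Title", "IMDB_Ratings", "Genre"]
print(missing_columns(header))  # ['Meta_score']
```

With pandas, `available` would be `data.columns`. A non-empty result could be reported with `st.error` before any chart is drawn.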

## Models Implemented
This project primarily focuses on:
- **Data Visualization** using Matplotlib and Seaborn libraries.
- **User Interaction** through a simple browser-based interface powered by Streamlit.
- **Data Processing** using Pandas for efficient manipulation of movie datasets.
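
The data processing behind the pie chart comes down to splitting comma-separated genre strings and counting every genre a movie lists. The sketch below shows that step in plain Python rather than pandas, so it illustrates the idea and is not the code in `maincode.py`. The function name `count_genres` and the sample rows are hypothetical.

```python
from collections import Counter


def count_genres(genre_cells):
    """Count each genre across cells such as 'Crime, Drama'."""
    counts = Counter()
    for cell in genre_cells:
        if not isinstance(cell, str):
            continue  # skip missing values
        counts.update(part.strip() for part in cell.split(",") if part.strip())
    return counts


# Made-up sample rows, not taken from MovieRatings.xlsx
sample = ["Crime, Drama", "Drama", "Action, Crime, Drama", None]
print(count_genres(sample).most_common())
# [('Drama', 3), ('Crime', 2), ('Action', 1)]
```

Because a movie with several genres contributes to several counts, the totals can exceed the number of movies, and the pie slices represent genre mentions rather than movies.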

## Libraries Needed
- `streamlit`
- `pandas`
- `matplotlib`
- `seaborn`
- `openpyxl` (if using Excel files)

## Usage
Run the Streamlit application using the following command:

```bash
streamlit run maincode.py
# or
python -m streamlit run maincode.py
```

Before running, make sure the Excel file path in `maincode.py` points to your copy of `MovieRatings.xlsx`.

## How to Use
- Start the application using the command above.
- Upon loading the application, users will see a title and a description of the project.
- The application loads the movie dataset from the Excel file path set in `maincode.py`.
- Choose a visualization type (Bar Chart, Pie Chart, Histogram) from the dropdown menu.
- Click the "Submit" button to generate the selected graph based on the movie data.
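
The last two steps amount to mapping the chosen label to a drawing function and calling it once "Submit" is clicked. Here is a minimal sketch of that dispatch, assuming the option labels from `maincode.py` (`Barchart`, `PieChart`, `Histogram`). The `render` helper and the stand-in handlers are hypothetical; the app itself uses an `if`/`elif` chain.

```python
def render(selected_option, handlers):
    """Call the handler registered for the selected option, if any."""
    handler = handlers.get(selected_option)
    if handler is None:
        return f"No valid option selected: {selected_option!r}"
    return handler()


# Stand-in handlers; in the app these would draw the charts
handlers = {
    "Barchart": lambda: "bar chart drawn",
    "PieChart": lambda: "pie chart drawn",
    "Histogram": lambda: "histogram drawn",
}

print(render("PieChart", handlers))  # pie chart drawn
print(render("Scatter", handlers))   # No valid option selected: 'Scatter'
```

A dictionary keeps the list of options and their handlers in one place, so adding a new chart type means adding one entry.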

## Conclusion
This Movie Data Analysis Application serves as an educational tool for understanding data visualization techniques and provides a practical introduction to using Streamlit for building interactive applications. Its design is tailored for beginners to explore data analysis concepts effectively while leveraging Python's powerful data processing libraries.

## License
This project is licensed under the MIT License. See the LICENSE file for more details.

## Contributing
Feel free to submit issues or pull requests if you have suggestions for improvements or new features!
88 changes: 88 additions & 0 deletions DataVizLearnig/Movie Data Visualization/maincode.py
@@ -0,0 +1,88 @@
from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import streamlit as st

# Set the title of the application
st.title("Movie Data Analysis")
st.write(
    "This project analyzes movie data such as reviews and meta scores "
    "and displays the results as graphs."
)

# Load the dataset from the Excel file stored next to this script
# (assumes MovieRatings.xlsx sits in the same folder as maincode.py)
DATA_PATH = Path(__file__).parent / "MovieRatings.xlsx"
data = pd.read_excel(DATA_PATH)

# Create a dropdown menu for selecting options
options = ["Barchart", "PieChart", "Histogram"]
selected_option = st.selectbox("Choose an option", options)


# Function to create and display a bar chart
def show_barchart():
    # Count movies per rating and order the bars by rating value
    ratings = data["IMDB_Ratings"].value_counts().sort_index()
    fig, ax = plt.subplots(figsize=(10, 6))
    cmap = plt.get_cmap("Blues")
    # One color per bar; start at 0.3 to skip the palest blues
    n = len(ratings)
    colors = [cmap(0.3 + 0.7 * i / max(n - 1, 1)) for i in range(n)]

    ax.bar(ratings.index.astype(str), ratings.values, width=0.5, color=colors)
    ax.set_xlabel("IMDB Ratings")
    ax.set_ylabel("Frequency")
    ax.set_title("Frequency of IMDB Ratings")
    st.pyplot(fig)


# Function to create and display a pie chart
def show_piechart():
    # Split multi-genre entries and count each genre separately,
    # without modifying the shared DataFrame
    genres = data["Genre"].str.split(", ").explode()
    genre_counts = genres.value_counts()

    # Create a gradient from dark pink to light pink by lowering the alpha
    n = len(genre_counts)
    colors = [(1, 0.08, 0.58, 1 - i / n) for i in range(n)]

    fig, ax = plt.subplots(figsize=(8, 6))
    ax.pie(
        genre_counts,
        colors=colors,
        autopct="%1.1f%%",
        labels=genre_counts.index,  # Add genre names as labels
        shadow=True,
        wedgeprops={"edgecolor": "black"},  # Black border between slices
    )
    ax.set_title("Genre Distribution of Top 1000 Movies")
    st.pyplot(fig)


# Function to create and display a histogram
def show_histogram():
    # Convert 'Meta_score' to numbers; unparseable entries become NaN
    values = pd.to_numeric(data["Meta_score"], errors="coerce")

    # Drop NaN values so they do not affect the plot
    values = values.dropna()

    # Create a histogram for Metascore with KDE
    fig, ax = plt.subplots(figsize=(10, 6))
    sns.histplot(
        values, bins=30, kde=True, color="orange", stat="density", alpha=0.5, ax=ax
    )
    ax.set_title("Histogram of Metascore with KDE")
    ax.set_xlabel("Metascore")
    ax.set_ylabel("Density")
    ax.grid(axis="y")

    st.pyplot(fig)


# Run the appropriate function based on user selection
if st.button("Submit"):
    if selected_option == "Barchart":
        show_barchart()
    elif selected_option == "PieChart":
        show_piechart()
    elif selected_option == "Histogram":
        show_histogram()
    else:
        st.error("No valid option selected.")
