Showing 3 changed files with 140 additions and 0 deletions.
Binary file not shown.
@@ -0,0 +1,53 @@
# Movie Data Analysis Application

## Goal
To create an interactive application that analyzes movie data, including reviews and meta scores, and presents the information visually through various types of graphs.

## Description
This project implements a movie data analysis application using Streamlit, allowing users to explore movie ratings and trends through visualizations. Users can select from different graph types to visualize the movie dataset, facilitating a better understanding of the underlying data trends.

## What I Have Done
1. Developed an interactive user interface using Streamlit for visualizing movie data.
2. Implemented functions to create various visualizations: bar charts, pie charts, and histograms.
3. Enabled dynamic data loading from an Excel file containing movie ratings and metadata.
4. Organized code into functions for better clarity and maintainability.
5. Incorporated basic error handling for data input and visualization selection.

## Models Implemented
This project primarily focuses on:
- **Data Visualization** using Matplotlib and Seaborn libraries.
- **User Interaction** through a simple browser-based interface powered by Streamlit.
- **Data Processing** using Pandas for efficient manipulation of movie datasets.

## Libraries Needed
- `streamlit`
- `pandas`
- `matplotlib`
- `seaborn`
- `openpyxl` (if using Excel files)

## Usage
Run the Streamlit application using the following command:

```bash
streamlit run your_script_name.py
# or
python -m streamlit run your_script_name.py
```

Before running, update the path passed to `pd.read_excel` in the script so that it points to your local copy of the Excel file.

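One way to avoid editing an absolute path on every machine is to resolve the Excel file relative to the script's own location. The sketch below assumes `MovieRatings.xlsx` is stored in the same folder as the script; the helper names `resolve_dataset_path` and `require_file` are hypothetical and not part of this project.

```python
from pathlib import Path


def resolve_dataset_path(script_file: str, filename: str = "MovieRatings.xlsx") -> Path:
    """Return the path of `filename` located next to `script_file`.

    Hypothetical helper: assumes the dataset is stored beside the script.
    """
    return Path(script_file).resolve().parent / filename


def require_file(path: Path) -> Path:
    """Raise a clear error if the dataset is missing, otherwise return the path."""
    if not path.is_file():
        raise FileNotFoundError(f"Dataset not found: {path}")
    return path


if __name__ == "__main__":
    dataset = resolve_dataset_path(__file__)
    print(dataset)
    # In the app, the checked path could then be handed to Pandas:
    # data = pd.read_excel(require_file(dataset))
```

Checking for the file before reading it lets the app show a readable message instead of a long traceback when the dataset has not been copied into place.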
## How to Use
- Start the application using the command above.
- When the application loads, it shows a title and a short description of the project.
- The movie dataset is read from the Excel file at the path set in the script.
- Choose a visualization type (`Barchart`, `PieChart`, or `Histogram`) from the dropdown menu.
- Click the "Submit" button to generate the selected graph from the movie data.

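The select-then-submit flow in the steps above amounts to mapping the chosen label to a plotting function. Below is a minimal sketch of that dispatch using a dictionary. The three functions are placeholders standing in for the real chart code, and the names `CHART_HANDLERS` and `render_chart` are illustrative assumptions, not part of this project. The label strings are the ones the script passes to its dropdown.

```python
from typing import Callable, Dict


def show_barchart() -> str:
    return "bar chart drawn"  # placeholder for the real plotting code


def show_piechart() -> str:
    return "pie chart drawn"  # placeholder for the real plotting code


def show_histogram() -> str:
    return "histogram drawn"  # placeholder for the real plotting code


# Map each dropdown label to the function that handles it.
CHART_HANDLERS: Dict[str, Callable[[], str]] = {
    "Barchart": show_barchart,
    "PieChart": show_piechart,
    "Histogram": show_histogram,
}


def render_chart(selected_option: str) -> str:
    """Call the handler for the selected label, or report an invalid choice."""
    handler = CHART_HANDLERS.get(selected_option)
    if handler is None:
        return "No valid option selected."
    return handler()
```

With a table like this, supporting a new chart type would only require one more dictionary entry rather than another `elif` branch.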
## Conclusion
This Movie Data Analysis Application serves as an educational tool for understanding data visualization techniques and provides a practical introduction to using Streamlit for building interactive applications. Its design is tailored for beginners to explore data analysis concepts effectively while leveraging Python's powerful data processing libraries.
## License
This project is licensed under the MIT License. See the LICENSE file for more details.
## Contributing
Feel free to submit issues or pull requests if you have suggestions for improvements or new features!
@@ -0,0 +1,87 @@
import streamlit as st
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Set the title of the application
st.title("Movie Data Analysis")
st.write(
    "This project analyses movie data such as reviews and meta scores and displays it in the form of graphs."
)

# Load your dataset (update this absolute path to match your machine)
data = pd.read_excel(
    r"C:\Users\Ananya\OneDrive\Documents\GitHub\PyVerse\DataVizLearnig\Movie Data Visualization\MovieRatings.xlsx"
)

# Create a dropdown menu for selecting options
options = ["Barchart", "PieChart", "Histogram"]
selected_option = st.selectbox("Choose an option", options)


# Function to create and display a bar chart
def show_barchart():
    ratings = data["IMDB_Ratings"].value_counts()
    plt.figure(figsize=(10, 6))
    cmap = plt.get_cmap("Blues")
    # One color per bar; the +5 offset skips the palest shades of the colormap
    colors = [cmap((i + 5) / (len(ratings) + 5)) for i in range(len(ratings))]

    plt.bar(ratings.index.astype(str), ratings.values, width=0.5, color=colors)
    plt.xlabel("IMDB Ratings")
    plt.ylabel("Frequency")
    plt.title("Bar Chart for Ratings vs Range")
    st.pyplot(plt)


# Function to create and display a pie chart
def show_piechart():
    data["Genre"] = data["Genre"].str.split(", ")  # Split genres if multiple
    exploded_df = data.explode("Genre")
    genre_counts = exploded_df["Genre"].value_counts()

    # Create a gradient color from dark pink to light pink
    colors = [
        (1, 0.08, 0.58, 1 - i / len(genre_counts)) for i in range(len(genre_counts))
    ]

    plt.figure(figsize=(8, 6))
    plt.pie(
        genre_counts,
        colors=colors,
        autopct="%1.1f%%",
        shadow=True,
        wedgeprops={"edgecolor": "black"},  # Black border between slices
    )
    plt.title("Genre Distribution of Top 1000 Movies")
    st.pyplot(plt)


# Function to create and display a histogram
def show_histogram():
    # Ensure 'Meta_score' is in the correct format
    values = pd.to_numeric(data["Meta_score"], errors="coerce")

    # Check for NaN values and handle them (e.g., drop or fill)
    values = values.dropna()  # Drop NaN values

    # Create a histogram for Metascore with KDE
    plt.figure(figsize=(10, 6))
    sns.histplot(values, bins=30, kde=True, color="orange", stat="density", alpha=0.5)
    plt.title("Histogram of Metascore with KDE")
    plt.xlabel("Metascore")
    plt.ylabel("Density")
    plt.grid(axis="y")

    st.pyplot(plt)


# Run the appropriate function based on user selection
if st.button("Submit"):
    if selected_option == "Barchart":
        show_barchart()
    elif selected_option == "PieChart":
        show_piechart()
    elif selected_option == "Histogram":
        show_histogram()
    else:
        st.error("No valid option selected.")
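
The bar chart function above needs exactly one color per bar. Below is a minimal sketch of computing evenly spaced colormap positions using only the standard library. The helper name `gradient_positions` and the 0.3 lower bound (chosen here to skip the palest shades) are assumptions for illustration, not part of the committed script.

```python
from typing import List


def gradient_positions(n: int, start: float = 0.3, stop: float = 1.0) -> List[float]:
    """Return n evenly spaced values in [start, stop], one per bar."""
    if n <= 0:
        return []
    if n == 1:
        return [stop]
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]


if __name__ == "__main__":
    print(gradient_positions(4))
    # With Matplotlib, each position could be turned into a color:
    # cmap = plt.get_cmap("Blues")
    # colors = [cmap(p) for p in gradient_positions(len(ratings))]
```

Deriving the list length from the number of bars keeps the colors and bars in step even when the dataset has only a handful of distinct ratings.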