Added Movie data Visualization

Added a new project to this repo
UTSAVS26 · Oct 8, 2024 · 551df14
1 parent c2d33bf
commit 551df14
Showing 3 changed files with 140 additions and 0 deletions.
diff --git a/DataVizLearnig/Movie Data Visualization/MovieRatings.xlsx b/DataVizLearnig/Movie Data Visualization/MovieRatings.xlsx
diff --git a/DataVizLearnig/Movie Data Visualization/Readme.md b/DataVizLearnig/Movie Data Visualization/Readme.md
@@ -0,0 +1,53 @@
+# Movie Data Analysis Application
+
+## Goal
+To create an interactive application that analyzes movie data, including reviews and meta scores, and presents the information visually through various types of graphs.
+
+## Description
+This project implements a movie data analysis application using Streamlit, allowing users to explore movie ratings and trends through visualizations. Users can select from different graph types to visualize the movie dataset, facilitating a better understanding of the underlying data trends.
+
+## What I Have Done
+1. Developed an interactive user interface using Streamlit for visualizing movie data.
+2. Implemented functions to create various visualizations: bar charts, pie charts, and histograms.
+3. Enabled dynamic data loading from an Excel file containing movie ratings and metadata.
+4. Organized code into functions for better clarity and maintainability.
+5. Incorporated basic error handling for data input and visualization selection.
+
+## Models Implemented
+This project primarily focuses on:
+- **Data Visualization** using Matplotlib and Seaborn libraries.
+- **User Interaction** through a simple browser-based interface powered by Streamlit.
+- **Data Processing** using Pandas for efficient manipulation of movie datasets.
+
+## Libraries Needed
+- `streamlit`
+- `pandas`
+- `matplotlib`
+- `seaborn`
+- `openpyxl` (used by pandas to read `.xlsx` files)
+
+## Usage
+Run the Streamlit application using the following command:
+
+```bash
+streamlit run maincode.py
+# or
+python -m streamlit run maincode.py
+```
+Before running, make sure the Excel file path in `maincode.py` points to `MovieRatings.xlsx` on your machine.
+
+## How to Use
+- Start the application using the command above.
+- Upon loading the application, users will see a title and a description of the project.
+- The app loads the movie dataset from the Excel file path set in `maincode.py`.
+- Choose a visualization type (Bar Chart, Pie Chart, Histogram) from the dropdown menu.
+- Click the "Submit" button to generate the selected graph based on the movie data.
+
+## Conclusion
+This Movie Data Analysis Application serves as an educational tool for understanding data visualization techniques and provides a practical introduction to using Streamlit for building interactive applications. Its design is tailored for beginners to explore data analysis concepts effectively while leveraging Python's powerful data processing libraries.
+
+## License
+This project is licensed under the MIT License. See the LICENSE file for more details.
+
+## Contributing
+Feel free to submit issues or pull requests if you have suggestions for improvements or new features!
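The README mentions basic error handling for data input. As a minimal sketch of what such a check might look like, the loaded sheet's headers could be validated before any chart is drawn. The helper name `missing_columns` is hypothetical and not part of this commit; the column names are the ones `maincode.py` reads.

```python
# Columns that maincode.py reads from MovieRatings.xlsx
REQUIRED_COLUMNS = ("IMDB_Ratings", "Genre", "Meta_score")


def missing_columns(columns):
    """Return the required column names absent from `columns`.

    Hypothetical helper for illustration; not part of the commit.
    """
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Example with a header row that lacks the Metascore column
print(missing_columns(["Title", "IMDB_Ratings", "Genre"]))  # ['Meta_score']
```

In the app, this could be called as `missing_columns(data.columns)`, with a non-empty result reported through `st.error` before the dropdown is shown.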
diff --git a/DataVizLearnig/Movie Data Visualization/maincode.py b/DataVizLearnig/Movie Data Visualization/maincode.py
@@ -0,0 +1,87 @@
+from pathlib import Path
+
+import streamlit as st
+import pandas as pd
+import matplotlib.pyplot as plt
+import seaborn as sns
+
+# Set the title of the application
+st.title("Movie Data Analysis")
+st.write(
+    "This project analyses movie data, such as reviews and Metascores, and displays it in the form of graphs."
+)
+
+# Load the dataset from the Excel file that sits next to this script
+data = pd.read_excel(Path(__file__).resolve().parent / "MovieRatings.xlsx")
+
+# Create a dropdown menu for selecting options
+options = ["Barchart", "PieChart", "Histogram"]
+selected_option = st.selectbox("Choose an option", options)
+
+
+# Function to create and display a bar chart
+def show_barchart():
+    ratings = data["IMDB_Ratings"].value_counts().sort_index()  # order bars by rating
+    fig, ax = plt.subplots(figsize=(10, 6))
+    cmap = plt.get_cmap("Blues")
+    n = len(ratings)  # one color per bar, skipping the palest shades
+    colors = [cmap(0.3 + 0.7 * i / max(n - 1, 1)) for i in range(n)]
+    ax.bar(ratings.index.astype(str), ratings.values, width=0.5, color=colors)
+    ax.set_xlabel("IMDB Ratings")
+    ax.set_ylabel("Frequency")
+    ax.set_title("Frequency of IMDB Ratings")
+    st.pyplot(fig)
+
+
+# Function to create and display a pie chart
+def show_piechart():
+    # Split multi-genre entries into one row per genre without modifying `data`
+    genre_counts = data["Genre"].str.split(", ").explode().value_counts()
+
+    # Create a gradient from dark pink to light pink by lowering the alpha
+    colors = [
+        (1, 0.08, 0.58, 1 - i / len(genre_counts)) for i in range(len(genre_counts))
+    ]
+
+    fig, ax = plt.subplots(figsize=(8, 6))
+    ax.pie(
+        genre_counts,
+        labels=genre_counts.index,  # Name each slice after its genre
+        colors=colors,
+        autopct="%1.1f%%",
+        shadow=True,
+        wedgeprops={"edgecolor": "black"},  # Black border between slices
+    )
+    ax.set_title("Genre Distribution of Top 1000 Movies")
+    st.pyplot(fig)
+
+
+# Function to create and display a histogram
+def show_histogram():
+    # Convert 'Meta_score' to numeric; entries that cannot be parsed become NaN
+    values = pd.to_numeric(data["Meta_score"], errors="coerce")
+
+    # Drop missing scores so they do not affect the histogram
+    values = values.dropna()
+
+    # Create a histogram for Metascore with KDE
+    fig, ax = plt.subplots(figsize=(10, 6))
+    sns.histplot(values, bins=30, kde=True, color="orange", stat="density", alpha=0.5, ax=ax)
+    ax.set_title("Histogram of Metascore with KDE")
+    ax.set_xlabel("Metascore")
+    ax.set_ylabel("Density")
+    ax.grid(axis="y")
+
+    st.pyplot(fig)
+
+
+# Run the appropriate function based on user selection
+if st.button("Submit"):
+    if selected_option == "Barchart":
+        show_barchart()
+    elif selected_option == "PieChart":
+        show_piechart()
+    elif selected_option == "Histogram":
+        show_histogram()
+    else:
+        st.error("No valid option selected.")
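`show_piechart` relies on pandas' `str.split`, `explode`, and `value_counts` to count a multi-genre film once under each of its genres. The plain-Python sketch below mirrors that counting on a few made-up rows, which may help when checking the chart's percentages by hand. The sample strings and the `count_genres` name are illustrative assumptions, not data or code from the repository.

```python
from collections import Counter


def count_genres(genre_cells, sep=", "):
    """Count each genre once per row it appears in, skipping empty cells."""
    counts = Counter()
    for cell in genre_cells:
        if not cell:  # None or empty string: nothing to count
            continue
        counts.update(cell.split(sep))
    return counts


# Made-up sample rows in the "Genre" column's comma-separated format
sample = ["Drama", "Crime, Drama", "Action, Crime, Drama", None]
counts = count_genres(sample)
total = sum(counts.values())
for genre, n in counts.most_common():
    print(f"{genre}: {n} ({n / total:.1%})")
```

The percentages here are shares of all genre tags rather than shares of films, so they sum to 100% even though one film can carry several genres. The same holds for the `autopct` labels on the pie chart.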