Merge pull request #157 from halcyon-past/Whatsapp
Added Data Science Project - Whatsapp Chat Analyzer
UTSAVS26 authored Oct 7, 2024
2 parents 3e932ac + 88a4b2f commit befbf2d
Showing 15 changed files with 431 additions and 0 deletions.
78 changes: 78 additions & 0 deletions Data_Science/Whatsapp_Chat_Analyzer/README.md
@@ -0,0 +1,78 @@
## **WhatsApp Chat Analyzer**

### 🎯 **Goal**

This project analyzes exported WhatsApp chat data to surface insights such as user engagement, emoji usage, and response patterns. It presents them as interactive visualizations for exploring user activity within a chat.

### 🧵 **Dataset**

The dataset is provided by the user in the form of a `.txt` or `.zip` file, which contains WhatsApp chat data exported from the messaging app.

- Sample chats are provided in the [Sample_Data](./Sample_Data/) folder.
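
Accepting either a `.txt` or a `.zip` upload could be handled roughly as in the sketch below. It is an illustration only, not the project's actual code: the function name `read_chat_text` is hypothetical, and it assumes the export is UTF-8 encoded and that the first `.txt` member of the archive is the chat log.

```python
import io
import zipfile


def read_chat_text(filename: str, data: bytes) -> str:
    """Return the chat text from raw upload bytes (.txt or .zip).

    Hypothetical helper: assumes UTF-8 text and, for archives, that the
    first .txt member is the chat export.
    """
    if filename.lower().endswith(".zip"):
        # Open the archive directly from memory, without writing to disk.
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            txt_names = [
                name for name in archive.namelist()
                if name.lower().endswith(".txt")
            ]
            if not txt_names:
                raise ValueError("no .txt file found in the archive")
            data = archive.read(txt_names[0])
    return data.decode("utf-8")
```

In a Streamlit app, the bytes would typically come from the uploaded file object; that wiring is left out here to keep the sketch self-contained.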

### 🧾 **Description**

This project allows users to upload WhatsApp chat files, which are then processed to extract useful information, including message counts, emoji usage, and response times. The data is visualized using interactive charts and plots, enabling easy exploration of user behavior in the chat.

### 🧮 **What I have done!**

1. Implemented a file upload feature to allow users to upload WhatsApp chat files (`.txt` or `.zip`).
2. Processed the uploaded chat data to extract relevant information using regular expressions.
3. Extracted details such as date, time, author, message content, and emojis.
4. Analyzed message data to calculate statistics like messages sent, average response time, word count, and emoji usage.
5. Visualized the extracted information using various charts such as pie charts, bar charts, and polar plots.
6. Generated a word cloud of the most frequently used words in the chat.
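
The parsing and statistics steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the pattern targets the `dd/mm/yy, h:mm am/pm - Author: message` layout seen in the sample data, the names `Message`, `parse_line`, `message_counts`, and `average_response_minutes` are hypothetical, and "response time" is assumed to mean the gap between a message and the preceding message from a different author.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime
from typing import NamedTuple, Optional


class Message(NamedTuple):
    when: datetime
    author: str
    text: str


# Assumed layout: "24/01/24, 12:22 pm - John Doe: Hey there!"
LINE_RE = re.compile(
    r"^(\d{1,2}/\d{1,2}/\d{2,4}), (\d{1,2}:\d{2})\s?([ap]m) - ([^:]+): (.*)$",
    re.IGNORECASE,
)


def parse_line(line: str) -> Optional[Message]:
    """Parse one exported line; return None for lines that do not match
    (for example system notices or continuation lines)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    date_s, time_s, ampm, author, text = match.groups()
    day, month, year = (int(part) for part in date_s.split("/"))
    hour, minute = (int(part) for part in time_s.split(":"))
    # Convert the 12-hour clock to 24-hour (12 am -> 0, 12 pm -> 12).
    hour = hour % 12 + (12 if ampm.lower() == "pm" else 0)
    if year < 100:
        year += 2000  # assumption: two-digit years are in the 2000s
    return Message(datetime(year, month, day, hour, minute), author, text)


def message_counts(messages: list) -> Counter:
    """Number of messages sent by each author."""
    return Counter(msg.author for msg in messages)


def average_response_minutes(messages: list) -> dict:
    """Mean gap in minutes before each author's replies to someone else."""
    gaps = defaultdict(list)
    for prev, cur in zip(messages, messages[1:]):
        if cur.author != prev.author:
            gaps[cur.author].append((cur.when - prev.when).total_seconds() / 60)
    return {author: sum(g) / len(g) for author, g in gaps.items()}
```

Because exported timestamps only have minute resolution, response times computed this way are coarse; replies sent within the same minute show up as a gap of zero.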

### 🚀 **Models Implemented**

No machine learning models were used in this project. Instead, the focus is on text processing, data analysis, and visualization of chat data.

### 📚 **Libraries Needed**

- `streamlit` - For creating the web application.
- `pandas` - For handling and processing data.
- `plotly.express` and `plotly.graph_objs` - For creating interactive visualizations.
- `matplotlib.pyplot` and `seaborn` - For additional plotting and visualization.
- `wordcloud` - For generating a word cloud.
- `nltk` - For stopwords and text processing.
- `emojis` - For extracting emojis from messages.
- `collections.Counter` - For counting occurrences of emojis.
- `numpy` - For numerical operations.
- `requests` - For downloading custom stopwords.
- `re` - For regular expressions.
- `zipfile`, `io` - For handling file uploads in `.zip` format.
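
To illustrate how `collections.Counter` fits into emoji counting, here is a rough sketch that uses only the standard library. It deliberately swaps in a simple heuristic (Unicode category `So`, "Symbol, other") in place of the `emojis` package listed above, so it is an approximation rather than what the project does: it can pick up non-emoji symbols and it splits multi-code-point emoji into their parts. The function names are hypothetical.

```python
import unicodedata
from collections import Counter


def extract_emojis(text: str) -> list:
    """Return characters that look like emoji.

    Heuristic only: keeps code points in Unicode category "So"
    (Symbol, other), which covers most single-code-point emoji.
    """
    return [ch for ch in text if unicodedata.category(ch) == "So"]


def emoji_counts(messages: list) -> Counter:
    """Count emoji occurrences across a list of message strings."""
    counts = Counter()
    for text in messages:
        counts.update(extract_emojis(text))
    return counts
```

Calling `emoji_counts(...).most_common(10)` on the message texts would give the kind of ranking an emoji distribution chart could be built from.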

### 📊 **Exploratory Data Analysis Results**

Below are some visualizations derived from the chat data:

1. **Basic Information**
![Sample Messages](images/sample_messages.png)

2. **Emoji Distribution**
![Emoji Distribution Pie Chart](images/emoji_distribution.png)

3. **Emoji Usage by Author**
![Emoji Usage Bar Chart](images/emoji_usage_author.png)

4. **Top 10 Days With Most Messages**
![Messages Bar Chart](images/top_days_messages.png)

5. **Message Distribution by Day**
![Messages Polar Chart](images/message_distribution.png)

6. **Word Cloud**
![Word Cloud](images/word_cloud.png)

### 📈 **Performance of the Models based on the Accuracy Scores**

No machine learning models were used in this project. It is purely focused on data analysis and visualization of WhatsApp chat data.

### 📢 **Conclusion**

This project successfully provides an interactive analysis of WhatsApp chat data, including statistics on user engagement, emoji usage, and response times. Users can explore individual and group message patterns, and the visualizations make it easy to understand user behavior. This type of analysis can be helpful for studying group dynamics or understanding communication patterns.

### ✒️ **Your Signature**

Created by Aritro Saha -
[Website](https://aritro.tech/) | [GitHub](https://github.com/halcyon-past) | [LinkedIn](https://www.linkedin.com/in/aritro-saha/)
22 changes: 22 additions & 0 deletions Data_Science/Whatsapp_Chat_Analyzer/Sample_Data/sample_data_1.txt
@@ -0,0 +1,22 @@
24/01/24, 12:22 pm - John Doe: Hey there! 🌟
24/01/24, 12:22 pm - Jane Smith: How’s it going? 😊
24/01/24, 12:23 pm - John Doe: Just chilling! What about you? 🛋️
24/01/24, 12:23 pm - Jane Smith: Same here! Watching a movie 🎬
24/01/24, 12:24 pm - John Doe: Which one? 🍿
24/01/24, 12:25 pm - Jane Smith: A new action flick! 💥
24/01/24, 12:26 pm - John Doe: Nice! I love action movies! 🥋
24/01/24, 12:27 pm - Jane Smith: Me too! 😍
24/01/24, 12:28 pm - John Doe: Let’s watch together sometime! 🎥
24/01/24, 12:29 pm - Jane Smith: Definitely! 👍
24/01/24, 12:30 pm - John Doe: What snacks should we get? 🍕
24/01/24, 12:31 pm - Jane Smith: Popcorn and soda! 🥤
24/01/24, 12:32 pm - John Doe: Sounds perfect! 😋
24/01/24, 12:33 pm - Jane Smith: Can’t wait! ⏳
24/01/24, 12:34 pm - John Doe: Same! 🎉
24/01/24, 12:35 pm - Jane Smith: Let’s set a date! 📅
24/01/24, 12:36 pm - John Doe: How about Friday? 🤔
24/01/24, 12:37 pm - Jane Smith: Friday works! ✅
24/01/24, 12:38 pm - John Doe: Awesome! 🙌
24/01/24, 12:39 pm - Jane Smith: Looking forward to it! 🥳
24/01/24, 12:40 pm - John Doe: Same here! 🌈
24/01/24, 12:41 pm - Jane Smith: Talk later! 👋
15 changes: 15 additions & 0 deletions Data_Science/Whatsapp_Chat_Analyzer/Sample_Data/sample_data_2.txt
@@ -0,0 +1,15 @@
24/01/24, 1:01 pm - Mike Johnson: What’s up? 🌞
24/01/24, 1:02 pm - Emily Davis: Not much! Just working! 💻
24/01/24, 1:03 pm - Mike Johnson: Same! Just finished a project. 📈
24/01/24, 1:04 pm - Emily Davis: Nice! Celebrate later? 🎊
24/01/24, 1:05 pm - Mike Johnson: Sounds great! 🎉
24/01/24, 1:06 pm - Emily Davis: Where should we go? 🍔
24/01/24, 1:07 pm - Mike Johnson: How about that new cafe? ☕
24/01/24, 1:08 pm - Emily Davis: I’ve heard good things! 👍
24/01/24, 1:09 pm - Mike Johnson: Let’s do it! 🥳
24/01/24, 1:10 pm - Emily Davis: What time? 🕒
24/01/24, 1:11 pm - Mike Johnson: 6 PM? ⏰
24/01/24, 1:12 pm - Emily Davis: Perfect! 🤗
24/01/24, 1:13 pm - Mike Johnson: Can’t wait! 🎈
24/01/24, 1:14 pm - Emily Davis: Me neither! 🎊
24/01/24, 1:15 pm - Mike Johnson: Talk to you later! 👋
16 changes: 16 additions & 0 deletions Data_Science/Whatsapp_Chat_Analyzer/Sample_Data/sample_data_3.txt
@@ -0,0 +1,16 @@
24/01/24, 2:00 pm - Sarah Brown: Just got back from the gym! 🏋️
24/01/24, 2:01 pm - Kevin White: Nice! How was it? 💪
24/01/24, 2:02 pm - Sarah Brown: Great! Feeling pumped! 🔥
24/01/24, 2:03 pm - Kevin White: Awesome! What did you do? 🏃
24/01/24, 2:04 pm - Sarah Brown: Cardio and weights! 🏋️‍♀️
24/01/24, 2:05 pm - Kevin White: I need to hit the gym too! 🕔
24/01/24, 2:06 pm - Sarah Brown: Yes! Let’s go together! 🤝
24/01/24, 2:07 pm - Kevin White: When are you free? 📅
24/01/24, 2:08 pm - Sarah Brown: Tomorrow evening? 🌇
24/01/24, 2:09 pm - Kevin White: Sounds good! 🎯
24/01/24, 2:10 pm - Sarah Brown: Great! Let’s crush it! 💥
24/01/24, 2:11 pm - Kevin White: Yes! Motivation on! 🏆
24/01/24, 2:12 pm - Sarah Brown: I’ll bring the energy! ⚡
24/01/24, 2:13 pm - Kevin White: I’ll bring the snacks! 🍌
24/01/24, 2:14 pm - Sarah Brown: Perfect combo! 🙌
24/01/24, 2:15 pm - Kevin White: Talk later! 👋
13 changes: 13 additions & 0 deletions Data_Science/Whatsapp_Chat_Analyzer/Sample_Data/sample_data_4.txt
@@ -0,0 +1,13 @@
24/01/24, 3:00 pm - Jessica Taylor: Just finished a book! 📚
24/01/24, 3:01 pm - Brian Miller: Which one? 🤔
24/01/24, 3:02 pm - Jessica Taylor: A mystery novel! 🔍
24/01/24, 3:03 pm - Brian Miller: Cool! I love mysteries! 😍
24/01/24, 3:04 pm - Jessica Taylor: You should read it! 📖
24/01/24, 3:05 pm - Brian Miller: I will! Any recommendations? 💡
24/01/24, 3:06 pm - Jessica Taylor: Definitely! I’ll send you the link! 🔗
24/01/24, 3:07 pm - Brian Miller: Thanks! 😊
24/01/24, 3:08 pm - Jessica Taylor: Let’s discuss it after you read! 🗣️
24/01/24, 3:09 pm - Brian Miller: Sounds good! 🥳
24/01/24, 3:10 pm - Jessica Taylor: Can’t wait! 🎊
24/01/24, 3:11 pm - Brian Miller: Me neither! 🙌
24/01/24, 3:12 pm - Jessica Taylor: Talk soon! 👋
14 changes: 14 additions & 0 deletions Data_Science/Whatsapp_Chat_Analyzer/Sample_Data/sample_data_5.txt
@@ -0,0 +1,14 @@
24/01/24, 4:00 pm - Alex Wilson: Heading out for a walk! 🚶
24/01/24, 4:01 pm - Mia Clark: Nice! Enjoy! 🌳
24/01/24, 4:02 pm - Alex Wilson: Thanks! 🌼
24/01/24, 4:03 pm - Mia Clark: Any plans for the weekend? 📅
24/01/24, 4:04 pm - Alex Wilson: Thinking of a picnic! 🥪
24/01/24, 4:05 pm - Mia Clark: That sounds fun! ☀️
24/01/24, 4:06 pm - Alex Wilson: Want to join? 🎉
24/01/24, 4:07 pm - Mia Clark: Yes! Count me in! ✌️
24/01/24, 4:08 pm - Alex Wilson: Great! I’ll bring snacks! 🍩
24/01/24, 4:09 pm - Mia Clark: I’ll bring drinks! 🥤
24/01/24, 4:10 pm - Alex Wilson: Perfect combo! 🎯
24/01/24, 4:11 pm - Mia Clark: Can’t wait for Saturday! ⏳
24/01/24, 4:12 pm - Alex Wilson: Me neither! 🥳
24/01/24, 4:13 pm - Mia Clark: See you then! 👋
Binary file not shown.
Binary file not shown.