Applies a Simple Neural Network (SNN), KNN, Naive Bayes, Decision Tree, VADER, and BERT to Shopee reviews written in "Malaysia Rojak", the informal mix of Malay and English, in order to classify their sentiment.
This project applies sentiment analysis techniques, from classical classifiers such as Naive Bayes and KNN to the transformer-based BERT, to customer feedback from the "W Baju Melayu" store on Shopee. It aims to extract insights from reviews that mix Malay and English, supporting informed decisions on improving products and services.
- Source: W Baju Melayu store on Shopee
- Size: 400 reviews
- Attributes: User ID, Rating (1 to 5), Review
- Data Cleaning and Formatting: Removing noise and standardising text.
- Data Translation: Converting text from one language to another.
- Tokenisation: Splitting text into individual words or tokens.
- Normalisation: Standardising text by adjusting formats or cases.
- POS Tagging: Assigning a part of speech to each word.
- Data Stemming: Reducing words to their root forms.
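The preprocessing steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual pipeline: the function names, the suffix list, and the sample review are all assumptions made for the example. Translation and POS tagging are left out because they depend on external resources that this README does not name.

```python
import re

# Hypothetical suffix list, for illustration only. A real Malay/English
# stemmer would rely on a proper linguistic rule set.
SUFFIXES = ("ing", "ed", "nya", "kan")


def clean(text: str) -> str:
    """Data cleaning: drop URLs and every non-letter character."""
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"[^A-Za-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenise(text: str) -> list[str]:
    """Tokenisation: split on whitespace."""
    return text.split()


def normalise(token: str) -> str:
    """Normalisation: lower-case and collapse letters repeated 3+ times."""
    return re.sub(r"(.)\1{2,}", r"\1", token.lower())


def stem(token: str) -> str:
    """Stemming: strip one known suffix if at least 3 letters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(review: str) -> list[str]:
    """Run cleaning, tokenisation, normalisation and stemming in order."""
    return [stem(normalise(t)) for t in tokenise(clean(review))]


# Invented sample review, not taken from the dataset.
print(preprocess("Baju sangat cantikkk!!! Fast shipping https://shopee.com.my/x"))
# ['baju', 'sangat', 'cantik', 'fast', 'shipp']
```

The crude suffix stripping turns "shipping" into "shipp", which shows why a dedicated stemmer is usually preferable for real data.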
- Simple Neural Network: Basic model learning patterns through layers of neurons.
- Naive Bayes: Probabilistic classifier that applies Bayes' theorem under the assumption that features are conditionally independent.
- K-Nearest Neighbours (KNN): Classifies a sample by the majority class of its nearest neighbours.
- Decision Tree: Splits data into branches to make decisions or classifications.
- VADER: Rule-based tool for sentiment analysis in social media text.
- BERT: Transformer model for bidirectional natural language understanding.
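To make one of the listed models concrete, here is a minimal multinomial Naive Bayes classifier written with the standard library only. It is a sketch under stated assumptions: the class name, the Laplace smoothing constant of 1, and the four toy training reviews are invented for illustration and do not reflect the project's code or data.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Multinomial Naive Bayes over token lists, with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for w in tokens:
                if w not in self.vocab:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[label][w]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data mixing Malay and English, not drawn from the dataset.
train_docs = [
    ["baju", "cantik", "selesa"],
    ["fast", "delivery", "good", "quality"],
    ["kain", "nipis", "kecewa"],
    ["slow", "delivery", "bad", "quality"],
]
train_labels = ["positive", "positive", "negative", "negative"]

model = TinyNaiveBayes().fit(train_docs, train_labels)
print(model.predict(["cantik", "good"]))        # positive
print(model.predict(["kain", "nipis", "slow"]))  # negative
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing keeps a single unseen token from zeroing out a class.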
Logistic regression: achieved an F1 score of 0.7778, indicating a strong balance between precision and recall.
LSTM model: recorded the lowest F1 score, 0.3333, indicating a poor balance between precision and recall.
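For reference, the F1 score is the harmonic mean of precision and recall. The sketch below computes it from confusion-matrix counts. The counts are hypothetical and were chosen only because they happen to reproduce the two reported values; they are not the project's actual confusion matrices.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2 * precision * recall / (precision + recall)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 7 true positives, 2 false positives, 2 false negatives.
print(round(f1_score(tp=7, fp=2, fn=2), 4))  # 0.7778

# Hypothetical counts: 1 true positive, 2 false positives, 2 false negatives.
print(round(f1_score(tp=1, fp=2, fn=2), 4))  # 0.3333
```

Because the harmonic mean is pulled toward the smaller of the two values, a model cannot reach a high F1 by excelling at precision or recall alone.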
This project is part of an academic course and is intended solely for educational purposes. It may reference copyrighted materials, which are used exclusively for scholarly purposes. For guidance on sharing or distributing this work, consult your instructor or institution.
For more details, see the LICENSE file.
Dataset: W Baju Melayu Shopee Review