Sentimental-Analysis-using-Amazon-Review-Data

Predicting a review's numeric rating based on the textual review is a quintessential multiclass text classification problem and an interesting research topic in Natural Language Processing. New advances in NLP including the development of Glove, and Word2Vec, have increased the range of approaches available to address this question, and insights gained on this problem can generalize across sentiment analysis and NLP multiclass classification problems in general. We leveraged the extensive corpus of multi-class labeled Amazon data to apply sentiment analysis.

Objective:

 Given a text book review, predict one of the three (positive, neutral, negative) sentiment classes.

This project was completed using Jupyter Notebook and Python with Pandas, NumPy, Matplotlib, Scikit-Learn, nltk, spacy and textblob.

`Streamlit Deployment` Deployed Link

Features

unique_id
asin
product_name
product_type
helpful rating
title
date
reviewer
reviewer_location
review_text

Exploratory Data Analysis

First, I processed the data of review column with following steps:

1. Lower Casing and Removed punctuations by defining a new function.

2. With nltk, I imported stop words and removed them from data cleaned so far, like-i,me,myself,himself,she etc.

3. I removed frequent words from data by defining a new function.

4. Then I stemmarised the data, which means to presenting the summary of generated data in an easily comprehensible
   and informative manner.
   
5. I lemmatizd the data, which is the process of grouping together the inflected forms of a word so they can be analysed as
   a single item, identified by the words lemma
  
6. Then I removed urls and tags from the data.

7. After all the above data processing steps I calculated the polarity and subjective of each review with analysis labels 
  (positive - greater than 0, negative- less than 0 and neutral- equals to 0)
  
    Positive    30296
    Negative     8659
    Neutral      3045
    
8. I started data modeling, divided the data 33600 training and 8400 testing 

9. After vectorising, I got Accuracy score on training dataset = 0.9995535714285714 and Accuracy on 
  Test dataset =0.9942857142857143
  
10. I performed prediction on test data and got F1 score= 0.9871397964692025

Results:

Conclusion:

This complete notebook represents an exploratory data analysis (EDA) process to put together the patterns related to the Reviews. I have dealt with the reviews data "multi-class labeled Amazon data to apply sentiment analysis" and performed sentimental analysis using the textblob to calculate the (positive, neutral, negative) polarity values. The purpose of this study is to analyze the dataset and obtain important insights form it. After performing the Data preprocessing tasks, I have developed an model using the KNN Algorithm for calculating the accuracy of the model. Further, using the streamlit web application I have deployed my model to make it available to see my work.

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
Images		Images
Datasets converted.ipynb		Datasets converted.ipynb
Datasets.zip		Datasets.zip
KNNmodel.pkl		KNNmodel.pkl
LICENSE		LICENSE
Poster Multi-Class Text Sentiment Analysis Using Amazon Review Data.pdf		Poster Multi-Class Text Sentiment Analysis Using Amazon Review Data.pdf
README.md		README.md
Sentimental Analysis of Amazon Review.ipynb		Sentimental Analysis of Amazon Review.ipynb
app.py		app.py
books.zip		books.zip
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Sentimental-Analysis-using-Amazon-Review-Data

Objective:

`Streamlit Deployment` Deployed Link

Features

Exploratory Data Analysis

Results:

Conclusion:

About

Releases

Packages

Languages

License

uttej2001/Multiclass-Text-Sentiment-Analysis

Folders and files

Latest commit

History

Repository files navigation

Sentimental-Analysis-using-Amazon-Review-Data

Objective:

Streamlit Deployment Deployed Link

Features

Exploratory Data Analysis

Results:

Conclusion:

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

`Streamlit Deployment` Deployed Link

Packages