
This is a natural language processing (NLP) project that performs sentiment analysis by classifying tweets as positive or negative with machine learning models. It covers text mining, text analysis, data analysis, and data visualisation.


princebhatt9588/Twitter-Tweets-Sentiment-Analysis


Twitter-Sentiment-Analysis

Introduction

  • Natural Language Processing (NLP) is a prominent area of research in data science, with sentiment analysis being one of its common applications.
  • Sentiment analysis is widely used in business, for example in opinion polling and in shaping marketing strategies.
  • NLP enables the rapid processing of large text datasets, saving time compared to manual analysis.

Understand the Problem Statement

  • The objective is to detect hate speech in tweets: each tweet is classified as racist/sexist (label '1') or not racist/sexist (label '0').
  • The evaluation metric for this task is the F1-Score.
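As a reminder of what the F1-Score measures, here is a minimal sketch in plain Python. It treats label 1 as the positive class, as in the problem statement above; the function name and the toy labels are illustrative assumptions, not taken from this repository.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, made up for illustration only.
y_true = [1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1]
score = f1_score(y_true, y_pred)
print(round(score, 3))
```

Because hate-speech tweets are usually a small minority class, F1 tends to be more informative than plain accuracy: a model that always predicts label 0 can score high accuracy but gets an F1 of zero.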

Tweets Preprocessing and Cleaning

  • Preprocessing is essential to prepare text data for mining and for machine learning algorithms.
  • Data cleaning involves structuring the data, similar to organizing items in an office space for easy access.
  • The objective is to remove noise, such as punctuation, special characters, numbers, and less relevant terms, from the text.
  • Proper preprocessing yields a higher-quality feature space when numeric features are extracted.
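The cleaning steps above could be sketched roughly as follows, using only the standard library. The specific choices here are assumptions for illustration: removing user handles and URLs, keeping the `#` character so hashtags survive, and dropping words of three characters or fewer as a crude stand-in for "less relevant terms". The function name `clean_tweet` is hypothetical.

```python
import re


def clean_tweet(text):
    """Return a lower-cased tweet with noise removed (illustrative rules)."""
    text = re.sub(r"https?://\S+", " ", text)   # assumption: drop URLs
    text = re.sub(r"@\w+", " ", text)           # assumption: drop user handles
    # Drop punctuation, digits and special characters, but keep '#'.
    text = re.sub(r"[^a-zA-Z#\s]", " ", text)
    # Assumption: very short words carry little signal, so drop them.
    tokens = [w.lower() for w in text.split() if len(w) > 3]
    return " ".join(tokens)


example = "@user Loving the new phone!!! 100% worth it :) #happy https://t.co/xyz"
cleaned = clean_tweet(example)
print(cleaned)
```

A length cutoff is a blunt tool; a stop-word list would be a more targeted way to remove uninformative terms.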

Story Generation and Visualization from Tweets

  • Exploring and visualizing cleaned tweets is vital for gaining insights.
  • Common questions to consider during exploration:
    • What are the most common words in the entire dataset?
    • What are the most common words in negative and positive tweets?
    • How many hashtags are there in a tweet?
    • Which trends are associated with the dataset and the sentiments?
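The questions above can be explored with simple word and hashtag counts. The sketch below assumes the tweets are already cleaned and held as `(text, label)` pairs; the toy rows, the helper names, and the choice to exclude hashtags from the word counts are all illustrative assumptions.

```python
import re
from collections import Counter

# Toy rows, made up for illustration: (cleaned text, label).
rows = [
    ("lovely day with friends #love #smile", 0),
    ("lovely weather today #love", 0),
    ("rude remark placeholder #hate", 1),
]


def top_words(rows, label=None, n=3):
    """Most common non-hashtag words, optionally restricted to one label."""
    counts = Counter()
    for text, lab in rows:
        if label is None or lab == label:
            counts.update(w for w in text.split() if not w.startswith("#"))
    return counts.most_common(n)


def extract_hashtags(text):
    """Return hashtag names without the leading '#'."""
    return re.findall(r"#(\w+)", text)


overall_top = top_words(rows, n=1)
hashtags_per_tweet = [len(extract_hashtags(text)) for text, _ in rows]
hashtags_by_label = {
    lab: Counter(tag for text, l in rows if l == lab for tag in extract_hashtags(text))
    for lab in (0, 1)
}
print(overall_top, hashtags_per_tweet, hashtags_by_label)
```

The same counters can feed a bar chart or a word cloud when a visual summary is preferred.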

Conclusion

  • The sentiment analysis approach involved preprocessing, data exploration, and feature extraction using Bag-of-Words and TF-IDF.
  • Models were built using these feature sets to classify tweets.
  • Readers are encouraged to share their experiences and discuss additional methods for feature extraction in the comments or discussion portal.
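To make the two feature sets concrete, here is a small pure-Python sketch of Bag-of-Words counts and a textbook TF-IDF weighting (term frequency times log(N / document frequency)). This is an assumption about the general technique, not the exact code or library used in this repository, and the function names and toy documents are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count matrix) with one row per document."""
    vocab = sorted({w for d in docs for w in d.split()})
    rows = []
    for d in docs:
        counts = Counter(d.split())
        rows.append([counts[w] for w in vocab])
    return vocab, rows


def tf_idf(docs):
    """Textbook TF-IDF: (count / doc length) * log(n_docs / doc frequency)."""
    vocab, bow = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each word.
    df = [sum(1 for row in bow if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    rows = []
    for row in bow:
        total = sum(row) or 1  # avoid dividing by zero for empty documents
        rows.append([(c / total) * w for c, w in zip(row, idf)])
    return vocab, rows


# Toy documents, made up for illustration only.
docs = ["good day", "good mood", "bad day"]
vocab, bow = bag_of_words(docs)
_, tfidf = tf_idf(docs)
print(vocab)
print(bow)
```

Libraries such as scikit-learn provide `CountVectorizer` and `TfidfVectorizer` for the same purpose; `TfidfVectorizer` applies a smoothed, normalized variant by default, so its numbers will differ from this sketch.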
