Natural language processing (NLP) and visualization project using scraped Twitter data
This repository contains the files and code used to build and deploy the website for a data science project. The website visualizes the results of several Natural Language Processing (NLP) techniques applied to data scraped from Twitter. The main data set comes from searching #tesla; #ford was also scraped for comparison.
The website can be accessed here: https://tesla-nlp.herokuapp.com/
Here are brief explanations of the visualizations:
Unigram – Unigram Term Frequency Word Cloud.
Bigram – Bigram Term Frequency Word Cloud.
Trigram – Trigram Term Frequency Word Cloud.
TF-IDF – Term Frequency – Inverse Document Frequency Word Cloud.
Sentiment – Sentiment analysis based on the Bing lexicon. The analysis assigns positive and negative sentiment scores by tallying how many positive and negative words (as defined by the Bing lexicon) appear in the corpus.
Comparison – Comparing word clouds from #tesla and #ford.
Emotion – Emotion analysis based on the NRC lexicon. This is a radar chart that plots an area across the emotions defined by the lexicon.
Elon – An additional unigram term frequency word cloud shaped in Elon Musk’s likeness.
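For readers who want a feel for the numbers behind these visualizations, here is a minimal Python sketch of the three underlying calculations: n-gram term frequency, TF-IDF, and a lexicon-based sentiment tally. It is an illustration only and is not taken from this repository's code. The function names, the tokenizer, and the tiny word sets in the usage example are all hypothetical, and the TF-IDF step assumes each hashtag's tweets are treated as one document.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split it into word tokens, keeping hashtags."""
    return re.findall(r"#?[a-z0-9']+", text.lower())


def ngram_counts(tweets, n):
    """Count n-grams (n=1 unigram, n=2 bigram, n=3 trigram) across tweets."""
    counts = Counter()
    for tweet in tweets:
        tokens = tokenize(tweet)
        counts.update(
            " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return counts


def tf_idf(corpora):
    """Score unigrams per corpus.

    `corpora` maps a name (e.g. a hashtag) to a list of tweets.
    Assumption: each corpus counts as a single document, so a term
    that appears in every corpus gets a score of zero.
    """
    tf = {name: ngram_counts(tweets, 1) for name, tweets in corpora.items()}
    n_docs = len(tf)
    # Document frequency: in how many corpora does each term occur?
    df = Counter(term for counts in tf.values() for term in counts)
    scores = {}
    for name, counts in tf.items():
        total = sum(counts.values())
        scores[name] = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return scores


def sentiment_tally(tweets, positive, negative):
    """Tally how many tokens fall in the positive and negative word sets."""
    counts = ngram_counts(tweets, 1)
    pos = sum(c for term, c in counts.items() if term in positive)
    neg = sum(c for term, c in counts.items() if term in negative)
    return {"positive": pos, "negative": neg, "net": pos - neg}


if __name__ == "__main__":
    # Made-up tweets and stand-in word lists, not real scraped data
    # and not the actual Bing lexicon.
    tesla = ["Love the new #tesla", "the #tesla battery is great"]
    ford = ["the #ford truck broke down"]
    print(ngram_counts(tesla, 2).most_common(3))
    print(tf_idf({"tesla": tesla, "ford": ford})["tesla"])
    print(sentiment_tally(tesla, {"love", "great"}, {"broke"}))
```

The word clouds size each term by counts or scores like these. An emotion tally in the style of the NRC lexicon would follow the same pattern as sentiment_tally, with one word set per emotion instead of two.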