A GUI application for sentiment analysis in Python. It runs multiple algorithms and shows the output of each one independently. The accuracy of each algorithm is printed on the CLI, so you can compare them and see which performed best. The user enters a Twitter username and selects the number of tweets to analyze, then runs the program; the result for every tweet is shown in a separate window.
- Open test.py
- Run
- When the UI appears, enter the username of the profile without the @
- Use the slider to set the number of tweets you want to fetch
- The results are then displayed.
Multinomial Naive Bayes, Random Trees Embedding, Random Forest Regressor, Random Forest Classifier, Multinomial Logistic Regression, Linear Support Vector Classifier, Linear Regression, Extra Tree Regressor, Extra Tree Classifier, Decision Tree Classifier, Binary Logistic Regression: each algorithm receives the training data and the testing data, with the features for which the sentiment has to be predicted. We then calculate the accuracy score, the confusion matrix, and the ROC (Receiver Operating Characteristic) curve and AUC (Area Under the Curve), and return a positive or negative emotion.
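As a rough illustration of the evaluation described above, the sketch below trains one of the listed algorithms (Multinomial Naive Bayes) and computes the three metrics with sklearn. The toy sentences, labels, and variable names are invented for this example and are not taken from the project's code or data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.naive_bayes import MultinomialNB

# Toy data made up for illustration; 1 = positive, 0 = negative.
train_texts = ["good great happy", "love great", "bad sad awful", "hate awful"]
train_labels = [1, 1, 0, 0]
test_texts = ["happy love", "sad hate", "great good", "awful bad"]
test_labels = [1, 0, 1, 0]

# Turn the sentences into word-count features.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

# Train the classifier on the training data.
model = MultinomialNB()
model.fit(X_train, train_labels)

# Predicted labels, plus the probability of the positive class for ROC/AUC.
predictions = model.predict(X_test)
positive_scores = model.predict_proba(X_test)[:, 1]

accuracy = accuracy_score(test_labels, predictions)
matrix = confusion_matrix(test_labels, predictions)
auc = roc_auc_score(test_labels, positive_scores)

print("accuracy:", accuracy)
print("confusion matrix:", matrix.tolist())
print("AUC:", auc)
```

The same three metric calls should work for the other classifiers in the list, although models without `predict_proba` (for example Linear Support Vector Classifier) would need `decision_function` scores for the AUC instead.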
Get credentials from the Twitter Developer Portal.
- Python
- Install Tweepy for fetching tweets
- Install pandas for data analysis
- Install sklearn for machine learning algorithms
Preprocess.py: Contains the preprocessing function, which performs the following steps:
- Takes the tweet as input.
- Removes URLs using a regular expression.
- Removes emoticons using a regular expression.
- Removes usernames using a regular expression.
- Removes digits using a regular expression.
- Reduces runs of more than 2 repeated letters to 2 letters.
- Removes symbols.
- Removes extra white space.
- Returns the preprocessed tweet.
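The steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of Preprocess.py: the function name `preprocess` and every regular expression here are assumptions, and the project's actual patterns may differ.

```python
import re

# Hypothetical patterns for illustration only.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Simple text emoticons such as :) ;-( =D, standing alone between spaces.
EMOTICON_RE = re.compile(r"(?<!\S)[:;=]-?[)(DPp/\\|\]\[](?!\S)")
USERNAME_RE = re.compile(r"@\w+")
DIGIT_RE = re.compile(r"\d+")
# Any character repeated three or more times in a row.
REPEAT_RE = re.compile(r"(.)\1{2,}")
# Anything that is not a letter, digit, or white space (underscore included).
SYMBOL_RE = re.compile(r"[^\w\s]|_")
SPACE_RE = re.compile(r"\s+")


def preprocess(tweet: str) -> str:
    """Clean one tweet by applying the steps in the order listed above."""
    text = URL_RE.sub(" ", tweet)        # remove URLs
    text = EMOTICON_RE.sub(" ", text)    # remove emoticons
    text = USERNAME_RE.sub(" ", text)    # remove @usernames
    text = DIGIT_RE.sub(" ", text)       # remove digits
    text = REPEAT_RE.sub(r"\1\1", text)  # "loooove" becomes "loove"
    text = SYMBOL_RE.sub(" ", text)      # remove symbols
    return SPACE_RE.sub(" ", text).strip()  # collapse extra white space


print(preprocess("@bob I loooove this!!! :) http://t.co/abc 123"))
```

URLs are removed first so that the characters inside them (such as `:/`) are not mistaken for emoticons or symbols by the later steps.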
twitter_credentials.py: In this file, we store our access token, access token secret, consumer key, and consumer secret.
- The TwitterAuthenticator class inherits the OAuthHandler class and passes in the credentials to allow access to Twitter’s API features.
- The TwitterClient class contains all the methods for interacting with the Twitter API and parsing tweets. Its init function handles the authentication of the API client.
- Create an object of class TwitterClient and use it to get the Twitter client API with the get_twitter_client_api() function.
- Create a window using Tkinter and let the user input the hashtag.
- Use the API to search for tweets with the entered hashtag and store them.
- Extract the labels and sentences: store the labels in y and, after preprocessing, the tweets in x.
- Use CountVectorizer to lowercase the text and tokenize it (convert raw text into smaller units of text) at word level (each word is treated as a separate token), ignoring single characters during tokenization.
- Iterate over the tweets one by one, preprocessing and transforming each tweet and predicting its sentiment.
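The vectorization step can be illustrated with a small sketch, assuming sklearn's `CountVectorizer`. The parameters are written out explicitly but match the library defaults: lowercasing, word-level analysis, and a token pattern that keeps only tokens of two or more characters. The sample sentences and variable names are invented for this example.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Explicit versions of the defaults: lowercase the text, split it into
# word-level tokens, and keep only tokens with at least two characters.
vectorizer = CountVectorizer(
    lowercase=True,
    analyzer="word",
    token_pattern=r"(?u)\b\w\w+\b",
)

# The analyzer shows how a single tweet is lowercased and tokenized.
tokenize = vectorizer.build_analyzer()
tokens = tokenize("I LOVE a sunny day")
print(tokens)

# Learn the vocabulary from (already preprocessed) training sentences.
vectorizer.fit(["good day", "bad day"])

# Transform the fetched tweets one by one into count vectors.
rows = []
for tweet in ["Good DAY", "a bad one"]:
    features = vectorizer.transform([tweet])
    rows.append(features.toarray()[0].tolist())
print(rows)
```

Each row produced in the loop is the feature vector that would be handed to a trained model's `predict` method to obtain the sentiment of that tweet. Words that were not seen during fitting, such as "one" here, are simply ignored.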
AllImport.py: Contains all the imported modules in one place so that we don't have to import them in every file, which reduces redundancy.