Skip to content

In this repo, you can launch a remote server for Twitter sentiment analysis web application using Tweepy and Flask

Notifications You must be signed in to change notification settings

ajinChen/twitter-sentiment-analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Twitter Sentiment Analysis Web Application

Intro

The motivation of this repo is to build a remote web server on AWS which can pull twitter data using the tweepy wrapper around the twitter API and perform simple sentiment analysis using the vaderSentiment library.

This web application can display the most recent 100 tweets from a given user with red to green gradient color corresponding to the sentiment score and the list of most popular users followed by a given user.

Requirements

There are some general library requirements for twitter data mining and launching remote server. The general requirements are as follows.

  • numpy
  • tweepy
  • vaderSentiment
  • flask
  • gunicorn

Note: It is recommended to use Anaconda distribution of Python.

Usage

The whole directory structure of this project show below:

$ project_dir
├── tweetie.py
├── server.py
├── tweepy_key.csv (create from user)
├── image.ico
├── templates
│   ├── following.html
│   └── tweets.html
└── IP.txt
  • tweetie.py : a series of functions for Twitter data mining
  • server.py : Main file to create server
  • tweepy_key.csv : authentication credentials
  • image.ico : Web icon image
  • templates : Front-end templates
  • IP.txt : the public IP address of your server, such as 54.176.42.135:5000

Data Mining for tweets

In order to get authenticating from Twitter, users need to register in twitter app management and create an app. Twitter will give you the authentication credentials which are needed for making requests to the API server.

  • Consumer Key (API Key)
  • Consumer Secret (API Secret)
  • Access Token
  • Access Token Secret

To prevent having to type that every time, users should store those keys in a CSV file under the folder and users can safety run the server like the partial code below:

consumer_key consumer_secret access_token access_token_secret
def authenticate(twitter_auth_filename):
    """
    Given a file name containing the Twitter keys and tokens,
    create and return a tweepy API object.
    """
    keys = loadkeys(twitter_auth_filename)
    auth = tweepy.OAuthHandler(keys[0], keys[1])
    auth.set_access_token(keys[2], keys[3])
    api = tweepy.API(auth, wait_on_rate_limit=True)
    return api

Function fetch_tweets() will fetch a list of tweets for a given user in dictionary format:

  • user: user's screen name
  • count: number of tweets
  • tweets: list of tweets

where each tweet is a dictionary containing:

  • id: tweet ID
  • created: tweet creation date
  • retweeted: number of retweets
  • text: text of the tweet (use twitter tweet field text not full_text please)
  • hashtags: list of hashtags mentioned in the tweet
  • urls: list of URLs mentioned in the tweet
  • mentions: list of screen names mentioned in the tweet
  • score: the "compound" polarity score from vader's polarity_scores()
def fetch_tweets(api, name):
    ...
    user_dict = {}
    user_obj = api.get_user(screen_name=name)
    count = 100
    tweets_obj = tweepy.Cursor(api.user_timeline, id=name).items(count)
    for tweet in tweets_obj:
        tweet_dict = {}
        tweet_dict['id'] = tweet.id
        tweet_dict['retweeted'] = tweet.retweet_count
        ...
    return user_dict

Functionfetch_following() will fetch a list of followers' info for a given user in dictionary format:

  • name: user's real name
  • screen_name: Twitter screen name
  • followers: number of followers
  • created: created date (no time info)
  • image: the URL of the profile's image
def fetch_following(api,name):
    follusr = []
    count = 100
    friends_list = tweepy.Cursor(api.get_friends, id=name).items(count)
    for friend in friends_list:
        f_dict = {}
        f_dict['name'] = friend.name
        f_dict['screen_name'] = friend.screen_name
        f_dict['followers'] = friend.followers_count
        f_dict['created'] = friend.created_at.date()
        f_dict['image'] = friend.profile_image_url_https
        follusr.append(f_dict)
    return follusr

Generating Front-end pages

jinja2 is a template engine that built-in with Flask. Every time you call render_template() from a flask route method, it looks in the templates subdirectory for the file indicated in that function call. There are two different page templates writing in HTML for web front-end.

Here is what the main body of tweet's HTML looks like:

<body>
<h1 style="font-size:130%; font-family:Verdana, sans-serif;">Tweets for {{results['user']}}</h1>
<p style="font-size:80%; font-family:Verdana, sans-serif;">Median sentiment score: {{results['medi_score']}} (-1.0 negative to 1.0 positive)
<ul>
    {% for element in results['tweets'] %}
        <li style="list-style:square; font-size:70%; font-family:Verdana, sans-serif; color:{{element['color']}}">
            {{element['score']}}: <a style="color:{{element['color']}}" href="https://twitter.com/{{results['user']}}/status/{{element['id']}}">{{element['text']}}</a>
        </li>
    {% endfor %}
</ul>
</body>

Launching AWS web server

Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure, resizable compute capacity in the cloud. Launching a Linux instance at AWS EC2 as web server which can keep it running in the cloud waiting for responses from other users.

Select image "Deep Learning AMI (Ubuntu 18.04) Version 49.0" and create a t2.medium size computer.

Login your EC2 instance from local terminal:

$ ssh -i "ajin_server.pem" ubuntu@somemachineIPorname

After connect to remote server, install necessary packages:

source activate python3
pip install --upgrade pip
pip install numpy Flask vaderSentiment tweepy
conda install gunicorn

Clone the repository into the home directory:

cd ~
git clone https://github.com/ajinChen/twitter-sentiment-analysis.git
cd twitter-sentiment-analysis

Run your web application on EC2:

$ gunicorn -D --threads 1 -b 0.0.0.0:5000 --access-logfile server.log --timeout 60 server:app tweepy_key.csv

All output goes into server.log, even after you log out. The -D means put the server in daemon mode, which runs the background.

Congregation! You successfully launch the web server for Twitter sentiment analysis.

About

In this repo, you can launch a remote server for Twitter sentiment analysis web application using Tweepy and Flask

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published