Mini project to create automated Twitter posts of a wordcloud using Apache Airflow

matthewfarant/Analisis-Netizen-Indonesia

Analisis Netizen Indonesia

Background

This is a mini project to create an automated Twitter post using Apache Airflow 🕘
The objective of this project is to observe daily changes in public (netizen) opinion on a given topic. For now, the daily visualization posted is a wordcloud representing what Indonesian netizens are saying about the Covid-19 pandemic.

Data Pipeline

Every day at 9 AM Western Indonesia Time (WIB), Airflow runs a DAG (Directed Acyclic Graph), a set of tasks with defined dependencies, that covers data scraping, data cleaning, wordcloud generation, and Twitter posting. The tweets are scraped via the standard Twitter API, then cleaned (stopwords removed using NLTK data; mentions, links, hashtags, etc. stripped) and visualized as a wordcloud. The DAG runs automatically every day, so a new wordcloud post appears on my Twitter account every day at 9 AM.
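The cleaning step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `clean_tweet` and `word_frequencies`, the regular expressions, and the tiny hard-coded `STOPWORDS` set are all assumptions made for this example (the project itself uses NLTK stopword data).

```python
import re
from collections import Counter

# Hypothetical, tiny stopword set for illustration only; the project
# itself uses NLTK stopword data. "rt" is included to drop retweet markers.
STOPWORDS = {"rt", "yang", "dan", "di", "ini", "itu"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")   # links
MENTION_HASHTAG_RE = re.compile(r"[@#]\w+")     # @mentions and #hashtags
NON_LETTER_RE = re.compile(r"[^a-z\s]")         # digits, punctuation, emoji


def clean_tweet(text, stopwords=STOPWORDS):
    """Lowercase a tweet, strip links, mentions, hashtags, and non-letters,
    then return the remaining words that are not stopwords."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_HASHTAG_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    return [word for word in text.split() if word not in stopwords]


def word_frequencies(tweets):
    """Count cleaned words across a list of tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(clean_tweet(tweet))
    return counts


if __name__ == "__main__":
    sample = ["RT @user: Vaksin covid-19 ini aman! https://t.co/abc #vaksin"]
    print(word_frequencies(sample))
```

A frequency mapping like this could then be handed to the wordcloud library, for example through `WordCloud.generate_from_frequencies`, although whether the project builds its image from frequencies or from raw text is not stated here.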

Tools that I used

Operating System:

Linux (Ubuntu WSL) 🐧

Software:

Python 3.8.10 🐍
Venv
Apache Airflow 2.1.2
VSCode

Libraries:

Airflow
Datetime
Numpy
Pandas
Matplotlib
Tweepy
Re
NLTK
Wordcloud

In Progress... 👷

I'm currently working on sentiment analysis of the tweets and exploring other types of visualizations that I can use. So, stay tuned!
