The program was implemented using Python, the Twitter API, Kafka, MongoDB, and Tableau. Refer to the report for further implementation details:
View Report
Overview:
- The Twitter API supplies the tweets to be processed
- Kafka ingests the tweet data and passes it between the other components of the pipeline
- MongoDB stores the obtained tweets for later analysis
- Tableau creates meaningful visualizations
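
The flow from Kafka to MongoDB described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code (see the report for that). It assumes each Kafka message value is one JSON-encoded tweet in the classic Twitter API shape (`id_str`, `text`, `entities.hashtags`, `place.country_code`). The names `tweet_to_document`, `run_pipeline`, and `ListSink` are hypothetical. The sketch uses only the standard library so it runs without a broker or database.

```python
import json
from typing import Any, Iterable, Optional


def tweet_to_document(raw: bytes) -> Optional[dict]:
    """Parse one JSON-encoded tweet and keep only the fields used later."""
    try:
        tweet = json.loads(raw)
    except ValueError:
        return None  # skip malformed messages instead of stopping the stream
    if not isinstance(tweet, dict) or "id_str" not in tweet:
        return None  # not a tweet object (for example, a stream notice)
    entities = tweet.get("entities") or {}
    place = tweet.get("place") or {}
    return {
        "_id": tweet["id_str"],  # tweet ID doubles as the document key
        "text": tweet.get("text", ""),
        "hashtags": [h["text"].lower() for h in entities.get("hashtags", [])],
        "country_code": place.get("country_code"),  # None if not geotagged
    }


def run_pipeline(messages: Iterable[bytes], sink: Any) -> int:
    """Store every parseable tweet in `sink` and return how many were stored.

    `messages` is any iterable of raw message values; `sink` is any object
    with an `insert_one(doc)` method.
    """
    stored = 0
    for raw in messages:
        doc = tweet_to_document(raw)
        if doc is not None:
            sink.insert_one(doc)
            stored += 1
    return stored


class ListSink:
    """In-memory stand-in for a MongoDB collection, used only for this demo."""

    def __init__(self) -> None:
        self.docs = []

    def insert_one(self, doc: dict) -> None:
        self.docs.append(doc)


# Demo with one valid tweet and one malformed message.
sample = [
    json.dumps({
        "id_str": "1",
        "text": "Stay safe #COVID",
        "entities": {"hashtags": [{"text": "COVID"}]},
        "place": {"country_code": "US"},
    }).encode(),
    b"not json",
]
sink = ListSink()
stored = run_pipeline(sample, sink)
print(stored, sink.docs)
```

In a real deployment, `messages` could be the `.value` of each record yielded by a Kafka consumer, and `sink` could be a `pymongo` collection, which also exposes `insert_one`. Keeping the parsing step separate from both makes it testable on its own.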
Examining the visualizations, we see a relative concentration of tweets containing the COVID hashtag in the Americas, Europe, and Southern Asia. This lines up with expectations, since these areas have both high Twitter adoption and many COVID-19 cases. Further work is needed to validate this conclusion, though.
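
As a starting point for that validation, the per-region counts behind the map can be reproduced outside Tableau. The sketch below is an assumption-laden illustration: it supposes each stored document has a `hashtags` list (lowercased) and a `country_code` field, neither of which is confirmed by this summary. The function name `tweets_per_country` is hypothetical, and the sample documents are made up.

```python
from collections import Counter
from typing import Iterable


def tweets_per_country(docs: Iterable[dict], hashtag: str = "covid") -> Counter:
    """Count documents carrying `hashtag`, grouped by country code.

    Documents without a country code are skipped, since they cannot be
    placed on a map.
    """
    counts = Counter()
    for doc in docs:
        if hashtag in doc.get("hashtags", []) and doc.get("country_code"):
            counts[doc["country_code"]] += 1
    return counts


# Hypothetical sample documents, for illustration only.
docs = [
    {"hashtags": ["covid"], "country_code": "US"},
    {"hashtags": ["covid", "health"], "country_code": "US"},
    {"hashtags": ["covid"], "country_code": "IN"},
    {"hashtags": ["sports"], "country_code": "BR"},  # wrong hashtag
    {"hashtags": ["covid"], "country_code": None},   # not geotagged
]
counts = tweets_per_country(docs)
print(counts.most_common())
```

Raw counts like these reflect how many tweets a region produces overall, so comparing them against regional Twitter usage and case numbers would be one way to test the interpretation above.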