The goal of this project is to develop a dockerized data pipeline with the following steps:
- Collecting tweets with a Python script
- Storing tweets in a MongoDB database
- ETL job: extracting the tweets from MongoDB, performing a sentiment analysis on them, and storing the results in a second database (Postgres)
- Loading the tweets and their sentiment scores into the Postgres database
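
The transform part of the ETL job can be sketched roughly as follows. This is only an illustration under assumptions: the `text` field name on the tweet documents, the `toy_sentiment` scorer, and the `transform` helper are hypothetical and not taken from the project. The word-list scorer is a stand-in for whatever sentiment library the pipeline actually uses, and the MongoDB read and Postgres write are left out so the snippet runs on its own.

```python
from typing import Callable, Iterable, Iterator, Tuple

# Toy word lists, purely illustrative; a real job would call a sentiment library.
POSITIVE = {"good", "great", "love", "happy"}
NEGATIVE = {"bad", "awful", "hate", "sad"}


def toy_sentiment(text: str) -> float:
    """Return a score in [-1, 1]: (positive - negative) / matched words."""
    words = [w.strip(".,!?#@").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def transform(
    docs: Iterable[dict],
    score: Callable[[str], float] = toy_sentiment,
) -> Iterator[Tuple[str, float]]:
    """Turn tweet documents into (text, sentiment) rows ready for loading."""
    for doc in docs:
        text = doc.get("text")  # assumed field name on the stored tweet
        if not text:
            continue  # skip documents without tweet text
        yield (text, score(text))


if __name__ == "__main__":
    # Stand-in for documents extracted from MongoDB.
    sample = [
        {"text": "I love this great pipeline!"},
        {"text": "Docker networking is awful"},
        {"user": "no text here"},
    ]
    for row in transform(sample):
        print(row)
```

Keeping the scorer as a parameter means the same `transform` function could be reused with a real sentiment model, and the resulting tuples map directly onto a two-column Postgres table (tweet text, sentiment score).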
The pipeline should look like this in Docker Desktop:
This is what the Postgres DB with the tweets and corresponding sentiment score could look like:
- Finish the Slack bot and add it to the project description