- OLTP Database: MySQL
- NoSQL Database: MongoDB
- Production Data Warehouse: MySQL
- Staging Data Warehouse: MySQL
- Business Intelligence Dashboard: Power BI
- Data Pipelines: Apache Airflow
The platform’s functionality is supported by two primary databases:
- MySQL: Stores all transactional data such as inventory and sales.
- MongoDB: Houses all product catalog data.
Data from MySQL and MongoDB is regularly extracted and transferred to the staging data warehouse on MySQL. The production data warehouse resides on MySQL, where the data is prepared for analysis.
The BI team connects to the MySQL production warehouse to create operational dashboards.
Data movement between OLTP, NoSQL, and the data warehouse is managed by ETL pipelines running on Apache Airflow.
Make sure the following are installed on your local machine:
- Docker
- Docker Compose
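The prerequisites can be checked from a terminal before continuing. The following is a minimal sketch: `require_cmd` is a hypothetical helper name, not part of the repo, and the check assumes the `docker-compose` binary that the commands in this README use.

```shell
# Report whether a command is available on PATH.
# Prints "found: <name>" on success; prints to stderr and returns 1 otherwise.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Check both tools and keep going so that every missing one is reported.
for tool in docker docker-compose; do
  require_cmd "$tool" || echo "Install $tool before continuing." >&2
done
```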
1. Clone the repo.
`git clone https://github.com/NicolasGonzalezGuignet/data_engineering_e-commerce`
2. Run the following in the directory that contains the Docker Compose file:
`docker-compose up airflow-init` (initializes critical services first to prevent startup errors)
`docker-compose up`
3. Open the Airflow web UI.
Navigate to `http://localhost:8000/` in your browser.
Activate and trigger the DAGs.
4. Access the MongoDB database UI.
Navigate to `http://localhost:8081/` in your browser.
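5. In a local terminal, run
`docker exec -it <mysql-container-name> bash` (opens a shell inside the container that runs the MySQL image)
then run
`mysql -u root -p example` (opens the MySQL client so you can interact with the database; because of the space after `-p`, the client prompts for the root password and treats `example` as the database to use. To pass a password on the command line instead, write it directly after `-p` with no space.)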
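If the name to substitute for `<mysql-container-name>` is not known, it can be looked up from the list of running containers. This is a minimal sketch, assuming the container runs an image whose name contains `mysql`; `pick_mysql` is a hypothetical helper, not part of the repo.

```shell
# Read "name image" pairs on stdin and print the names of containers
# whose image name contains "mysql" (case-insensitive).
pick_mysql() {
  awk 'tolower($2) ~ /mysql/ { print $1 }'
}

# Example call (requires a running Docker daemon):
#   docker ps --format '{{.Names}} {{.Image}}' | pick_mysql
```

The printed name can then be used in place of `<mysql-container-name>` in the `docker exec` command.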