This repo contains the materials for the PyCon tutorial on data pipelines: from script soups to Airflow.
The tutorial covers:
- Setting up local databases
- Creating basic ETL pipelines in Python: querying APIs, loading data into databases, cleaning and filtering data, and persisting the consumption-ready data
- Setting up a local instance of Airflow and getting it running
- Creating basic DAGs in Airflow
- Transforming script-soup ETLs into Airflow DAGs
- Setting up an Airflow instance in Azure
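
To give a feel for the kind of pipeline listed above, here is a minimal sketch of an extract, transform, and load script. It is not taken from the tutorial materials: the function names, the `weather` table, and the sample records are hypothetical, the API call is replaced by hard-coded data, and an in-memory SQLite database stands in for whichever database you set up locally.

```python
import sqlite3


def extract():
    """Stand-in for an API query; returns hard-coded, hypothetical records."""
    return [
        {"city": "Cleveland", "temp_c": 21.5},
        {"city": "Cleveland", "temp_c": None},  # incomplete record
        {"city": "Portland", "temp_c": 18.0},
    ]


def transform(records):
    """Clean the data by dropping records with a missing temperature."""
    return [r for r in records if r["temp_c"] is not None]


def load(records, conn):
    """Persist the cleaned records to the (hypothetical) weather table."""
    conn.execute("CREATE TABLE IF NOT EXISTS weather (city TEXT, temp_c REAL)")
    conn.executemany(
        "INSERT INTO weather (city, temp_c) VALUES (:city, :temp_c)", records
    )
    conn.commit()


def run_pipeline(conn):
    """Run extract -> transform -> load and return the number of stored rows."""
    load(transform(extract()), conn)
    return conn.execute("SELECT COUNT(*) FROM weather").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(run_pipeline(conn))
    conn.close()
```

Keeping each step in its own function is a design choice that should make it easier to turn a script like this into separate tasks once it is moved into an Airflow DAG.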
To add:
- Setting up a Kubernetes-powered instance on Azure AKS
- Adding CI/CD using Azure Pipelines
If you are interested in following along, visit: https://airflow-tutorial.readthedocs.io/en/latest/
The setup instructions can be found at: https://airflow-tutorial.readthedocs.io/en/latest/setup.html
If you would like to experiment with Azure, follow this link to get a free trial subscription with 150 dollars in credit.
PRs and issues are welcome.
This repo is licensed under CC-BY, so you are free to use, remix, and share it as long as attribution is given to the original author.