waipawama

A Python data pipeline project that builds on pydantic. Right now I have an overview, and I will double down on the benefits of this approach. For example, we can easily parameterize data tests with pydantic objects.
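As a minimal sketch of what parameterizing data tests with pydantic objects could look like: the `RangeCheck` model, the `failing_rows` helper, and the sample rows below are hypothetical names made up for illustration, not part of this project. Each pydantic object holds the parameters for one run of the same test.

```python
from typing import Dict, List

from pydantic import BaseModel


class RangeCheck(BaseModel):
    """Hypothetical test parameters: a column and the range its values must stay in."""

    column: str
    minimum: float
    maximum: float


def failing_rows(check: RangeCheck, rows: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Return the rows whose value for check.column lies outside the allowed range."""
    return [
        row for row in rows
        if not (check.minimum <= row[check.column] <= check.maximum)
    ]


# Made-up sample data standing in for a loaded table.
ROWS = [
    {"amount": 10.0, "tax_rate": 0.19},
    {"amount": 250.0, "tax_rate": 0.07},
]

# Each pydantic object is one parameter set for the same data test.
CHECKS = [
    RangeCheck(column="amount", minimum=0, maximum=1000),
    RangeCheck(column="tax_rate", minimum=0, maximum=1),
]

for check in CHECKS:
    assert failing_rows(check, ROWS) == [], f"{check.column} out of range"
```

A list like `CHECKS` could also be handed to `pytest.mark.parametrize`, so that every check shows up as its own test case.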

I have some new ideas about how I want to evolve my data pipelines. We will take the chance to use some modern Python, inspired by other great open source projects such as FastAPI.

How to get started

In the process I will learn a lot. Right now the big picture looks like this:

  • Build Pydantic Data Models

  • Use them for testing your data pipeline

  • Use Pydantic for the configs

  • Build the data pipeline; I want to try out Apache Airflow this time

  • Use BigQuery as the data dump

  • Load data with minimal changes and, above all, don't damage the data

  • Use dbt for further transformations and reports
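The first three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Transaction`, `PipelineConfig`, `split_valid`, and all field names are hypothetical and not taken from this project. The idea is that one pydantic model describes a source record, another describes the config, and invalid records are set aside untouched instead of being repaired.

```python
from datetime import date
from typing import Any, Dict, List, Tuple

from pydantic import BaseModel, ValidationError


class Transaction(BaseModel):
    """Hypothetical data model for one source record."""

    booking_date: date
    amount: float
    description: str


class PipelineConfig(BaseModel):
    """Hypothetical config model: where validated rows should land."""

    dataset: str
    table: str
    fail_on_invalid: bool = False


def split_valid(
    raw_rows: List[Dict[str, Any]],
) -> Tuple[List[Transaction], List[Dict[str, Any]]]:
    """Validate raw records; keep rejected ones exactly as they arrived."""
    valid: List[Transaction] = []
    rejected: List[Dict[str, Any]] = []
    for raw in raw_rows:
        try:
            valid.append(Transaction(**raw))
        except ValidationError:
            rejected.append(raw)  # not modified, so nothing gets damaged
    return valid, rejected


# Made-up records: the second one has a non-numeric amount.
RAW_ROWS = [
    {"booking_date": "2020-01-31", "amount": 12.5, "description": "coffee"},
    {"booking_date": "2020-02-01", "amount": "n/a", "description": "broken"},
]

config = PipelineConfig(dataset="raw", table="transactions")
valid, rejected = split_valid(RAW_ROWS)
```

Keeping the rejected records in their original form fits the goal of loading data with minimal changes: later steps can decide what to do with them.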

Helpful resources