A Python data pipeline project that builds on pydantic.

elcolumbio/waipawama

waipawama

A Python data pipeline project that builds on pydantic. I now have an overview of the approach and will double down on its benefits. For example, we can easily parameterize data tests with pydantic objects.
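As a rough illustration of what parameterizing data tests with pydantic objects could look like, here is a minimal sketch using pytest. The `Invoice` and `RowCase` models, their fields, and the sample rows are hypothetical and not taken from this repository.

```python
import pytest
from pydantic import BaseModel, Field, ValidationError


class Invoice(BaseModel):
    """Hypothetical row model; the field names are illustrative only."""

    invoice_id: int
    amount: float = Field(..., ge=0)  # amounts must not be negative
    currency: str


class RowCase(BaseModel):
    """One data-test case: a raw row plus the outcome we expect."""

    name: str
    row: dict
    is_valid: bool


# Each test case is itself a validated pydantic object.
CASES = [
    RowCase(
        name="clean row",
        row={"invoice_id": 1, "amount": 10.5, "currency": "EUR"},
        is_valid=True,
    ),
    RowCase(
        name="negative amount",
        row={"invoice_id": 2, "amount": -1, "currency": "EUR"},
        is_valid=False,
    ),
    RowCase(
        name="missing currency",
        row={"invoice_id": 3, "amount": 5},
        is_valid=False,
    ),
]


def row_is_valid(row: dict) -> bool:
    """Return True if the raw row passes validation against Invoice."""
    try:
        Invoice(**row)
    except ValidationError:
        return False
    return True


# pytest runs the test once per case and uses the case name as the test id.
@pytest.mark.parametrize("case", CASES, ids=lambda case: case.name)
def test_row(case: RowCase) -> None:
    assert row_is_valid(case.row) == case.is_valid
```

Because the cases are pydantic objects, a typo in a test case (for example a missing `is_valid`) fails at collection time instead of silently producing a wrong test.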

I have some new ideas for how I want to evolve my data pipelines. We will take the chance to use some modern Python, inspired by other great open source projects such as FastAPI.

How to get started

I will learn a lot in the process. Right now the big picture looks like this:

  • Build pydantic data models

  • Use them to test the data pipeline

  • Use pydantic for the configs

  • Build the data pipeline; I want to try out Apache Airflow this time

  • Use BigQuery as the data dump

  • Load data with minimal changes and, above all, don't damage the data

  • Use dbt for further transformations and reports
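The first steps of the list above could be sketched roughly as follows: one pydantic model for the pipeline config, one for the rows, and a check that passes valid rows through unchanged. All names here (`PipelineConfig`, `Payment`, `load_config`, `split_rows`, the field names and the sample values) are assumptions made for illustration. The Airflow, BigQuery and dbt steps are deliberately left out.

```python
import json
from typing import List, Tuple

from pydantic import BaseModel, Field, ValidationError


class PipelineConfig(BaseModel):
    """Hypothetical pipeline config; the fields are illustrative only."""

    source_file: str
    target_table: str
    batch_size: int = Field(500, gt=0)  # default 500, must be positive


class Payment(BaseModel):
    """Hypothetical data model for one incoming row."""

    payment_id: int
    amount: float
    currency: str


def load_config(raw_json: str) -> PipelineConfig:
    """Parse a JSON string and validate it as a PipelineConfig."""
    return PipelineConfig(**json.loads(raw_json))


def split_rows(rows: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Split rows into (valid, rejected) without modifying any row.

    The model is only used as a check. The original dicts are passed on,
    so the loaded data stays exactly as it arrived.
    """
    valid, rejected = [], []
    for row in rows:
        try:
            Payment(**row)
        except ValidationError:
            rejected.append(row)
        else:
            valid.append(row)
    return valid, rejected


if __name__ == "__main__":
    config = load_config(
        '{"source_file": "payments.csv", "target_table": "raw.payments"}'
    )
    rows = [
        {"payment_id": 1, "amount": 12.5, "currency": "EUR"},
        {"payment_id": "not-a-number", "amount": 3.0, "currency": "EUR"},
    ]
    valid, rejected = split_rows(rows)
    print(config.target_table, len(valid), len(rejected))
```

Keeping the rejected rows instead of dropping them means nothing is lost: they can be inspected or written to a separate place later.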

Helpful resources
