ETL Project: Real Estate Transactions

Architecture

(Image: architecture diagram)

Description

This project uses Airflow to build an ETL pipeline.

Data source: Taichung's actual price registration records of real estate transactions in 2023.

ETL flow

  1. Extract raw data from Taichung's open data platform.
  2. Transform the raw data:
    • Remove special transactions so that they do not distort interpretation of the data.
    • Add a new column that groups house ages into 10-year bins for easier data visualization later.
  3. Load the processed data into the database (PostgreSQL).
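
The transform step can be illustrated with a minimal sketch. This is not the project's actual code: the field names `note` and `house_age`, the rule that a non-empty `note` marks a special transaction, and the helper names `age_group` and `transform` are all assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, List


def age_group(age: float, width: int = 10) -> str:
    """Return a bin label such as '0-9' or '10-19' for a house age in years."""
    lower = int(age // width) * width
    return f"{lower}-{lower + width - 1}"


def transform(rows: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop special transactions and add an 'age_group' column.

    Assumption: a row is a special transaction when its hypothetical
    'note' field is non-empty.
    """
    result = []
    for row in rows:
        if row.get("note"):
            continue  # skip special transactions
        new_row = dict(row)  # copy so the input rows are not mutated
        new_row["age_group"] = age_group(row["house_age"])
        result.append(new_row)
    return result


if __name__ == "__main__":
    sample = [
        {"price": 1200, "house_age": 7, "note": ""},
        {"price": 300, "house_age": 25, "note": "transfer between relatives"},
        {"price": 950, "house_age": 32, "note": ""},
    ]
    for row in transform(sample):
        print(row)
```

Labeling each bin by its lower and upper bound keeps the new column readable when it is later used as a chart axis.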

How to use

$ docker compose up -d

Airflow connections for the data source and the database must be set up before the pipeline can run.

To create them, run ./airflow_conn_init.sh inside any of the Airflow Docker containers.

Screenshot

  1. DAG view (screenshot: DAG)

  2. Data loaded into the database (screenshot: db)