Skip to content

Use Case of a Dimensional Data Warehouse built with dbt and BigQuery.

Notifications You must be signed in to change notification settings

domenicoboris89/analytics-engineering

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

26 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

How to configure a dbt project and connect it to BigQuery

This guild describe all the steps to configure a dbt project and connecting it to Google BigQuery.

Alt text

Navigation

This repository contains the Use Case of a Dimensional DWH build using dbt and bigquery, it contains two folders:

  1. The folder DL_NORTHWIND contains the dataset to model on dbt

  2. the other folder is the name assigned to the dbt project, in this case NT_DWH

dbt installation

  1. Install pyenv to manage Python versions and virtual environments: https://github.com/pyenv/pyenv

  2. Create a new Python environment for the dbt and its big query adapter:

     pyenv virtualenv dbt-bigquery
    
  3. Install dbt-core and dbt-bigquery:

     pip install --upgrade pip
     pip install dbt-core
     pip install dbt-bigquery
    

    and check the version:

     dbt --version
    

    that should be the output message:
    Alt text

  4. Initialize the dbt project and assign it a valid name. In this use case, the name is "NT_DWH" as per Northwind Traders Data Warehouse:

    • run dbt init NT_DWH
    • choose bigquery as database type
    • choose oauth as authenthication method
    • add your GCP project_id
  5. Now execute dbt debug and complete the configuration of your profiles.yml with the connections to the Database

     ~/.dbt/profiles.yml
         my-bigquery-db:
             target: dev
             outputs:
                 dev:
                 type: bigquery
                 method: oauth
                 project: [GCP project id]
                 dataset: [the name of your dbt dataset] # You can also use "schema" here
                 threads: [1 or more, number of parallelism in the tasks execution]
                 <optional_config>: <value>
    
  6. Setup the gcp oauth using gcloud: https://docs.getdbt.com/docs/core/connect-data-platform/bigquery-setup#local-oauth-gcloud-setup:

     gcloud auth application-default login \
     --scopes=https://www.googleapis.com/auth/bigquery,\
     https://www.googleapis.com/auth/drive.readonly,\
     https://www.googleapis.com/auth/iam.test
    
  7. If you work with multiple warehouses you need to tell dbt which profiles.yml to use. There are several ways to do that, for more info check the official documentation: https://docs.getdbt.com/docs/core/connect-data-platform/connection-profiles.
    One possibility is to use the the --profiles-dir flag every time you execute a dbt command, followed by 'path/to/directory'.

That's how a well configured dbt project, looks like:

Alt text

About

Use Case of a Dimensional Data Warehouse built with dbt and BigQuery.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published