This guild describe all the steps to configure a dbt project and connecting it to Google BigQuery.
This repository contains the Use Case of a Dimensional DWH build using dbt
and bigquery
, it contains two folders:
-
The folder DL_NORTHWIND contains the dataset to model on dbt
-
the other folder is the name assigned to the dbt project, in this case NT_DWH
-
Install pyenv to manage Python versions and virtual environments: https://github.com/pyenv/pyenv
-
Create a new Python environment for the dbt and its big query adapter:
pyenv virtualenv dbt-bigquery
-
Install dbt-core and dbt-bigquery:
pip install --upgrade pip pip install dbt-core pip install dbt-bigquery
and check the version:
dbt --version
-
Initialize the dbt project and assign it a valid name. In this use case, the name is "NT_DWH" as per Northwind Traders Data Warehouse:
- run
dbt init NT_DWH
- choose bigquery as database type
- choose oauth as authenthication method
- add your GCP project_id
- run
-
Now execute
dbt debug
and complete the configuration of yourprofiles.yml
with the connections to the Database~/.dbt/profiles.yml my-bigquery-db: target: dev outputs: dev: type: bigquery method: oauth project: [GCP project id] dataset: [the name of your dbt dataset] # You can also use "schema" here threads: [1 or more, number of parallelism in the tasks execution] <optional_config>: <value>
-
Setup the gcp oauth using gcloud: https://docs.getdbt.com/docs/core/connect-data-platform/bigquery-setup#local-oauth-gcloud-setup:
gcloud auth application-default login \ --scopes=https://www.googleapis.com/auth/bigquery,\ https://www.googleapis.com/auth/drive.readonly,\ https://www.googleapis.com/auth/iam.test
-
If you work with multiple warehouses you need to tell dbt which
profiles.yml
to use. There are several ways to do that, for more info check the official documentation: https://docs.getdbt.com/docs/core/connect-data-platform/connection-profiles.
One possibility is to use the the--profiles-dir
flag every time you execute a dbt command, followed by 'path/to/directory'.
That's how a well configured dbt project, looks like: