You're viewing an older version of this GitHub Action. Do you want to see the latest version instead?
GitHub Action
Run checks against DBT Models
v2.2.1
This repository is intended for comparing models
in dbt
that have changed during an open PR.
Note: It only currently supports BigQuery
.
The repository has been published as a Github Action
and PyPi Package
, which means it can be leveraged in a variety of ways:
- Directly in Python via
run_dbt_table_diff
. - Directly in Terminal via
python3 -m dbt_table_diff
. - In a Github Workflow File via
Github Actions
to automatically add comments on Open PRs.
pip3 install dbt_table_diff
from dbt_table_diff import run_dbt_table_diff
run_dbt_table_diff(
project_id="ultimate-bit-359101",
keyfile_path="secrets/bq_keyfile.json",
manifest_file="target/manifest.json",
dev_prefix="dev_",
prod_prefix="prod_",
fallback_prefix="fb_",
custom_checks_path="",
ignored_schemas=[],
irregular_schemas=[],
org_name="org-not-included",
repo_name="dbt_example",
pr_id="2",
auth_token="my_github_pat",
)
python3 -m dbt_table_diff -t $GH_TOKEN -o org-not-included -r dbt_example -l 2 \
--manifest_file 'target/manifest.json' --project_id 'ultimate-bit-359101' \
--keyfile_path 'secrets/bq_keyfile.json' --dev_prefix 'dev_' --prod_prefix 'prod_' --fallback_prefix 'fb_'
Input Parameter | Description |
---|---|
GCP_TOKEN | for connecting to BQ (runs dbt compile and dbt_table_diff/sql_checks to compare tables) |
GH_TOKEN | for connecting to Github (ie. fetches modified models/*.sql in your PR, adds comment on your PR) |
PR_NUMBER | for fetching open PR from github (Pull Request ID [int]) |
GH_REPO | for fetching open PR from github (Repository Name) |
GH_ORG | for fetching open PR from github (Repository owner/organization name) |
DBT_PROFILE_FILE | the local path in your repo to your profile.yml for dbt (this is necessary for compiling manifest.json during setup process) |
dev_prefix | the prefix used when running dbt locally (Your source schema/environment for comparison) |
prod_prefix | the prefix used when running dbt remotely (Your target schema/environment for comparison) |
fallback_prefix | useful if you have an overriden macro for generate_schema_name in your dbt project, which leverages a different prefix for some schemas in prod. |
irregular_schemas | comma separated string of schemas which use fallback_prefix |
project_id | for connecting to BQ (BigQuery Project ID) |
ignored_schemas | comma separated string of schemas to ignore (skip checking during github action) |
- Fetches list of files modified in Pull Request
- by CURLing
github.api.com/repos/{organization}/{repository}/pulls/{pull_request_id}/files
- by CURLing
- Filters on
relevant_files
- which are files matching
models/*.sql
- which are files matching
- Builds
manifest.json
- By running
dbt deps; dbt compile
- By running
- Parses
manifest.json
forrelevant_models
- using manifest-attribute
original_file_path
matchingrelevant_files
- using manifest-attribute
- Runs all SQL files in
dbt_table_diff/sql_checks
- for each of the
relevant_models
, compare the two dbt targets (dev_prefix
vsprod_prefix
)
- for each of the
- Saves output to file
- in a format supported by Github comments
- Posts comment on open PR
- leveraging
py-github-helper
PyPi package
- leveraging
python3 -m dbt_table_diff --help
usage: dbt_table_diff [-h] [-o ORG_NAME] [-r REPO_NAME] [-t AUTH_TOKEN] [-l PR_ID] [--manifest_file MANIFEST_FILE] [--project_id PROJECT_ID] [--keyfile_path KEYFILE_PATH] [--ignored_schemas IGNORED_SCHEMAS]
[--irregular_schemas IRREGULAR_SCHEMAS] [--dev_prefix DEV_PREFIX] [--prod_prefix PROD_PREFIX] [--fallback_prefix FALLBACK_PREFIX] [--custom_checks_path CUSTOM_CHECKS_PATH]
optional arguments:
-h, --help show this help message and exit
-o ORG_NAME, --org_name ORG_NAME
Owner of GitHub repository.
-r REPO_NAME, --repo_name REPO_NAME
Name of the GitHub repository.
-t AUTH_TOKEN, --auth_token AUTH_TOKEN
User's GitHub Personal Access Token.
-l PR_ID, --pr_id PR_ID
The issue # of the Pull Request.
--manifest_file MANIFEST_FILE
The path to dbt's manifest file.
--project_id PROJECT_ID
The BigQuery Project ID to leverage.
--keyfile_path KEYFILE_PATH
The path to the keyfile to use during BQ calls.
--ignored_schemas IGNORED_SCHEMAS
Folders in models/ to always ignore during row/col checks.
--irregular_schemas IRREGULAR_SCHEMAS
Folders in models/ which use 'fallback_prefix' in prod.
--dev_prefix DEV_PREFIX
Prefix used by development datasets in dbt.
--prod_prefix PROD_PREFIX
Prefix used by production datasets in dbt.
--fallback_prefix FALLBACK_PREFIX
Uncommon prefix used by only some production datasets in dbt.
--custom_checks_path CUSTOM_CHECKS_PATH
A local folder containing any custom SQL to run.