Is it reasonable to call generate_schema_name() directly in models to fetch the final, suffixed target dataset name, rather than using target.dataset?
Context
I'm looking to override generate_schema_name within my project to better support some integration tests that need a dynamic dataset name/suffix. Originally we implemented this by modifying the schema field of an "integration_test" profile in profiles.yml at build time. However, this only allows for a single testing dataset, and we want to allow independent parallel runs. We could dynamically generate a profiles.yml file per run, but this feels complex and cumbersome.
Instead, I've updated generate_schema_name() to add a suffix based on a dbt var, "integration_test_suffix". However, I'm now wondering what the best practice is for any manual references to the target dataset.
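For illustration, an override of this kind might look like the sketch below. It assumes the var defaults to an empty string when unset and that the suffix is joined with an underscore; neither detail is stated in the post, and the base logic simply mirrors dbt's default generate_schema_name macro.

```sql
{% macro generate_schema_name(custom_schema_name, node) -%}
    {#- Assumption: the suffix is read from the var and is empty when unset. -#}
    {%- set suffix = var('integration_test_suffix', '') | trim -%}
    {#- Same base as dbt's default macro: target.schema, plus the custom schema if set. -#}
    {%- if custom_schema_name is none -%}
        {{ target.schema }}
    {%- else -%}
        {{ target.schema }}_{{ custom_schema_name | trim }}
    {%- endif -%}
    {#- Append the per-run suffix only when one was supplied. -#}
    {%- if suffix -%}
        _{{ suffix }}
    {%- endif -%}
{%- endmacro %}
```

A run started with something like dbt run --vars '{integration_test_suffix: run42}' (the value run42 is only an example) would then build into datasets whose names end in _run42.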
For example, in some places we create UDFs via CREATE OR REPLACE FUNCTION {{ target.schema }}.some_func. I'm wondering if it would be good practice to replace this with CREATE OR REPLACE FUNCTION {{ generate_schema_name() }}.some_func (since we want to create the UDFs within the integration test dataset). That said, manually invoking generate_schema_name seems a bit odd, as I have to provide the appropriate arguments (the custom schema name and the current node).
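To make the awkwardness concrete, the manual call would presumably look something like the sketch below. It assumes the override never reads the node argument, so none can be passed for both parameters; the function signature and body are hypothetical placeholders, since the post only names some_func.

```sql
{#- Hypothetical signature and body; only the schema expression matters here. -#}
{#- Passing none for both arguments assumes the override ignores the node.   -#}
CREATE OR REPLACE FUNCTION {{ generate_schema_name(none, none) | trim }}.some_func(x INT64)
AS (x + 1);
```

The trim filter guards against stray whitespace in the macro's rendered output. If the override ever starts using the node, this call would need a real node passed in, which is part of why invoking it by hand feels fragile.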
Is there a better way to access the actual target dataset after the custom wrapper is applied, rather than the original target.dataset?