Getting target dataset post-generate_schema_name #9880

alex-statsig · 2024-04-08T22:47:59Z

Short Question

Is it reasonable to call generate_schema_name() directly in models to get the final, suffixed target dataset name, rather than using target.dataset?

Context

I'm looking to override generate_schema_name within my project to better support some integration tests that need a dynamic dataset name/suffix. Originally we implemented this by modifying the schema field of an "integration_test" profile in profiles.yml at build time. However, this only allows for a single testing dataset, and we want to allow independent parallel runs. We could dynamically generate a profiles.yml file per run, but this feels complex and cumbersome.

Instead, I've updated generate_schema_name() to add a suffix based on a dbt var, "integration_test_suffix". However, I'm now wondering what the best practice is for any manual references to the target dataset.
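For readers following along, here is a minimal sketch of what such an override might look like. It is not the actual macro from this project: it assumes the suffix arrives through the "integration_test_suffix" var (for example via --vars on the command line) and that the rest of the logic mirrors dbt's default behavior of combining target.schema with any custom schema name.

```sql
{% macro generate_schema_name(custom_schema_name, node) -%}
    {#- Assumption: an empty or missing var means "no suffix" (normal runs). -#}
    {%- set suffix = var('integration_test_suffix', '') -%}
    {#- Same base name dbt's default macro would produce. -#}
    {%- set base = target.schema if custom_schema_name is none
                   else target.schema ~ '_' ~ (custom_schema_name | trim) -%}
    {#- Append the per-run suffix only when one was supplied. -#}
    {{ base ~ ('_' ~ suffix if suffix else '') }}
{%- endmacro %}
```

With this in place, each parallel test run can pass a different suffix value and build into its own dataset without touching profiles.yml.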

For example, in some places we create UDFs via CREATE OR REPLACE FUNCTION {{ target.schema }}.some_func. I'm wondering if it would be good practice to replace this with CREATE OR REPLACE FUNCTION {{ generate_schema_name() }}.some_func (since we want to create the UDFs within the integration test dataset). In particular, manually invoking generate_schema_name seems a bit weird, as I have to provide the appropriate arguments (the custom schema name and the current node).

Is there a better way to access the actual target dataset after the custom wrapper is applied, rather than the original target.dataset?
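One possible way to avoid repeating the arguments at every call site is a small wrapper macro. The sketch below is only an illustration: resolved_schema is a hypothetical name, and passing none for the node is safe only if the project's generate_schema_name override never reads the node argument.

```sql
{% macro resolved_schema(custom_schema_name=none) -%}
    {#- Hypothetical helper: delegates to the project's generate_schema_name.
        Assumes the override ignores `node`, so none is passed for it. -#}
    {{ generate_schema_name(custom_schema_name, none) }}
{%- endmacro %}
```

A UDF statement could then read CREATE OR REPLACE FUNCTION {{ resolved_schema() }}.some_func. Inside a model (or its hooks), {{ this.schema }} may be a simpler option, since the relation for the current model already carries the schema that dbt resolved through generate_schema_name.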


Replies: 0 comments
