Bug/renamed columns #46

fivetran-catfritz · 2024-07-01T17:37:24Z

PR Overview

This PR will address the following Issue/Feature:

This PR will result in the following new package version:

Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

dbt run --full-refresh && dbt test
dbt run (if incremental models are present) && dbt test

Before marking this PR as "ready for review" the following have been applied:

The appropriate issue has been linked, tagged, and properly assigned
All necessary documentation and version upgrades have been applied
docs were regenerated (unless this PR does not include any code or yml updates)
BuildKite integration tests are passing
Detailed validation steps have been provided below

Detailed Validation

Please share any and all of your validation steps:

If you had to summarize this PR in an emoji, which would it be?

💃

fivetran-joemarkiewicz

@fivetran-catfritz great work on this so far! I really like the creative solution you have applied here to seamlessly coalesce the fields. I was able to do a first pass and just have a few comments that jumped out at me when taking a look.

Let me know if you would like to sync on any of my comments in more detail. Thanks!

fivetran-joemarkiewicz · 2024-07-01T18:32:38Z

macros/coalesce_w_renamed_col.sql

+{% macro coalesce_w_renamed_col(original_column_name, datatype=dbt.type_string()) %}
+{# This macro accommodates Fivetran connectors that keep the original Salesforce field naming conventions without underscores #}
+
+{%- set renamed_col = original_column_name.replace('_', '') -%}


Are we 100% sure this is always the behavior? Maybe we run this by the eng team members in the Height ticket to double check and confirm this is a safe assumption.

For example, can we be 100% certain that account_source will be the original name and the renamed one will be accountsource? I want to be sure there are no exceptions.

It looks like from your individual source macro updates this does seem to be the case, but just double checking.

In our data, all the columns we use follow this pattern (I'll post the column list in Height), which is also supported by what engineering shared. There are exceptions, but those columns tend to end with a double-underscore suffix, like name__c, and we don't use any columns named like that.

That being said, some columns were missing from our data, so we can't be 100% sure, and it would make sense to be able to provide an alternative to the rule. I made the default behavior remove underscores to get the renamed name, but now an exact name can be passed as an argument if necessary.

That's a great solution! This way if we need to make any adjustments in the future we have a seamless route to do so.
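
To make the fallback described above concrete, here is a minimal Python sketch of the name-resolution rule only. It is not the macro itself. The argument name `renamed_column_name` is hypothetical; the thread says an exact name can be passed but does not say what the argument is called.

```python
from typing import Optional


def resolve_renamed_col(original_column_name: str,
                        renamed_column_name: Optional[str] = None) -> str:
    """Return the alternate column name to coalesce with the original.

    `renamed_column_name` is a hypothetical override argument; when it is
    omitted, the default rule from the thread applies: strip underscores.
    """
    if renamed_column_name is not None:
        # An exact name was supplied, so the default rule is bypassed.
        return renamed_column_name
    return original_column_name.replace("_", "")


print(resolve_renamed_col("account_source"))      # accountsource
print(resolve_renamed_col("name__c", "name__c"))  # name__c
```

Passing the override only for the exceptions keeps every existing call site unchanged.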

fivetran-joemarkiewicz · 2024-07-01T18:37:08Z

models/salesforce/stg_salesforce__account.sql

-        billing_state,
-        billing_state_code,
-        billing_street,
+        {{ coalesce_w_renamed_col('account_number') }},


This is a very creative solution!

fivetran-joemarkiewicz · 2024-07-01T18:41:59Z

macros/coalesce_w_renamed_col.sql

+coalesce(cast({{ renamed_col }} as {{ datatype }}),
+    cast({{ original_column_name }} as {{ datatype }}))
+    as {{ original_column_name }}


What are your thoughts on possibly allowing some wiggle room if we don't always want the original_column_name to be the final name of the field in the model? For example, what if in the future we have a field is_partner_marketing_action_active (this is a real field that could exist) and the original field name is IsPartnerMarketingActionActive. I don't know if we would exactly want that to be the final field name. Maybe we would want to have the final name be is_pm_action_active.

I know this is not a scenario right now, but maybe something worth future proofing just in case. Thoughts?

I agree this is a good idea and shouldn't be difficult to implement!
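
One way the suggested wiggle room could look is sketched below in Python. It only models the text the macro would render. The `alias` parameter is hypothetical, and `varchar` stands in for whatever `dbt.type_string()` resolves to on a given warehouse.

```python
from typing import Optional


def render_coalesce(original_column_name: str,
                    datatype: str = "varchar",
                    alias: Optional[str] = None) -> str:
    """Build the coalesce expression, optionally with a different final name.

    `alias` is a hypothetical argument; when omitted, the output column keeps
    the original (underscored) name, matching the hunk quoted above.
    """
    renamed_col = original_column_name.replace("_", "")
    final_name = alias if alias is not None else original_column_name
    return (
        f"coalesce(cast({renamed_col} as {datatype}),\n"
        f"    cast({original_column_name} as {datatype}))\n"
        f"    as {final_name}"
    )


print(render_coalesce("is_partner_marketing_action_active",
                      alias="is_pm_action_active"))
```

Because the alias defaults to the original name, existing calls such as `coalesce_w_renamed_col('account_number')` would render exactly as before.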

fivetran-joemarkiewicz · 2024-07-01T18:46:00Z

macros/coalesce_w_renamed_col.sql

+{%- set renamed_col = original_column_name.replace('_', '') -%}
+
+coalesce(cast({{ renamed_col }} as {{ datatype }}),
+    cast({{ original_column_name }} as {{ datatype }}))


I know Snowflake provides the ability to enable case-sensitivity (e.g. LastActivityDate). I wonder if this FF does not do the same case-insensitivity approach we see with the standard connector when syncing to Snowflake. I have a small feeling that these original field names may sync like this. Do you know if this could be a problem with the approach we are taking here?

Good point, I'll look into it, since right now I'm not sure of the best way to handle that scenario. In our example data, everything is lowercased, though I do see in the engineering doc that originally it would be camel case. Let me know if you have any suggestions, but otherwise I'll think on this!
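
The thread leaves this question open. As one possible direction, and not the PR's solution, the Python sketch below matches a candidate name against the table's stored column names case-insensitively, then quotes the stored spelling when it is not all upper case. It relies on the fact that Snowflake folds unquoted identifiers to upper case, so a column stored as `LastActivityDate` has to be referenced in double quotes. The helper names and the `existing_columns` input are hypothetical, and names that need quoting for other reasons (spaces, embedded quotes) are not handled.

```python
from typing import Iterable, Optional


def find_column(candidate: str, existing_columns: Iterable[str]) -> Optional[str]:
    """Return the stored spelling of `candidate`, matched case-insensitively."""
    wanted = candidate.lower()
    for col in existing_columns:
        if col.lower() == wanted:
            return col
    return None


def quote_if_needed(stored_name: str) -> str:
    """Quote names that would not survive Snowflake's upper-case folding."""
    if stored_name == stored_name.upper():
        return stored_name
    return f'"{stored_name}"'


stored = find_column("lastactivitydate", ["Id", "LastActivityDate"])
if stored is not None:
    print(quote_if_needed(stored))  # "LastActivityDate"
```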

fivetran-catfritz

Thank you for taking a look @fivetran-joemarkiewicz!

fivetran-catfritz · 2024-07-01T19:09:09Z

macros/coalesce_w_renamed_col.sql

+coalesce(cast({{ renamed_col }} as {{ datatype }}),
+    cast({{ original_column_name }} as {{ datatype }}))
+    as {{ original_column_name }}


I agree this is a good idea and shouldn't be difficult to implement!

fivetran-catfritz · 2024-07-01T19:36:36Z

macros/coalesce_w_renamed_col.sql

+{% macro coalesce_w_renamed_col(original_column_name, datatype=dbt.type_string()) %}
+{# This macro accommodates Fivetran connectors that keep the original Salesforce field naming conventions without underscores #}
+
+{%- set renamed_col = original_column_name.replace('_', '') -%}


In our data, all the columns we use follow this pattern (I'll post the column list in Height), which is also supported by what engineering shared. There are exceptions, but those columns tend to end with a double-underscore suffix, like name__c, and we don't use any columns named like that.

That being said, some columns were missing from our data, so we can't be 100% sure, and it would make sense to be able to provide an alternative to the rule. I made the default behavior remove underscores to get the renamed name, but now an exact name can be passed as an argument if necessary.

fivetran-catfritz · 2024-07-01T19:40:34Z

macros/coalesce_w_renamed_col.sql

+{%- set renamed_col = original_column_name.replace('_', '') -%}
+
+coalesce(cast({{ renamed_col }} as {{ datatype }}),
+    cast({{ original_column_name }} as {{ datatype }}))


Good point, I'll look into it, since right now I'm not sure of the best way to handle that scenario. In our example data, everything is lowercased, though I do see in the engineering doc that originally it would be camel case. Let me know if you have any suggestions, but otherwise I'll think on this!

fivetran-catfritz added 6 commits June 28, 2024 19:50

bug/renamed-columns (20658dc)
add macro file (9b5db07)
add casting (fb850fc)
update yml (0a0317b)
add casting for all (c012fd3)
add casting for all (4a252f2)

fivetran-catfritz self-assigned this Jul 1, 2024

fivetran-joemarkiewicz self-requested a review July 1, 2024 18:31

fivetran-joemarkiewicz reviewed Jul 1, 2024


fivetran-catfritz commented Jul 1, 2024


macro updates (f47bd73)

fivetran-catfritz closed this Jul 8, 2024
