Skip to content

Non Historized Link

Tim Kirschke edited this page Oct 6, 2023 · 2 revisions

The "Datavault4Coalesce: Non-Historized Link" node creates a non-historized link entity, connecting two or more entities, or an entity with itself.

User Configuration

Config Name Config Type Explanation
Link Hashkey Column string In the Config section on the right there is a tab called 'Data Vault'. Expand this and you get a column selector list 'Link Hashkey Column'. Select the hashkey of this Link.
Source is Single Batch? Boolean In the Config section on the right there is a tab called 'Data Vault'. Expand this and you get a toggle 'Source is Single Batch?'. Activate this if your source data always only hold one load_date. By activating this toggle, a QUALIFY() statement is disabled, which will then increase the loading performance. But activate only, when you can ensure your data holds only one load date!

System Columns

No Sys Columns are added inside a Link Node.

Example

In this example the Non-Historized Link 'LINEITEM_NL' is being created. The source table is 'STG_LINEITEM'.

LINEITEM_Model_Stage_Highlighted

Create the Stage Node:

Create multiple Hash Columns, by selecting the following columns, and per selection, do 'right-click'->'Generate Hash Column'. Rename the output accordingly.

  • HK_LINEITEM_L:
    • L_ORDERKEY
    • L_PARTKEY
    • L_SUPPKEY
    • L_LINENUMBER
  • HK_ORDER_H:
    • L_ORDERKEY
  • HK_PART_H:
    • L_PARTKEY
  • HK_SUPPLIER_H:
    • L_SUPPKEY

Additionally, rename the column L_LINEITEM to DCK_LINEITEM. DCK stands for Dependant Child Key. The Lineitem number is required to uniquely identify a row in the source object LINEITEM. That's why it is also part of the Link Hashkey calculation.

This is how your Stage Node should look like (not all columns are shown):

STG_LINEITEM_cols

Create the node:

  • Within your Node STG_LINEITEM, multi-select the following columns:
    • HK_LINEITEM_L
    • HK_ORDER_H
    • HK_PART_H
    • HK_SUPPLIER_H
    • DCK_LINENUMBER
    • LDTS
    • RSRC
    • L_QUANTITY
    • L_EXTENDEDPRICE
    • L_DISCOUNT
    • L_TAX
    • L_RETURNFLAG
    • L_LINESTATUS
  • Then do 'right-click' -> 'Add Node' -> 'Datavault by Scalefree: Non-Historized Link'
  • Rename the new Node to LINEITEM_NL
  • Rearrange the columns in the new Node, so that the Link Hashkey is at the top, followed by the Hub Hashkeys, the Dependant Child Key, LDTS and RSRC:

Link_neu_example_2



  • In the Config section on the right there is a tab called 'Data Vault'
  • Expand this and you get a column selector 'Link Hashkey Column'
    • Select the hashkey of this Link, HK_LINEITEM_L.

Link_neu_example_4



  • Create and run the node


Generated SQL Code of Example

CREATE OR REPLACE TABLE
"COALESCE_WORKSHOP"."CORE"."LINEITEM_NL" (
    "HK_LINEITEM_L" STRING,
    "HK_ORDER_H" STRING,
    "HK_PART_H" STRING,
    "HK_SUPPLIER_H" STRING,
    "DCK_LINENUMBER" NUMBER (38, 0) NOT NULL,
    "LDTS" TIMESTAMP
    COMMENT 'The Load Date Timestamp (LDTS) describes when this data first arrived in the Data Warehouse.',
    "RSRC" STRING
    COMMENT 'The Record Source (RSRC) describes the source of this data.',
    "L_QUANTITY" NUMBER (12, 2) NOT NULL,
    "L_EXTENDEDPRICE" NUMBER (12, 2) NOT NULL,
    "L_DISCOUNT" NUMBER (12, 2) NOT NULL,
    "L_TAX" NUMBER (12, 2) NOT NULL,
    "L_RETURNFLAG" VARCHAR (1) NOT NULL,
    "L_LINESTATUS" VARCHAR (1) NOT NULL
)
COMMENT = 'Lineitem data as defined by TPC-H'

    INSERT INTO
"COALESCE_WORKSHOP"."CORE"."LINEITEM_NL"
WITH
incoming AS (
    SELECT DISTINCT
    "STG_LINEITEM"."HK_LINEITEM_L" AS "HK_LINEITEM_L",
    "STG_LINEITEM"."HK_ORDER_H" AS "HK_ORDER_H",
    "STG_LINEITEM"."HK_PART_H" AS "HK_PART_H",
    "STG_LINEITEM"."HK_SUPPLIER_H" AS "HK_SUPPLIER_H",
    "STG_LINEITEM"."DCK_LINENUMBER" AS "DCK_LINENUMBER",
    "STG_LINEITEM"."LDTS" AS "LDTS",
    "STG_LINEITEM"."RSRC" AS "RSRC",
    "STG_LINEITEM"."L_QUANTITY" AS "L_QUANTITY",
    "STG_LINEITEM"."L_EXTENDEDPRICE" AS "L_EXTENDEDPRICE",
    "STG_LINEITEM"."L_DISCOUNT" AS "L_DISCOUNT",
    "STG_LINEITEM"."L_TAX" AS "L_TAX",
    "STG_LINEITEM"."L_RETURNFLAG" AS "L_RETURNFLAG",
    "STG_LINEITEM"."L_LINESTATUS" AS "L_LINESTATUS"
    FROM
    "COALESCE_WORKSHOP"."CORE"."STG_LINEITEM" "STG_LINEITEM"
),
new_records AS (
    SELECT
    "SRC".*
    FROM
    incoming "SRC"
    WHERE
    NOT EXISTS (
        SELECT
        1
        FROM
        "COALESCE_WORKSHOP"."CORE"."LINEITEM_NL" "TGT"
        WHERE
        "SRC"."HK_LINEITEM_L" = "TGT"."HK_LINEITEM_L"
    )
    AND "SRC"."LDTS" > (
        SELECT
        COALESCE(
            MAX("LDTS"),
            TO_TIMESTAMP('0001-01-01T00:00:01', 'YYYY-MM-DDTHH24:MI:SS')
        )
        FROM
        "COALESCE_WORKSHOP"."CORE"."LINEITEM_NL"
        WHERE
        "LDTS" != TO_TIMESTAMP('8888-12-31T23:59:59', 'YYYY-MM-DDTHH24:MI:SS')
    ) QUALIFY ROW_NUMBER() OVER (
        PARTITION BY
        "HK_LINEITEM_L"
        ORDER BY
        "LDTS"
    ) = 1
)
SELECT
*
FROM
new_records