From 0932277536a1c3bccece84cbccd909f00c8143dd Mon Sep 17 00:00:00 2001
From: Jennifer Berk
Date: Mon, 14 Aug 2023 22:41:06 -0400
Subject: [PATCH 1/3] Fix revenue example indentation and defaults syntax on about-metricflow.md

Fixes the "revenue example" semantic model code indentation and defaults syntax on About MetricFlow. Also adds a comment at the start of the second semantic model and makes it follow the same description-model-defaults order as the first semantic model, to make it easier for a new user to understand.
---
 website/docs/docs/build/about-metricflow.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/website/docs/docs/build/about-metricflow.md b/website/docs/docs/build/about-metricflow.md
index 154e40d515d..02031bd0e55 100644
--- a/website/docs/docs/build/about-metricflow.md
+++ b/website/docs/docs/build/about-metricflow.md
@@ -147,18 +147,18 @@ semantic_models:
         type: time
         type_params:
           time_granularity: day
-  - name: customers
-    defaults: null
-    agg_time_dimension: first_ordered_at
-    description: >
-      Customer dimension table. The grain of the table is one row per
-      customer.
-    model: ref('customers') # The name of the dbt model and schema
+  - name: customers #The name of the second semantic model
+    description: >
+      Customer dimension table. The grain of the table is one row per
+      customer.
+    model: ref('customers') #The name of the dbt model and schema
+    defaults:
+      agg_time_dimension: first_ordered_at
     entities: #Entities. These usually correspond to keys in the table.
       - name: customer
         type: primary
         expr: customer_id
-    dimensions:
+    dimensions: #Dimensions,either categorical or time. These add additional context to metrics. The typical querying pattern is Metric by Dimension.
       - name: is_new_customer
         type: categorical
         expr: case when first_ordered_at is not null then true else false end

From 72ff189eadf0d352b29b37e7662122eebe97b4f6 Mon Sep 17 00:00:00 2001
From: mirnawong1 <89008547+mirnawong1@users.noreply.github.com>
Date: Tue, 15 Aug 2023 08:41:59 -0400
Subject: [PATCH 2/3] Update website/docs/docs/build/about-metricflow.md

---
 website/docs/docs/build/about-metricflow.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/docs/docs/build/about-metricflow.md b/website/docs/docs/build/about-metricflow.md
index 02031bd0e55..ede185a5e39 100644
--- a/website/docs/docs/build/about-metricflow.md
+++ b/website/docs/docs/build/about-metricflow.md
@@ -158,7 +158,7 @@ semantic_models:
       - name: customer
         type: primary
         expr: customer_id
-    dimensions: #Dimensions,either categorical or time. These add additional context to metrics. The typical querying pattern is Metric by Dimension.
+    dimensions: # Dimensions are either categorical or time. These add additional context to metrics. The typical querying pattern is Metric by Dimension.
       - name: is_new_customer
         type: categorical
         expr: case when first_ordered_at is not null then true else false end

From f5fc3e4e472e06e7c31791043dcb38fc1b854369 Mon Sep 17 00:00:00 2001
From: mirnawong1 <89008547+mirnawong1@users.noreply.github.com>
Date: Tue, 15 Aug 2023 08:47:13 -0400
Subject: [PATCH 3/3] Update about-metricflow.md

fix hashes
---
 website/docs/docs/build/about-metricflow.md | 32 ++++++++++-----------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/website/docs/docs/build/about-metricflow.md b/website/docs/docs/build/about-metricflow.md
index ede185a5e39..b1a22b9072c 100644
--- a/website/docs/docs/build/about-metricflow.md
+++ b/website/docs/docs/build/about-metricflow.md
@@ -70,7 +70,7 @@ MetricFlow supports different metric types:
 In the upcoming sections, we'll show how data practitioners currently calculate metrics and compare it to how MetricFlow makes defining metrics easier and more flexible.
-The following example data is based off the Jaffle Shop repo. You can view the complete [dbt project](https://github.com/dbt-labs/jaffle-sl-template). The tables we're using in our example model are:
+The following example data is based on the Jaffle Shop repo. You can view the complete [dbt project](https://github.com/dbt-labs/jaffle-sl-template). The tables we're using in our example model are:
 - `orders` is a production data platform export that has been cleaned up and organized for analytical consumption
 - `customers` is a partially denormalized table in this case with a column derived from the orders table through some upstream process
@@ -91,7 +91,7 @@ Next, we'll compare how data practitioners currently calculate metrics with mult
-The following example displays how data practitioners typically would calculate the order_total metric aggregated. It's also likely that analysts are asked for more details on a metric, like how much revenue came from new customers.
+The following example displays how data practitioners typically would calculate the `order_total` metric aggregated. It's also likely that analysts are asked for more details on a metric, like how much revenue came from new customers.
 Using the following query creates a situation where multiple analysts working on the same data, each using their own query method — this can lead to confusion, inconsistencies, and a headache for data management.
@@ -121,44 +121,44 @@ In the following three example tabs, use MetricFlow to define a semantic model t
 In this example, a measure named `order_total` is defined based on the order_total column in the `orders` table.
-The time dimension `metric_time` provides daily granularity and can be aggregated to weekly or monthly time periods. Additionally, a categorical dimension called `is_new_customer` is specified in the `customers` semantic model.
+The time dimension `metric_time` provides daily granularity and can be aggregated into weekly or monthly time periods. Additionally, a categorical dimension called `is_new_customer` is specified in the `customers` semantic model.
 ```yaml
 semantic_models:
-  - name: orders #The name of the semantic model
+  - name: orders # The name of the semantic model
     description: |
-      Model containing order data. The grain of the table is the order id.
+      A model containing order data. The grain of the table is the order id.
     model: ref('orders') #The name of the dbt model and schema
     defaults:
       agg_time_dimension: metric_time
-    entities: #Entities. These usually correspond to keys in the table.table.
+    entities: # Entities, which usually correspond to keys in the table.
       - name: order_id
         type: primary
       - name: customer
         type: foreign
         expr: customer_id
-    measures: #Measures. These are the aggregations on the columns in the table.
+    measures: # Measures, which are the aggregations on the columns in the table.
       - name: order_total
         agg: sum
-    dimensions: #Dimensions,either categorical or time. These add additional context to metrics. The typical querying pattern is Metric by Dimension.
+    dimensions: # Dimensions are either categorical or time. They add additional context to metrics and the typical querying pattern is Metric by Dimension.
       - name: metric_time
         expr: cast(ordered_at as date)
         type: time
         type_params:
           time_granularity: day
-  - name: customers #The name of the second semantic model
+  - name: customers # The name of the second semantic model
     description: >
       Customer dimension table. The grain of the table is one row per
       customer.
     model: ref('customers') #The name of the dbt model and schema
     defaults:
       agg_time_dimension: first_ordered_at
-    entities: #Entities. These usually correspond to keys in the table.
+    entities: # Entities, which usually correspond to keys in the table.
       - name: customer
         type: primary
         expr: customer_id
-    dimensions: # Dimensions are either categorical or time. These add additional context to metrics. The typical querying pattern is Metric by Dimension.
+    dimensions: # Dimensions are either categorical or time. They add additional context to metrics and the typical querying pattern is Metric by Dimension.
       - name: is_new_customer
         type: categorical
         expr: case when first_ordered_at is not null then true else false end
@@ -178,20 +178,20 @@ Similarly, you could then add additional dimensions like `is_food_order` to your
 semantic_models:
   - name: orders
     description: |
-      Model containing order data. The grain of the table is the order id.
+      A model containing order data. The grain of the table is the order id.
     model: ref('orders') #The name of the dbt model and schema
     defaults:
       agg_time_dimension: metric_time
-    entities: #Entities. These usually correspond to keys in the table.table.
+    entities: # Entities, which usually correspond to keys in the table
       - name: order_id
         type: primary
       - name: customer
         type: foreign
         expr: customer_id
-    measures: #Measures. These are the aggregations on the columns in the table.
+    measures: # Measures, which are the aggregations on the columns in the table.
       - name: order_total
         agg: sum
-    dimensions: #Dimensions,either categorical or time. These add additional context to metrics. The typical querying pattern is Metric by Dimension.
+    dimensions: # Dimensions are either categorical or time. They add additional context to metrics and the typical querying pattern is Metric by Dimension.
       - name: metric_time
         expr: cast(ordered_at as date)
         type: time
@@ -265,7 +265,7 @@ metrics:
 How does the Semantic Layer handle joins?
-MetricFlow builds joins based on the types of keys and parameters that are passed to entities. To better understand how joins are constructed see our documentations on join types.
-
-Rather than capturing arbitrary join logic, MetricFlow captures the types of each identifier and then helps the user to navigate to appropriate joins. This allows us to avoid the construction of fan out and chasm joins as well as generate legible SQL.
+MetricFlow builds joins based on the types of keys and parameters that are passed to entities. To better understand how joins are constructed see our documentation on join types.
+
+Rather than capturing arbitrary join logic, MetricFlow captures the types of each identifier and then helps the user to navigate to appropriate joins. This allows us to avoid the construction of fan out and chasm joins as well as generate legible SQL.