-
Notifications
You must be signed in to change notification settings - Fork 6
First Tutorial Part 9: Shadow Realized Views
Previous | Main | Next |
---|---|---|
<-- Part 8 | Main | Part 10 --> |
Last but not least, we always have to think about performance. As you may have noted, we are joining the 2 views together and we have an index on Type for the Form table. So things should be OK. However, we realized Test_XYZ_PivotView but not Test_View nor Test_XYZ_Analytics_View.
Let's do that for completeness here... Both Test_XYZ_PivotView and Test_View are fairly fundamental datasets upon which a lot of analysis could be performed, so realizing those 2 makes sense. As for Test_XYZ_Analytics_View, it's where we have interesting knowledge now and therefore, likely to be a data source for other analytics processes (a tableau report, exporting the data to a CSV for offline analysis in an environment like SAS or R maybe...). The Tilda realize segments are:
--------------------------------------------------------------------------------------------------
-- Test_XYZ_PivotView
--------------------------------------------------------------------------------------------------
,"realize":{
"primary":{ "columns":["formRefnum"] }
,"foreign":[
{ "name":"Form" , "srcColumns":["formRefnum" ], "destObject": "Form" }
,{ "name":"User" , "srcColumns":["formUserRefnum"], "destObject": "User" }
]
,"indices":[
{ "name":"User", "columns":["formUserRefnum"], "orderBy":["formFillDate desc"] }
,{ "name":"Type", "columns":["formType" ], "orderBy":["formFillDate desc"] }
]
}
--------------------------------------------------------------------------------------------------
-- Test_View
--------------------------------------------------------------------------------------------------
,"realize":{
"primary":{ "columns":["type"] }
}
--------------------------------------------------------------------------------------------------
-- Test_XYZ_Analytics_View
--------------------------------------------------------------------------------------------------
,"realize":{
"primary":{ "columns":["formRefnum"] }
,"foreign":[
{ "name":"Form" , "srcColumns":["formRefnum" ], "destObject": "Form" }
,{ "name":"User" , "srcColumns":["formUserRefnum"], "destObject": "User" }
]
,"indices":[
{ "name":"User", "columns":["formUserRefnum"], "orderBy":["formFillDate desc"] }
,{ "name":"Type", "columns":["formType" ], "orderBy":["formFillDate desc"] }
]
}
Test_XYZ_PivotView and Test_XYZ_Analytics_View are similar in that they share the same grain (formRefnum) and similar columns (userRefnum...). As such, their realize sections are pretty much identical. As for Test_View, it's a fairly simple construct (akin to a Fact table in a data warehouse) where we accumulated some measures per form type, so the realize section is trivial.
As discussed previously, something interesting happens when you layer multiple views that are realized: a parallel hierarchy is created of equivalent views that go against their realized dependencies. Since Test_XYZ_Analytics_View is (1) realized and (2) reuses other realized views, a _R parallel view will be generated:
create or replace view TILDATMP.TILDATEST_Test_XYZ_Analytics_View_R as
select /*DoFormulasSuperView*/
"formRefnum" -- COLUMN
, "formType" -- COLUMN
, "formFillDateTZ" -- COLUMN
, "formFillDate" -- COLUMN
, "formUserRefnum" -- COLUMN
, "formUserEmail" -- COLUMN
, "formCountCorrect" -- COLUMN
, "formTimeMillisTotal" -- COLUMN
, "testCount" -- COLUMN
, "testAnswerCountCorrect" -- COLUMN
, "testTimeMillisAvg" -- COLUMN
-- Whether the test was passed or not by answering at least 2 out of the 3 questions.
, (coalesce("formCountCorrect", 0) >= 2)::integer as "isPassed"
-- Whether the test took longer that the average time spent across all tests.
, (coalesce("formTimeMillisTotal", 0) > coalesce("testTimeMillisAvg", 0))::integer as "tookLongerThanAverage"
-- Whether the test was challenging in that:<LI>
-- <LI>it was passed,</LI>
-- <LI>and overall, less than a third of answers across all tests were answered correctly.</LI>
-- </UL>
, ((coalesce("formCountCorrect", 0) >= 2)::integer=1 AND coalesce("testAnswerCountCorrect", 0) < coalesce("testCount", 0)*3/2)::integer as "wasChallenging"
from (
-- 'A view of XYZ forms with analytics'
select TILDATEST.Test_XYZ_Realized."formRefnum" as "formRefnum" -- The primary key for this record
, TILDATEST.Test_XYZ_Realized."formType" as "formType" -- Form template type
, TILDATEST.Test_XYZ_Realized."formFillDateTZ" as "formFillDateTZ" -- Generated helper column to hold the time zone ID for 'formFillDate'.
, TILDATEST.Test_XYZ_Realized."formFillDate" as "formFillDate" -- The date the form was filled
, TILDATEST.Test_XYZ_Realized."formUserRefnum" as "formUserRefnum" -- The primary key for this record
, TILDATEST.Test_XYZ_Realized."formUserEmail" as "formUserEmail" -- The user's email
, TILDATEST.Test_XYZ_Realized."countCorrect" as "formCountCorrect" -- Whether the answer is correct or not
, TILDATEST.Test_XYZ_Realized."timeMillisTotal" as "formTimeMillisTotal" -- Time in milliseconds for the time spent answering the question
, TILDATEST.Test_Realized."testCount" as "testCount" -- The primary key for this record
, TILDATEST.Test_Realized."answerCountCorrect" as "testAnswerCountCorrect" -- Whether the answer is correct or not
, TILDATEST.Test_Realized."timeMillisAvg" as "testTimeMillisAvg" -- Time in milliseconds for the time spent answering the question
from TILDATEST.Test_XYZ_Realized
inner join TILDATEST.Test_Realized on Test_Realized."type" = Test_XYZ_Realized."formType"
) as T
-- Realized as /*genRealizedColumnList*/"formRefnum" -- COLUMN ,"formType" -- COLUMN ,"formFillDateTZ" -- COLUMN ,"formFillDate" -- COLUMN ,"formUserRefnum" -- COLUMN ,"formUserEmail" -- COLUMN ,"formCountCorrect" -- COLUMN ,"formTimeMillisTotal" -- COLUMN ,"testCount" -- COLUMN ,"testAnswerCountCorrect" -- COLUMN ,"testTimeMillisAvg" -- COLUMN ,"isPassed" -- FORMULA ,"tookLongerThanAverage" -- FORMULA ,"wasChallenging" -- FORMULA
;
It is somewhat similar to Test_XYZ_Analytics_View in terms of the columns defined, but the structure is quite different:
- The view is generated in the TILDATMP schema and should never be used directly.
- Instead of joining and using the sub-views Test_XYZ_PivotView and Test_View, it uses the realized tables Test_XYZ_Realized and Test_Realized.
- A sub-query is created to wrap what would have been the original selects from the views, and joins.
- Because all the columns have been pre-computed in the realized tables, the SQL fragments for aggregates, pivoting and formulas have disappeared inside the sub-query in favor of reusing the columns from the realized tables directly.
The expectation is that performance is much improved. To Refill the lot, you'd execute in order:
select Refill_Test_XYZ_Realized();
select Refill_Test_Realized();
select TILDATEST.Refill_Test_XYZ_Analytics_Realized();
The order is important in the sense that Test_XYZ and Test must be refilled first before Refill_Test_XYZ_Analytics is refilled. The HTML documentation for each schema will output a dependency graph and an optimal refill order for whatever views you have defined.
Previous | Main | Next |
---|---|---|
<-- Part 8 | Main | Part 10 --> |