Docs: Data Quality & Observability (#16701)

open-metadata · Jun 18, 2024 · bcbc2c0
1 parent e2f845b

Showing 28 changed files with 381 additions and 362 deletions.
diff --git a/openmetadata-docs/content/partials/v1.4/releases/latest.md b/openmetadata-docs/content/partials/v1.4/releases/latest.md
@@ -1,7 +1,7 @@
 # 1.4.3 Release 🎉
 
 {% note noteType="Tip" %} 
-**June 15, 2024**
+**June 15th, 2024**
 {% /note %}
 
 {% inlineCalloutContainer %}

diff --git a/...-docs/content/v1.4.x/how-to-guides/data-observability/incident-manager/index.md b/...-docs/content/v1.4.x/how-to-guides/data-observability/incident-manager/index.md
diff --git a/openmetadata-docs/content/v1.4.x/how-to-guides/data-observability/index.md b/openmetadata-docs/content/v1.4.x/how-to-guides/data-observability/index.md
diff --git a/...rvability/incident-manager/IM-workflow.md → ...rvability/incident-manager/IM-workflow.md b/...rvability/incident-manager/IM-workflow.md → ...rvability/incident-manager/IM-workflow.md
@@ -1,9 +1,9 @@
 ---
-title: How to work with Incident Manager
-slug: /how-to-guides/data-observability/incident-manager/workflow
+title: How to work with the Incident Manager
+slug: /how-to-guides/data-quality-observability/incident-manager/workflow
 ---
 
-# How to Work with Incident Manager Workflow
+# How to Work with the Incident Manager Workflow
 
 ## 1. Incident Dashboard
 

diff --git a/...ntent/v1.4.x/how-to-guides/data-quality-observability/incident-manager/index.md b/...ntent/v1.4.x/how-to-guides/data-quality-observability/incident-manager/index.md
@@ -0,0 +1,105 @@
+---
+title: Incident Manager
+slug: /how-to-guides/data-quality-observability/incident-manager
+---
+
+# Overview of the Incident Manager
+
+The Incident Manager streamlines the management of data quality issues. By centralizing the resolution process, assigning tasks, and logging root causes, your team can quickly address and resolve failures. The historical record of past incidents serves as a comprehensive guide, helping your team troubleshoot and resolve issues more effectively. All the necessary context is readily available, making it easier to maintain high data quality standards.
+
+## Opening and Triaging Incidents
+In v1.1.0, we introduced the ability for users to manage and triage incidents linked to test failures. When a test case fails, OpenMetadata automatically opens a new incident and marks it as new, indicating that a new failure has happened. If enough information is available, OpenMetadata also assigns a severity to the incident, which you can override.
+
+{% image
+  src="/images/v1.4/features/ingestion/workflows/data-quality/resolution-workflow-new.png"
+  alt="Test suite results table"
+  caption="Test suite results table"
+ /%}
+
+The Incident Manager serves as a centralized hub to handle the resolution flow of failed Data Quality Tests. Once an incident has been opened, you can triage and manage it. You can perform different actions at this stage:
+
+- **Acknowledge the Issue:** Recognize and confirm that there is a problem that needs attention. By marking with `ack` you can inform users that people are aware of the ongoing incident.
+- **Assign Responsibility:** Designate a specific person or team to address the errors. By marking with `assign` you can open a task for the assignee.
+- **Log the Root Cause:** Document the underlying cause of the failure for future reference and analysis.
+
+{% image
+  src="/images/v1.4/features/ingestion/workflows/data-quality/resolution-workflow-ack-form.png"
+  alt="Test suite results table"
+  caption="Test suite results table"
+ /%}
+{% image
+  src="/images/v1.4/features/ingestion/workflows/data-quality/resolution-workflow-ack.png"
+  alt="Test suite results table"
+  caption="Test suite results table"
+ /%}
+
+You can mark the incident as `resolved`. You will be required to specify the reason and add a comment. This provides context about the incident and helps users understand what might have gone wrong.
+
+{% image
+  src="/images/v1.4/features/ingestion/workflows/data-quality/resolution-workflow-resolved-form.png"
+  alt="Test suite results table"
+  caption="Test suite results table"
+ /%}
+
+{% image
+  src="/images/v1.4/features/ingestion/workflows/data-quality/resolution-workflow-resolved.png"
+  alt="Test suite results table"
+  caption="Test suite results table"
+ /%}
+
+## Using the Test Resolution Flow
+
+The Test Resolution flow is a critical feature of the Incident Manager. Here’s how it works:
+
+1. **Failure Notification:** When a Data Quality Test fails, the system generates a notification.
+2. **Acknowledge the Failure:** The designated user acknowledges the issue within the Incident Manager.
+3. **Assignment:** The issue is then assigned to a knowledgeable user or team responsible for resolving it.
+4. **Status Updates:** The assigned user can update the status of the issue, keeping the organization informed about progress and any developments.
+5. **Sharing Updates:** All impacted users receive updates, ensuring everyone stays informed about the resolution process.
+
+## Incidents Context & History
+
+When you click on an open incident, you will see the following information:
+- **Open Incident:** this section shows open incidents with their timeline and any comments or collaboration that have taken place.
+- **Closed Incidents:** this section shows incidents that were resolved in the past, with their timeline, any comments or collaboration that took place, and the resolution reason.
+
+{% image
+  src="/images/v1.4/features/ingestion/workflows/data-quality/incident-management-page.png"
+  alt="Test suite results table"
+  caption="Test suite results table"
+ /%}
+
+## Building a Troubleshooting Handbook
+
+One of the powerful features of the Incident Manager is its ability to store all past failures. This historical data becomes a valuable troubleshooting handbook for your team. Here's how you can leverage it:
+
+- **Explore Similar Scenarios:** Review previous incidents to understand how similar issues were resolved.
+- **Contextual Information:** Access all necessary context directly within OpenMetadata, including previous resolutions, root causes, and responsible teams.
+- **Continuous Improvement:** Use historical data to improve data quality tests and prevent future failures.
+
+
+## Steps to Get Started
+
+1. **Access the Incident Manager:** Navigate to the Incident Manager within the OpenMetadata platform.
+2. **Monitor Data Quality Tests:** Keep an eye on your data quality tests to quickly identify any failures.
+3. **Acknowledge and Assign:** Acknowledge any issues promptly and assign them to the appropriate team members.
+4. **Log and Learn:** Document the root cause of each failure and use the stored information to learn and improve.
+
+By following these steps, you'll ensure that your organization effectively manages data quality issues, maintains high standards, and continuously improves its data quality processes.
+
+{%inlineCalloutContainer%}
+ {%inlineCallout
+  color="violet-70"
+  bold="How to work with the Incident Manager"
+  icon="MdManageSearch"
+  href="/how-to-guides/data-quality-observability/incident-manager/workflow"%}
+  Set up the Incident Manager workflow.
+ {%/inlineCallout%}
+ {%inlineCallout
+  color="violet-70"
+  bold="Root Cause Analysis (Collate)"
+  icon="MdFactCheck"
+  href="/how-to-guides/data-quality-observability/incident-manager/root-cause-analysis"%}
+  Understand the nature of the failure and take corrective actions.
+ {%/inlineCallout%}
+{%/inlineCalloutContainer%}
diff --git a/...ility/data-quality/root-cause-analysis.md → ...y/incident-manager/root-cause-analysis.md b/...ility/data-quality/root-cause-analysis.md → ...y/incident-manager/root-cause-analysis.md
@@ -1,6 +1,6 @@
 ---
 title: Root Cause Analysis
-slug: /quality-and-observability/data-quality/root-cause-analysis
+slug: /how-to-guides/data-quality-observability/incident-manager/root-cause-analysis
 ---
 
 # Root Cause Analysis

diff --git a/openmetadata-docs/content/v1.4.x/how-to-guides/data-quality-observability/index.md b/openmetadata-docs/content/v1.4.x/how-to-guides/data-quality-observability/index.md
@@ -0,0 +1,38 @@
+---
+title: Data Quality and Observability
+slug: /how-to-guides/data-quality-observability
+---
+
+# Data Quality and Observability
+
+OpenMetadata offers a simple and easy-to-use solution for quality and observability. With no-code tests, observability metrics, incident management, and root cause analysis (Collate feature), you have a unified solution for discovery, governance, and observability.
+
+OpenMetadata ensures the health and performance of your data systems by providing comprehensive data observability features. These features offer insights into the state of test cases, helping to detect, diagnose, and resolve data issues quickly. By monitoring data flows and data quality in real-time, data teams can ensure that data remains reliable and trustworthy. OpenMetadata supports [observability alerts and notifications](/how-to-guides/admin-guide/alerts) to help you maintain the integrity and performance of your data systems.
+
+{%inlineCalloutContainer%}
+ {%inlineCallout
+    icon="MdGppGood"
+    bold="Data Quality"
+    href="/how-to-guides/data-quality-observability/quality"%}
+    Deep dive into how to set up quality tests and alerts, and how to triage and resolve incidents!
+ {%/inlineCallout%}
+ {%inlineCallout
+    icon="MdVisibility"
+    bold="Data Profiler"
+    href="/how-to-guides/data-quality-observability/profiler"%}
+    Deep dive into how to set up the profiler in OpenMetadata.
+ {%/inlineCallout%}
+ {%inlineCallout
+    icon="MdAddAlert"
+    bold="Observability Alerts"
+    href="/how-to-guides/data-quality-observability/observability/alerts"%}
+    Set up observability alerts in OpenMetadata.
+ {%/inlineCallout%}
+ {%inlineCallout
+  color="violet-70"
+  bold="Incident Manager"
+  icon="MdMenuBook"
+  href="/how-to-guides/data-quality-observability/incident-manager"%}
+  Set up incident management in OpenMetadata.
+ {%/inlineCallout%}
+{%/inlineCalloutContainer%}
diff --git a/...-and-observability/data-quality/alerts.md → ...ity-observability/observability/alerts.md b/...-and-observability/data-quality/alerts.md → ...ity-observability/observability/alerts.md
@@ -1,9 +1,9 @@
 ---
-title: Alerts
-slug: /quality-and-observability/data-quality/alerts
+title: Observability Alerts
+slug: /how-to-guides/data-quality-observability/observability/alerts
 ---
 
-# Alerts
+# Observability Alerts
 OpenMetadata provides a native way to get alerted in case of test case failure allowing you to proactively resolve data incidents
 
 ## Setting Up Alerts
@@ -47,7 +47,6 @@ Trigger section will allow you set the condition for which an alert should be tr
   caption="Alerts Menu"
  /%}
 
-
 ### Step 4 - Select a Destination
 In the destination section you will be able to select between `internal` and `external` destination:
 - `internal`: allow you to select the destination as an internal user, team or admin. The subscription set to this user, team or admin will be use to dispatch the alert

diff --git a/.../content/v1.4.x/how-to-guides/data-quality-observability/observability/index.md b/.../content/v1.4.x/how-to-guides/data-quality-observability/observability/index.md
@@ -0,0 +1,25 @@
+---
+title: Data Observability
+slug: /how-to-guides/data-quality-observability/observability
+---
+
+# Data Observability
+
+OpenMetadata has been providing observability alerts right from the start to notify users of important data lifecycle events: schema modifications, ownership shifts, and tagging updates. Users can define fine-grained alerts and notifications.
+
+Starting from the 1.3 release, Data Observability alerts have been completely revamped, simplifying the process of monitoring data. Users can quickly create alerts for:
+- **Changes in the Metadata:** such as schema changes,
+- **Data Quality Failures:** to filter by Test Suite,
+- **Pipeline Status Failures:** when ingesting runs from your ETL systems, and
+- **Ingestion Pipeline Monitoring:** for OpenMetadata’s ingestion workflows
+
+Depending on your use cases, notifications can be sent to owners, admins, teams, or users, providing a more personalized and informed experience. Teams can configure their dedicated Slack, MS Teams, or Google Chat channels to receive notifications related to their data assets, streamlining communication and collaboration. With the alerts and notifications in OpenMetadata, users can send Announcements over email, Slack, or Teams. Alerts are sent to a user when they are mentioned in a task or an activity feed.
+
+{% youtube videoId="qc-3sZ_eU5Y" start="0:00" end="2:04" width="560px" height="315px" /%}
+
+{%inlineCallout
+    icon="MdAddAlert"
+    bold="Observability Alerts"
+    href="/how-to-guides/data-quality-observability/observability/alerts"%}
+    Set up observability alerts in OpenMetadata.
+{%/inlineCallout%}
diff --git a/...bservability/profiler/auto-pii-tagging.md → ...bservability/profiler/auto-pii-tagging.md b/...bservability/profiler/auto-pii-tagging.md → ...bservability/profiler/auto-pii-tagging.md
@@ -1,6 +1,6 @@
 ---
 title: Auto PII Tagging
-slug: /quality-and-observability/profiler/auto-pii-tagging
+slug: /how-to-guides/data-quality-observability/profiler/auto-pii-tagging
 ---
 
 # Auto PII Tagging

diff --git a/...servability/profiler/external_workflow.md → ...servability/profiler/external_workflow.md b/...servability/profiler/external_workflow.md → ...servability/profiler/external_workflow.md
@@ -1,6 +1,6 @@
 ---
 title: External Profiler Workflow
-slug: /quality-and-observability/profiler/external-workflow
+slug: /how-to-guides/data-quality-observability/profiler/external-workflow
 ---
 
 # External Profiler Workflow
@@ -23,7 +23,7 @@ You might also want to check out how to configure external sample data. You can
 {% tile
 title="External Sample Data"
 description="Configure OpenMetadata to store sample data in an external storage such as S3"
-link="/connectors/ingestion/workflows/profiler/external-sample-data"
+link="/how-to-guides/data-quality-observability/profiler/external-sample-data"
 / %}
 {% /tilesContainer %}
 

diff --git a/...-docs/content/v1.4.x/how-to-guides/data-quality-observability/profiler/index.md b/...-docs/content/v1.4.x/how-to-guides/data-quality-observability/profiler/index.md
@@ -0,0 +1,52 @@
+---
+title: Data Profiler
+slug: /how-to-guides/data-quality-observability/profiler
+---
+
+# Overview of Data Profiler
+
+The profiler in OpenMetadata helps you understand the shape of your data and quickly validate assumptions. It also captures table usage statistics over a period of time as part of profiler ingestion. Data profiles enable you to check for null values in non-null columns, for duplicates in a unique column, etc. You can gain a better understanding of column data distributions through the descriptive statistics provided.
+
+Watch the video to understand OpenMetadata’s native Data Profiler and Data Quality tests.
+
+{%  youtube videoId="gLdTOF81YpI" start="0:00" end="1:08:10" width="560px" height="315px" /%}
+
+{%inlineCalloutContainer%}
+ {%inlineCallout
+  color="violet-70"
+  bold="Profiler Tab"
+  icon="MdSecurity"
+  href="/how-to-guides/data-quality-observability/profiler/tab"%}
+  Get a complete picture of the Table Profile and Column Profile details.
+ {%/inlineCallout%}
+ {%inlineCallout
+    icon="MdVisibility"
+    bold="Profiler Workflow"
+    href="/how-to-guides/data-quality-observability/profiler/workflow"%}
+    Configure and run the Profiler Workflow to extract Profiler data.
+ {%/inlineCallout%}
+ {%inlineCallout
+    icon="MdAnalytics"
+    bold="Metrics"
+    href="/how-to-guides/data-quality-observability/profiler/metrics"%}
+    Learn about the supported profiler metrics.
+ {%/inlineCallout%}
+ {%inlineCallout
+    icon="MdViewCompact"
+    bold="Sample Data"
+    href="/how-to-guides/data-quality-observability/profiler/external-sample-data"%}
+    Learn about the external storage for sample data.
+ {%/inlineCallout%}
+ {%inlineCallout
+    icon="MdOutlineSchema"
+    bold="External Workflow"
+    href="/how-to-guides/data-quality-observability/profiler/external-workflow"%}
+    Run a single profiler workflow for the entire source externally.
+ {%/inlineCallout%}
+ {%inlineCallout
+    icon="MdOutlinePersonPin"
+    bold="Auto PII Tagging"
+    href="/how-to-guides/data-quality-observability/profiler/auto-pii-tagging"%}
+    Auto tag data as PII Sensitive/NonSensitive at the column level.
+ {%/inlineCallout%}
+{%/inlineCalloutContainer%}
diff --git a/...ity-and-observability/profiler/metrics.md → ...quality-observability/profiler/metrics.md b/...ity-and-observability/profiler/metrics.md → ...quality-observability/profiler/metrics.md
@@ -1,9 +1,9 @@
 ---
 title: Metrics
-slug: /quality-and-observability/profiler/metrics
+slug: /how-to-guides/data-quality-observability/profiler/metrics
 ---
 
-# Metrics
+# Profiler Metrics
 
 Here you can find information about the supported metrics for the different types.
 
@@ -175,4 +175,4 @@ OpenMetadata will look at the previous day to fetch the operations that were per
 
 ## Reach out!
 
-Is there any metric you'd like to see? Open an [issue](https://github.com/open-metadata/OpenMetadata/issues/new/choose) or reach out on [Slack](https://slack.open-metadata.org).
+Is there any metric you'd like to see? Open an [issue](https://github.com/open-metadata/OpenMetadata/issues/new/choose) or reach out on [Slack](https://slack.open-metadata.org).