Minor updates on survey framework
Signed-off-by: Mandy Chessell <mandy.e.chessell@gmail.com>
mandy-chessell committed Mar 8, 2024
1 parent 138ae3b commit f2f2812
Showing 7 changed files with 26 additions and 30 deletions.
Original file line number Diff line number Diff line change
@@ -69,9 +69,9 @@ Figure 4 shows the structure of the survey report. The annotations are labelled
![Figure 4](apache-atlas-survey-action-service-analysis.svg)
> **Figure 4:** Analysis stages performed by the survey action service
-### Data Source Measurements Annotation
+### Resource Measurements Annotation

-The data source measurements annotation is created in the *Measure Resource* analysis step. It sets up the following properties in the *dataSourceProperties* map:
+The resource measurements annotation is created in the *Measure Resource* analysis step. It sets up the following properties in the *dataSourceProperties* map:

* entityInstanceCount - number of active entity instances
* entityInstanceCount:*typeName* - number of active entity instance of this type
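
As a rough illustration of the two properties listed above, the following sketch derives an overall count and a per-type count of active entity instances. The function name `build_measurement_properties`, the `(type_name, is_active)` input shape and the integer values are assumptions made for this example only; they are not taken from the survey action service's code.

```python
from collections import Counter


def build_measurement_properties(entities):
    """Build a property map from (type_name, is_active) pairs.

    The input shape is hypothetical; only the key names
    'entityInstanceCount' and 'entityInstanceCount:<typeName>'
    come from the documentation above.
    """
    # Only active instances are counted.
    active_types = [type_name for type_name, is_active in entities if is_active]

    # Overall count of active entity instances.
    properties = {"entityInstanceCount": len(active_types)}

    # One additional entry per type name.
    for type_name, count in sorted(Counter(active_types).items()):
        properties[f"entityInstanceCount:{type_name}"] = count
    return properties


if __name__ == "__main__":
    sample = [("DataSet", True), ("DataSet", True), ("Process", True), ("Process", False)]
    print(build_measurement_properties(sample))
```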
@@ -104,22 +104,22 @@ All the graph vertices are linked to a [*GraphSchemaType*](/types/5/0533-Graph-S
> **Figure 5:** Linkage of graph schema elements based on Apache Atlas type.

-### Data Profile Annotation
+### Resource Profile Annotation

-This survey action service attaches multiple data profile annotations to each graph schema attribute depending on their category (entity, relationship, classification or business metadata).
+This survey action service attaches multiple resource profile annotations to each graph schema attribute depending on their category (entity, relationship, classification or business metadata).

![Figure 6](apache-atlas-survey-action-service-profile.svg)
-> **Figure 6:** Details of the data profile annotations attached to each type of data field
+> **Figure 6:** Details of the resource profile annotations attached to each type of data field
-It sets up the following fields in each data profile annotation:
+It sets up the following fields in each resource profile annotation:

-* *analysisStep* - this is always set to *Profile Data*.
+* *analysisStep* - this is always set to *Profile Resource*.
* *annotationType* - this identifies the type of values that the annotation contains.
* *explanation* - this provides more information about the annotation type.
* *valueCount* - this is a map of typeName to count. For example, if this annotation was counting the classifications attached to the *DataSet* entity type, then the map would include an entry for each type of classification attached to this type of entity and a count of how many times it is used.
* *additionalProperties* - contains the count of instances for the particular type that the data field represents.
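
The fields above can be pictured with a small sketch that counts the classifications attached to instances of one entity type, following the *DataSet* example in the *valueCount* description. Everything except the field names and the *Profile Resource* step name is an assumption: the function name, the dictionary representation, the `instanceCount` key and the sample annotation type are hypothetical and chosen only for illustration.

```python
from collections import Counter


def build_profile_annotation(annotation_type, explanation, attached_type_names, instance_count):
    """Assemble a dictionary mirroring the documented annotation fields.

    attached_type_names is a flat list with one entry per attachment,
    for example one classification name per classified entity instance.
    """
    return {
        "analysisStep": "Profile Resource",
        "annotationType": annotation_type,
        "explanation": explanation,
        # Map of typeName to the number of times it is used.
        "valueCount": dict(Counter(attached_type_names)),
        # 'instanceCount' is a hypothetical key name for the instance count.
        "additionalProperties": {"instanceCount": instance_count},
    }


if __name__ == "__main__":
    annotation = build_profile_annotation(
        "Attached Classifications",  # hypothetical annotation type
        "Count of classifications attached to DataSet entities",
        ["Confidentiality", "Retention", "Confidentiality"],
        5,
    )
    print(annotation["valueCount"])
```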

-The table summarizes the values in each of the data profile annotations depending on the category of the data field it is attached to.
+The table summarizes the values in each of the resource profile annotations depending on the category of the data field it is attached to.

| Atlas Type Category | Annotation Type | Explanation | Value Count | Instance count in AdditionalProperties |
|---------------------|------------------------------------------------|-----------------------------------------------------------------------------------------------------------|--------------------------------------|-------------------------------------------|
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
<!-- SPDX-License-Identifier: CC-BY-4.0 -->
<!-- Copyright Contributors to the ODPi Egeria project. -->

-An *survey-action service* is a component that performs analysis of the contents of a [digital resource](/concepts/digital-resource) on request. The aim of the survey action service is to enable a detailed picture of the properties of a resource to be built up.
+An *survey action service* is a component that performs analysis of the contents of a [digital resource](/concepts/digital-resource) on request. The aim of the survey action service is to enable a detailed picture of the properties of a resource to be built up.

Each time a survey action service runs, it creates a new [survey report](/concepts/survey-report) linked off of the digital resource's [Asset](/concepts/asset) metadata element that records the results of the analysis.

@@ -18,9 +18,9 @@ An survey action service is designed to run at regular intervals to gather a det
??? info "Runtime for an survey action service"
Survey action services are packaged into [Survey Action Engines](/concepts/survey-action-engine) that run in the [Survey Action OMES](/services/omes/survey-action/overview) hosted in an [Engine Host](/concepts/engine-host).

-The metadata repository interface for metadata discovery tools is implemented by the [Stewardship Actions OMAS](/services/omas/stewardship-action/overview) that runs in a [Metadata Access Server](/concepts/metadata-access-server).
+The metadata repository interface for metadata discovery tools is implemented by the [Asset Owner OMAS](/services/omas/asset-owner/overview) that runs in a [Metadata Access Server](/concepts/metadata-access-server).

-An survey action service may be triggered via an [Engine Action](/concepts/engine-action) or as part of a [governance action process](/concepts/governance-action-process).
+A survey action service may be triggered via an [Engine Action](/concepts/engine-action), a [governance action type](/concepts/overnance-action-type) or as part of a [governance action process](/concepts/governance-action-process).

![Survey Action Service](/connectors/survey-action/survey-action-service.svg)

11 changes: 3 additions & 8 deletions site/docs/features/discovery-and-stewardship/overview.md
@@ -37,20 +37,15 @@ The annotations for the data fields are linked off of the schema attributes crea

![Survey report structure](/frameworks/saf/survey-report-structure.svg)

-## Discovery actions
+## Survey actions

-Open discovery can be used for the following types of analysis.
+Survey actions can be used for the following types of analysis.

### Schema extraction

For digital resources that include structured data, *schema extraction* documents the data fields present in the digital resource and if the schema is attached to the asset, it will attempt to match the data fields it finds to its schema attributes.

-Schema extraction uses the [schema analysis annotation](/types/6/0615-Schema-Extraction). It is linked directly off of the survey report.
-
-The schema of the data in the digital resource is defined in a *SchemaType* linked from the digital resource's asset using the *AssetSchemaType* relationship. This may be established before the open discovery service runs, or may be derived by an [engine action](/concepts/engine-action) once the open discovery service has run.
-
-* The *SchemaTypeDefinition* links the schema analysis annotation to the top level schema type.
-* The *SchemaAttributeDefinition* links a data field to is corresponding schema attribute.
+Schema extraction uses the [schema analysis annotation](/types/6/0615-Schema-Extraction). It is linked directly off of the survey report. The schema of the data in the digital resource is defined in a [*SchemaType*](/types/5/0501-Schema-Elements) linked from the digital resource's asset using the [*AssetSchemaType*](/types/5/0503-Asset-Schema) relationship. This may be established before the survey action service runs, or may be derived by the survey action service itself.

### Resource profiling

8 changes: 4 additions & 4 deletions site/docs/frameworks/saf/annotation-intro.md
@@ -9,10 +9,10 @@ The annotation types defined in the [Survey Action Framework (SAF)](/frameworks/

* [Classification Annotation](/features/discovery-and-stewardship/overview/#classification-discovery) - Captures a recommendation of which classifications to attach to this asset. It can be made at the asset or data field level.
* [Data Class Annotation](/features/discovery-and-stewardship/overview/#data-class-discovery) - Captures a recommendation of which data class this data field closely represents.
-* [Data Profile Annotation](/features/discovery-and-stewardship/overview/#data-profiling) - Capture the characteristics of the data values stored in a specific data field in a data source.
-* [Data Profile Log Annotation](/features/discovery-and-stewardship/overview/#data-profiling) - Capture the named of the log files where profile characteristics of the data values stored in a specific data field. This is used when the profile results are too large to store in open metadata.
-* [Data Source Measurement Annotation](/features/discovery-and-stewardship/overview/#capturing-measurements) - collect arbitrary properties about a digital resource.
-* [Data Source Physical Status Annotation](/features/discovery-and-stewardship/overview/#capturing-measurements) - documents the physical characteristics of a data source asset.
+* [Resource Profile Annotation](/features/discovery-and-stewardship/overview/#resource-profiling) - Capture the characteristics of the data values stored in a specific data field in a data source.
+* [Resource Profile Log Annotation](/features/discovery-and-stewardship/overview/#resource-profiling) - Capture the named of the log files where profile characteristics of the data values stored in a specific data field. This is used when the profile results are too large to store in open metadata.
+* [Resource Measurement Annotation](/features/discovery-and-stewardship/overview/#capturing-measurements) - collect arbitrary properties about a digital resource.
+* [Resource Physical Status Annotation](/features/discovery-and-stewardship/overview/#capturing-measurements) - documents the physical characteristics of a data source asset.
* [Fingerprint Annotation](/features/discovery-and-stewardship/overview/#data-profiling) - Capture the characteristics of the data values stored in a specific data field or the whole digital resource and express it as a single value.
* [Request for Action Annotation](/features/discovery-and-stewardship/overview/#requesting-stewardship-action) - used to trigger governance and stewardship actions.
* [Relationship Advice Annotation](/features/discovery-and-stewardship/overview/#relationship-discovery) - document a recommended relationship that should be established with the asset.
Original file line number Diff line number Diff line change
@@ -9,11 +9,11 @@ The annotation types defined in the [Open Discovery Framework (ODF)](/frameworks

* [Classification Annotation](/features/discovery-and-stewardship/overview/#classification-discovery) - Captures a recommendation of which classifications to attach to this asset. It can be made at the asset or data field level.
* [Data Class Annotation](/features/discovery-and-stewardship/overview/#data-class-discovery) - Captures a recommendation of which data class this data field closely represents.
-* [Data Profile Annotation](/features/discovery-and-stewardship/overview/#data-profiling) - Capture the characteristics of the data values stored in a specific data field in a data source.
-* [Data Profile Log Annotation](/features/discovery-and-stewardship/overview/#data-profiling) - Capture the named of the log files where profile characteristics of the data values stored in a specific data field. This is used when the profile results are too large to store in open metadata.
-* [Data Source Measurement Annotation](/features/discovery-and-stewardship/overview/#capturing-measurements) - collect arbitrary properties about a digital resource.
-* [Data Source Physical Status Annotation](/features/discovery-and-stewardship/overview/#capturing-measurements) - documents the physical characteristics of a data source asset.
-* [Fingerprint Annotation](/features/discovery-and-stewardship/overview/#data-profiling) - Capture the characteristics of the data values stored in a specific data field or the whole digital resource and express it as a single value.
+* [Resource Profile Annotation](/features/discovery-and-stewardship/overview/#resource-profiling) - Capture the characteristics of an aspect of the resource.
+* [Resource Profile Log Annotation](/features/discovery-and-stewardship/overview/#resource-profiling) - Capture the name of the log files where profile characteristics of the resource are stored. This is used when the profile results are too large to store in open metadata.
+* [Resource Measurement Annotation](/features/discovery-and-stewardship/overview/#capturing-measurements) - collect arbitrary properties about a digital resource.
+* [Resource Physical Status Annotation](/features/discovery-and-stewardship/overview/#capturing-measurements) - documents the physical characteristics of a resource.
+* [Fingerprint Annotation](/features/discovery-and-stewardship/overview/#resource-profiling) - Capture the characteristics of the an aspect of the digital resource and express it as a single value.
* [Request for Action Annotation](/features/discovery-and-stewardship/overview/#requesting-stewardship-action) - used to trigger governance and stewardship actions.
* [Relationship Advice Annotation](/features/discovery-and-stewardship/overview/#relationship-discovery) - document a recommended relationship that should be established with the asset.
* [Quality Annotation](/features/discovery-and-stewardship/overview/#calculating-quality-scores) - document calculated quality scores on different dimensions.
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
<!-- SPDX-License-Identifier: CC-BY-4.0 -->
<!-- Copyright Contributors to the Egeria project. -->

-This [open discovery service](/concepts/open-discovery-engine) is described in an [open discovery engine](/concepts/open-discovery-engine). The engine is configured to run in an [Engine Host](/concepts/engine-host) running the [Asset Analysis OMES](/services/omes/asset-analysis/overview) service. The Asset Analysis OMES is configured with the network address of a [Metadata Access Server](/concepts/metadata-access-server) running the [Discovery Engine OMAS](/services/omas/discovery-engine/overview) service.
+This [survey action service](/concepts/survey-action-service) is described in an [survey action engine](/concepts/survey-action-engine). The engine is configured to run in an [Engine Host](/concepts/engine-host) running the [Survey Action OMES](/services/omes/survey-action/overview) service. The Survey Action OMES is configured with the network address of a [Metadata Access Server](/concepts/metadata-access-server) running the [Asset Owner OMAS](/services/omas/asset-owner/overview) service.
Original file line number Diff line number Diff line change
@@ -5,8 +5,9 @@
Once installed in the engine host, the survey action service can be called either by:

* via an [engine action](/concepts/engine-action), or
+* via a [governance action type](/concepts/governance-action-type), or
* via a [governance action process](/concepts/governance-action-process).

-Each time the survey action service starts, the Survey Action OMES creates a new [Survey Report](/concepts/survey-report) via a call to the Stewardship Action OMAS. As the survey action service runs, it is retrieving metadata, and storing annotations, via its [survey context](/concepts/survey-context). The Survey Action OMES routes these requests to the Stewardship Action OMAS which has access to the open metadata repositories.
+Each time the survey action service starts, the Survey Action OMES creates a new [Survey Report](/concepts/survey-report) via a call to the Asset Owner OMAS. As the survey action service runs, it is retrieving metadata, and storing annotations, via its [survey context](/concepts/survey-context). The Survey Action OMES routes these requests to the Asset Owner OMAS which has access to the open metadata repositories.

