-
After sleeping on it and taking into account your feedback in similar discussions, I believe I need to revise my point. In my earlier comments, the goal was to give users the flexibility to establish their own definitions of Entities and their properties. However, this goes directly against defining a standard generic enough for most use cases, which I believe is the point of OpenMetadata and a missing pillar in the ecosystem. Therefore, I'd like to rephrase the idea: rather than asking for a specific endpoint or feature that allows individual customisation, I'd ask for proper documentation on the implementation of OpenMetadata, so that we can test changes and contribute more actively to the Metadata standard being sought.
Many thanks,
-
I believe the first version of a Model entity could be as follows:

    {
      "$id": "https://open-metadata.org/schema/entity/data/model.json",
      "$schema": "http://json-schema.org/draft-07/schema#",
      "title": "Model",
      "description": "This schema defines the Model entity. Models are algorithms trained on data to find patterns or make predictions.",
      "type": "object",
      "properties": {
        "id": {
          "description": "Unique identifier of a model instance.",
          "$ref": "../../type/basic.json#/definitions/uuid"
        },
        "name": {
          "description": "Name that identifies this model.",
          "type": "string",
          "minLength": 1,
          "maxLength": 64
        },
        "fullyQualifiedName": {
          "description": "A unique name that identifies a model.",
          "type": "string",
          "minLength": 1,
          "maxLength": 64
        },
        "displayName": {
          "description": "Display Name that identifies this Model.",
          "type": "string"
        },
        "description": {
          "description": "Description of the model, what it is, and how to use it.",
          "type": "string"
        },
        "algorithm": {
          "description": "Algorithm used to train the model",
          "type": "string"
        },
        "dashboard": {
          "description": "Performance Dashboard URL to track metric evolution",
          "$ref": "../../type/entityReference.json"
        },
        "href": {
          "description": "Link to the resource corresponding to this entity.",
          "$ref": "../../type/basic.json#/definitions/href"
        },
        "owner": {
          "description": "Owner of this model.",
          "$ref": "../../type/entityReference.json"
        },
        "followers": {
          "description": "Followers of this model.",
          "$ref": "../../type/entityReference.json#/definitions/entityReferenceList"
        },
        "tags": {
          "description": "Tags for this model.",
          "type": "array",
          "items": {
            "$ref": "../../type/tagLabel.json"
          },
          "default": null
        }
      },
      "required": ["id", "name"]
    }

As a second step, I believe it would be interesting to gather the requirements to define a [...] In each Feature, we could define its origin (e.g., [...]

@harshach, what would be the steps to move this forward? More than happy to discuss any changes and looking forward to your input.
Thanks
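To make the constraints in the proposed schema concrete, here is a minimal Python sketch that checks a candidate Model instance against the `required` list and the `minLength`/`maxLength` limits quoted above. Everything in it is illustrative: `MODEL_SCHEMA` is only an excerpt of the proposal, `check_model` is a hypothetical hand-rolled helper (not a JSON Schema validator and not part of OpenMetadata), and the sample `id` and names are made up.

```python
# Excerpt of the proposed Model schema: only the constraints checked below.
MODEL_SCHEMA = {
    "properties": {
        "name": {"type": "string", "minLength": 1, "maxLength": 64},
        "fullyQualifiedName": {"type": "string", "minLength": 1, "maxLength": 64},
        "algorithm": {"type": "string"},
    },
    "required": ["id", "name"],
}


def check_model(instance, schema=MODEL_SCHEMA):
    """Return a list of problems found; an empty list means none were found.

    Partial by design: covers `required`, string `type`, `minLength`
    and `maxLength` only.
    """
    problems = []
    for field in schema["required"]:
        if field not in instance:
            problems.append(f"missing required field: {field}")
    for field, rules in schema["properties"].items():
        if field not in instance or rules.get("type") != "string":
            continue
        value = instance[field]
        if not isinstance(value, str):
            problems.append(f"{field} must be a string")
            continue
        if "minLength" in rules and len(value) < rules["minLength"]:
            problems.append(f"{field} is shorter than {rules['minLength']} characters")
        if "maxLength" in rules and len(value) > rules["maxLength"]:
            problems.append(f"{field} is longer than {rules['maxLength']} characters")
    return problems


# Made-up instances for illustration only.
valid = {
    "id": "c0ffee00-0000-4000-8000-000000000000",  # placeholder UUID
    "name": "customer_churn",
    "algorithm": "XGBoost",
}
invalid = {"name": "x" * 65}  # no id, and the name exceeds maxLength

print(check_model(valid))
print(check_model(invalid))
```

For real validation, a full JSON Schema draft-07 validator that also resolves the `$ref` entries would be the better choice; this sketch only shows what the listed constraints mean for an instance.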
-
@pmbrull This looks great. Can you open a PR against main? We can provide comments if necessary.
-
Hi team 🤗
First of all, let me drop a brief line to thank you and congratulate you on your work. This looks like an amazing project, and I can see it becoming the de facto metadata management tool. Having this level of transparency on how to ingest the data, together with the automation possibilities of the REST API, ticks a lot of boxes.
One idea/requirement I'd love to share would be having more flexibility in dynamically creating and updating Entities. Although a lot of them are already available (and in my opinion, treating a Pipeline as a proper Data Asset is something similar tools are missing...), I feel that it is hard to get definitions right the first time.
I would assume that not everyone needs to model the same assets, and not all projects will have the same requirements inside each asset. One example here would be being able to extend the `Pipeline` Entity definition to include a `source` and a `sink`, or to create completely new Entities such as `Model`, to track Machine Learning related aspects.
I believe that the key point here is that necessities evolve and requirements are different for different teams. Therefore, a great addition would be having a clear approach on how to perform these kinds of customisations.
Based on this Slack discussion, it seems that we can go directly to the source code and update the JsonSchema. This is a great starting point, and my only question here would be how this approach would handle the addition of new Entities, such as the `Model`.
What might feel a bit cleaner, and allow for better interaction with an already ongoing Metadata project, would be the ability to create and update entities directly through the REST API. But again, if we could customise the Entities even just in the source code, that would already be great!
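As a rough illustration of the `source`/`sink` idea, the Python sketch below adds two hypothetical properties to a stand-in Pipeline schema. The real Pipeline schema is not shown in this thread, so `PIPELINE_SCHEMA` is a placeholder, `with_source_and_sink` is an invented helper, and the choice of `../../type/entityReference.json` as the property type is an assumption borrowed from how the Model schema posted in this thread references `owner`.

```python
import copy
import json

# Stand-in for the Pipeline schema: the real file is not shown in this thread,
# so only a placeholder `name` property is included.
PIPELINE_SCHEMA = {
    "title": "Pipeline",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
    },
}


def with_source_and_sink(schema):
    """Return a copy of `schema` with hypothetical `source` and `sink` properties."""
    extended = copy.deepcopy(schema)  # leave the input schema untouched
    for field in ("source", "sink"):
        extended["properties"][field] = {
            "description": f"Entity acting as the {field} of this pipeline.",
            # Assumption: reuse the entity reference type for both properties.
            "$ref": "../../type/entityReference.json",
        }
    return extended


extended = with_source_and_sink(PIPELINE_SCHEMA)
print(json.dumps(extended, indent=2))
```

Whether such a change belongs in the source JsonSchema or behind a REST endpoint is exactly the open question; the sketch only shows the shape of the resulting definition.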
More than happy to discuss 🌻
Thanks again,
Pere