Reference implementation of an AI model catalog in Backstage.
This repository contains two examples, based on real model servers that have been deployed:
- A vLLM-based single-model service running IBM's granite-8b-code-instruct model, with 3scale acting as an API gateway
- An Ollama-based multi-model service running a variety of LLMs
Each example includes a Backstage `catalog-info.yaml` file that represents the model server and its model(s) as Backstage catalog entities for import into RHDH/Backstage. Each catalog also comes with corresponding TechDocs that document the model server and model(s).
For more information on the structure of the model catalog, see the sections below.
In this catalog:
- Each model server is represented as a `Component` with type `model-server`, containing information such as:
  - Name, description, URL, authentication status, and how to get access
- Each model deployed on a model server is represented as a `Resource` with type `ai-model`, containing information such as:
  - Name, description, model usage, intended tasks, tags, license, and author
- An `API` object representing the model server API type (e.g. OpenAI, OpenVINO, etc.)
- Each `model-server` Component `dependsOn`:
  - The 1 to N `ai-model` resources deployed on it
  - The `API` object associated with the model server
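
The relationships described in this list could be sketched in a single `catalog-info.yaml` roughly as follows. This is a minimal illustration under stated assumptions, not the repository's actual files: every entity name, the owner, and the `lifecycle` value are hypothetical placeholders, and `spec.type: openapi` on the API entity is an assumption.

```yaml
# Minimal sketch; all names, owners, and values below are hypothetical placeholders.
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: example-model-server
  description: Example model server hosting one model
spec:
  type: model-server
  lifecycle: production              # assumed value
  owner: user:example-owner          # assumed owner
  dependsOn:
    - resource:example-model         # one entry per ai-model deployed on the server
    - api:example-model-server-api   # the API object for this server
---
apiVersion: backstage.io/v1alpha1
kind: Resource
metadata:
  name: example-model
  description: Example model deployed on the server
spec:
  type: ai-model
  owner: user:example-owner
---
apiVersion: backstage.io/v1alpha1
kind: API
metadata:
  name: example-model-server-api
spec:
  type: openapi                      # assumed API type
  lifecycle: production
  owner: user:example-owner
  definition: |
    openapi: "3.0.0"
    info:
      title: Example model server API
      version: "0.0.1"
    paths: {}
```

A multi-model server would list one `resource:` reference per deployed model under `dependsOn`.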
The following metadata is stored for each model in the catalog:
Name | Type | Description | Catalog Implementation |
---|---|---|---|
Name | String | The name of the model. | Resource metadata.name |
Approved Tasks | String[] | The intended use cases and tasks for the model. | Resource metadata.tags[]. Task-specific tags can be prefixed to highlight them, e.g. task-text-generation or task#text-generation. |
Description | String | A brief description of the model. | Resource metadata.description |
Type | String | The type of model being stored in the catalog. | Resource techdoc or tags |
License | URL | The license that the model uses. | Resource metadata.links[] or techdoc |
Tags | String[] | Descriptive labels for the model to aid in filtering. | Resource metadata.tags[] |
Author | String | The author of the model. | Resource metadata.tags[] |
Maintainer | String | The maintainer of the model deployed on the model server. | Resource spec.owner |
Instructions | Techdoc | Instructions on how to access / use the model. | Component techdocs |
Download Link | URL | A link to download the model's files (e.g. GGUF artifacts). | Resource metadata.links[] |
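
As a rough illustration of how the model metadata in this table might map onto a `Resource` entity, consider the sketch below. The model name, tags, owner, and URLs are hypothetical placeholders chosen for the example, not values taken from this repository.

```yaml
# Minimal sketch; names, tags, owner, and URLs are hypothetical placeholders.
apiVersion: backstage.io/v1alpha1
kind: Resource
metadata:
  name: example-model                        # Name
  description: A brief description of the example model.  # Description
  tags:
    - task-text-generation                   # Approved Tasks (task-prefixed tag)
    - example-author                         # Author
    - code                                   # Tags (free-form label for filtering)
  links:
    - url: https://example.com/license       # License
      title: License
    - url: https://example.com/download      # Download Link
      title: Download
spec:
  type: ai-model
  owner: user:example-maintainer             # Maintainer
```

Fields that the table maps to TechDocs, such as usage instructions, live in the documentation rather than in the entity itself.
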
The following metadata is stored for each model server in the catalog:
Name | Type | Description | Catalog Implementation |
---|---|---|---|
Model Server Type | String | The type of model server API that the model uses. | API |
Authentication Required? | Boolean | Authentication status for the server. | Component techdocs |
API Link | URL | A link corresponding to the API endpoint for the model service that has the model deployed. | Component metadata.links[] |
Access Link | URL | A link to access the model if hosted online. | Component metadata.links[] |
API Schema | String | The API schema for the model server. | API spec.definition |
Access Instructions | Techdoc | How to get access to and how to use the model server. | Component techdoc |
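
The model server metadata in this table could be expressed along these lines. This is a sketch under assumptions: the entity names, owner, and all URLs are hypothetical, the `backstage.io/techdocs-ref: dir:.` annotation assumes the documentation sits next to the catalog file, and `$text` is used here to load the schema from a placeholder URL instead of inlining it.

```yaml
# Minimal sketch; names, owner, and URLs are hypothetical placeholders.
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: example-model-server
  annotations:
    backstage.io/techdocs-ref: dir:.       # TechDocs: authentication status, access instructions
  links:
    - url: https://example.com/v1          # API Link
      title: API
    - url: https://example.com/access      # Access Link
      title: Access
spec:
  type: model-server
  lifecycle: production                    # assumed value
  owner: user:example-owner
  dependsOn:
    - api:example-model-server-api
---
apiVersion: backstage.io/v1alpha1
kind: API
metadata:
  name: example-model-server-api           # Model Server Type is represented by this API entity
spec:
  type: openapi                            # assumed API type
  lifecycle: production
  owner: user:example-owner
  definition:
    $text: https://example.com/openapi.json  # API Schema, loaded from a placeholder URL
```
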
Tutorials on creating model servers and models are available in the `/tutorials` directory.